ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model

Published in arXiv preprint, 2025

Abstract

This paper proposes ANT (Adaptive Neural Temporal-Aware Text-to-Motion Model), which significantly improves semantic alignment and generation efficiency in text-to-motion tasks. By introducing a semantic temporal-aware module and a Dynamic Classifier-Free Guidance (DCFG) strategy, our model generates motions that are temporally and semantically coherent with the input text.

Key Contributions

  • Temporal Awareness: Novel semantic temporal-aware module design
  • Dynamic Guidance: DCFG strategy for improved generation control (a sketch follows this list)
  • Enhanced Alignment: Better semantic alignment between text and motion
  • Improved Efficiency: Faster generation while maintaining quality
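
This page does not spell out how DCFG works, so the following is a minimal sketch of one plausible reading: standard classifier-free guidance for a diffusion-based motion denoiser, except that the guidance scale is a function of the diffusion timestep rather than a constant. The denoiser interface and every name here (`denoiser`, `dynamic_scale`, `w_min`, `w_max`) are illustrative assumptions, not the paper's implementation.

```python
import torch

def dynamic_scale(t, num_steps, w_min=1.0, w_max=7.5):
    """Hypothetical schedule: guidance strength grows as denoising
    proceeds (large t -> small t), so early steps stay exploratory
    and late steps align tightly with the text. Assumed, not the
    paper's actual schedule."""
    progress = 1.0 - t.float() / num_steps  # 0 at the noisiest step, ~1 at the end
    return w_min + (w_max - w_min) * progress

@torch.no_grad()
def dcfg_step(denoiser, x_t, t, text_emb, null_emb, num_steps):
    """One guided prediction with a timestep-dependent scale.

    Fixed-scale classifier-free guidance computes
        eps = eps_uncond + w * (eps_cond - eps_uncond)
    with a constant w; here w varies with t.
    """
    eps_cond = denoiser(x_t, t, text_emb)     # text-conditioned prediction
    eps_uncond = denoiser(x_t, t, null_emb)   # unconditional prediction
    w = dynamic_scale(t, num_steps)           # shape: (batch,)
    w = w.view(-1, *([1] * (x_t.dim() - 1)))  # broadcast over motion dims
    return eps_uncond + w * (eps_cond - eps_uncond)
```

A fixed guidance scale trades text fidelity against diversity uniformly across all steps; making the scale time-dependent is one simple way to rebalance that trade-off per step, which is the kind of control the DCFG name suggests.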

Technical Highlights

  • Temporal-aware neural architecture (sketched after this list)
  • Dynamic classifier-free guidance mechanism
  • Improved semantic understanding for motion generation
  • Efficient training and inference pipeline
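
As with DCFG, the semantic temporal-aware module is only named on this page. Below is a hedged sketch of one way such a module could look: a transformer-style block in which per-frame motion features cross-attend to per-token text features, with a learned temporal position embedding making frame order explicit. The class, dimensions, and parameter names are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TemporalAwareBlock(nn.Module):
    """Illustrative block: motion frames cross-attend to text tokens,
    with a learned temporal position embedding added to each frame so
    attention can depend on where a frame sits in the sequence."""

    def __init__(self, dim=256, num_heads=4, max_frames=196):
        super().__init__()
        self.time_pos = nn.Embedding(max_frames, dim)  # one vector per frame index
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, motion, text):
        # motion: (batch, frames, dim); text: (batch, tokens, dim)
        pos = torch.arange(motion.size(1), device=motion.device)
        h = motion + self.time_pos(pos)                # inject frame order
        attn, _ = self.cross_attn(self.norm1(h), text, text)
        h = h + attn                                   # text-conditioned frames
        return h + self.ffn(self.norm2(h))             # per-frame refinement

# Usage: 2 clips of 60 frames attending to 16 text tokens.
block = TemporalAwareBlock()
out = block(torch.randn(2, 60, 256), torch.randn(2, 16, 256))  # (2, 60, 256)
```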

Download paper here

Recommended citation: Jia, H. et al. (2025). "ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model." arXiv preprint arXiv:2506.02452. https://arxiv.org/abs/2506.02452