Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss
Published as an arXiv preprint, 2025
Abstract
This paper proposes Free-T2M, a frequency-enhanced text-to-motion diffusion model that significantly improves the quality and stability of generated motions. By introducing a consistency loss and enhancing frequency information, the model generates more natural, text-aligned human motions.
Key Contributions
- Frequency Enhancement: A novel approach to enhancing frequency information in motion generation
- Consistency Loss: A consistency loss designed to improve training stability
- Superior Performance: Improved motion quality and text-motion alignment
- Robust Generation: More stable motion generation across diverse text descriptions
Technical Approach
Our approach combines diffusion models with frequency-domain processing and consistency regularization to achieve state-of-the-art text-to-motion generation performance.
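As a rough illustration of what a frequency-domain consistency term can look like, the sketch below low-pass filters a single motion channel (for example, one joint coordinate over time) and compares the low-frequency parts of a predicted and a reference trajectory. This is not the paper's implementation: the function names (`low_pass`, `low_freq_consistency_loss`), the cutoff value, and the toy signals are all assumptions made for illustration, and a naive DFT stands in for a library FFT to keep the example dependency-free.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def low_pass(signal, cutoff):
    """Zero every frequency bin above `cutoff` (hypothetical hyperparameter).

    Bins k and n - k are treated as the same frequency so the filtered
    signal stays real-valued.
    """
    n = len(signal)
    spectrum = dft(signal)
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    return idft(kept)


def low_freq_consistency_loss(pred, target, cutoff):
    """Mean squared error between the low-frequency parts of two sequences."""
    lp_pred = low_pass(pred, cutoff)
    lp_target = low_pass(target, cutoff)
    return sum((a - b) ** 2 for a, b in zip(lp_pred, lp_target)) / len(pred)


# Toy data (assumed): one slow cycle over 32 frames as the reference motion.
n_frames = 32
target = [math.sin(2 * math.pi * t / n_frames) for t in range(n_frames)]

# Same coarse motion plus high-frequency jitter (8 cycles per sequence).
noisy = [x + 0.3 * math.sin(2 * math.pi * 8 * t / n_frames) for t, x in enumerate(target)]

# Different coarse motion: the slow component has half the amplitude.
scaled = [0.5 * x for x in target]

loss_noisy = low_freq_consistency_loss(noisy, target, cutoff=2)
loss_scaled = low_freq_consistency_loss(scaled, target, cutoff=2)
```

With these toy signals, `loss_noisy` is close to zero because the jitter lies entirely above the cutoff, while `loss_scaled` is clearly positive because the coarse trajectory itself differs. A term like this penalizes errors in the overall motion while ignoring fine detail. In a real training loop one would typically use a batched FFT from a numerical library rather than the naive transform shown here.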
Recommended citation: Jia, H. et al. (2025). "Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss." arXiv preprint arXiv:2501.18232. https://arxiv.org/abs/2501.18232