Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss

Published as an arXiv preprint, 2025

Abstract

This paper proposes Free-T2M, a frequency-enhanced text-to-motion diffusion model that improves the quality and stability of generated motions. By enhancing frequency information and introducing a consistency loss, the model generates human motions that are more natural and better aligned with the input text.

Key Contributions

  • Frequency Enhancement: A novel approach to enhancing frequency information in motion generation
  • Consistency Loss: A carefully designed consistency loss that improves training stability
  • Superior Performance: Improved motion quality and text-motion alignment
  • Robust Generation: More stable motion generation across diverse text descriptions

Technical Approach

The approach combines a diffusion model with frequency-domain processing and consistency regularization to achieve state-of-the-art text-to-motion generation performance.
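
This page does not spell out the paper's formulation, so the following is only an illustrative sketch of what a frequency-domain consistency term could look like, not the authors' implementation. It assumes a motion clip is stored as a (frames, features) array, uses a real FFT along the time axis as the frequency transform, and compares only the lowest frequency bins of a predicted and a reference motion. The function names `low_frequency_part` and `frequency_consistency_loss` and the `keep` parameter are hypothetical.

```python
import numpy as np


def low_frequency_part(motion, keep):
    """Return the motion rebuilt from its `keep` lowest temporal frequency bins.

    `motion` is assumed to be a (frames, features) array (an assumption of
    this sketch, not a format taken from the paper).
    """
    spectrum = np.fft.rfft(motion, axis=0)
    spectrum[keep:] = 0.0  # discard every bin above the cutoff
    return np.fft.irfft(spectrum, n=motion.shape[0], axis=0)


def frequency_consistency_loss(pred, target, keep=4):
    """Mean squared error between the low-frequency parts of two motions."""
    diff = low_frequency_part(pred, keep) - low_frequency_part(target, keep)
    return float(np.mean(diff ** 2))


# Toy data: one slow sine cycle over 64 frames, copied to 3 features.
frames = 64
t = np.arange(frames)[:, None]
target = np.sin(2 * np.pi * t / frames) * np.ones((1, 3))

# High-frequency jitter that alternates between +0.1 and -0.1 every frame.
jitter = 0.1 * np.cos(np.pi * t) * np.ones((1, 3))
pred_jitter = target + jitter

# A constant offset, which lives entirely in the lowest (DC) frequency bin.
pred_offset = target + 0.5

plain_mse = float(np.mean((pred_jitter - target) ** 2))
loss_jitter = frequency_consistency_loss(pred_jitter, target)
loss_offset = frequency_consistency_loss(pred_offset, target)
```

In this toy setup the plain mean squared error penalizes the frame-to-frame jitter, while the low-frequency term ignores it and responds only to the constant offset. That illustrates the design choice of supervising coarse motion structure separately from fine detail; how Free-T2M actually weights or schedules such terms is described in the paper itself.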

Download the paper: https://arxiv.org/abs/2501.18232

Recommended citation: Jia, H. et al. (2025). "Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss." arXiv preprint arXiv:2501.18232. https://arxiv.org/abs/2501.18232