Is text-based talking-head animation supported or planned?

Open • 9mean2 opened this issue 1 month ago • 0 comments

Hi, thanks for your great work!

I would like to ask whether text-based talking-head animation is supported or planned.

Specifically:

Is it possible to generate talking-head animation from text only, without any audio input? (e.g., TTS → talking head, or direct text → motion)

If this is not currently supported:

Are there any plans to integrate text-based animation in the future?

Or would you recommend generating speech audio first (via TTS) and then feeding it to MuseTalk?

Additionally, I would like to confirm two points. Does MuseTalk generate talking-head motion from audio signals? And does the model support any emotion or expression control from text, or is motion driven purely by acoustic features?

Thank you. Looking forward to your feedback!

Nov 04 '25 06:11 9mean2