
diffusion-based model results

Open yl4579 opened this issue 1 year ago • 1 comment

Great work! Have you tested the performance of this codec on diffusion-based models such as SimpleTTS or DiTTo-TTS?

yl4579 Sep 06 '24 05:09

Thank you very much for your question! I have not tested this codec with diffusion-based models such as SimpleTTS or DiTTo-TTS. However, I believe that investigating which representations (mel, codec latent, or semantic) are best suited to audio diffusion generation could yield valuable insights. Thank you again for the thoughtful inquiry.

zhenye234 Sep 07 '24 12:09