Amphion
[Help]: In the S2A model, have you tried applying classifier-free guidance to the semantic condition?
Hi, thank you so much for the great open-source work.
I noticed that in the second-stage S2A model, classifier-free guidance (CFG) is applied only to the acoustic prompt. Have you tried also applying CFG to the conditioning semantic tokens (which might improve WER), or to both the prompt and the semantic tokens?
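To make the question concrete, here is a minimal sketch of the two-condition guidance I have in mind. This is not Amphion's code: `predict`, `toy_predict`, `w_semantic`, and `w_prompt` are hypothetical names, and the sketch assumes the convention in which a weight of 1 means "no extra guidance" (your implementation may use a different scaling).

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical model interface (an assumption, not the Amphion API):
# predict(semantic_tokens_or_None, acoustic_prompt_or_None) -> prediction vector.
# Passing None stands for dropping that condition, as in CFG training.
Predictor = Callable[[Optional[Sequence[int]], Optional[Sequence[float]]], List[float]]


def dual_cfg(
    predict: Predictor,
    semantic: Sequence[int],
    prompt: Sequence[float],
    w_semantic: float,
    w_prompt: float,
) -> List[float]:
    """Nested CFG over two conditions.

    out = e(-, -) + w_semantic * (e(s, -) - e(-, -))
                  + w_prompt   * (e(s, p) - e(s, -))

    With w_semantic == 1 this reduces to guidance on the prompt only.
    """
    e_none = predict(None, None)        # both conditions dropped
    e_sem = predict(semantic, None)     # semantic tokens only
    e_full = predict(semantic, prompt)  # semantic tokens + acoustic prompt
    return [
        u + w_semantic * (s - u) + w_prompt * (f - s)
        for u, s, f in zip(e_none, e_sem, e_full)
    ]


def toy_predict(
    semantic: Optional[Sequence[int]], prompt: Optional[Sequence[float]]
) -> List[float]:
    """Toy stand-in for the S2A network, used only to exercise the formula."""
    sem = 1.0 if semantic is not None else 0.0
    pr = 0.5 if prompt is not None else 0.0
    return [sem + pr, 2.0 * sem - pr]


if __name__ == "__main__":
    tokens, ref = [3, 1, 4], [0.1, 0.2]
    # Prompt-only guidance (semantic weight fixed at 1).
    print(dual_cfg(toy_predict, tokens, ref, w_semantic=1.0, w_prompt=2.0))
    # Guidance on both the semantic tokens and the prompt.
    print(dual_cfg(toy_predict, tokens, ref, w_semantic=2.0, w_prompt=2.0))
```

Note that this variant needs three forward passes per sampling step instead of two, and the model would have to be trained with the semantic tokens randomly dropped as well, which I am not sure is the case for the released checkpoint.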