megatts2
Don't see GE mentioned in the paper
https://github.com/LSimon95/megatts2/blob/c9ca2a88febf9db2cf4d8da0860efc9948db2b76/modules/mrte.py#L63
paper: https://arxiv.org/abs/2307.07218
reference:
GE was removed from the current version to test the performance of the core MRTE on its own, and I can't find the exact structure of GE. I may add it to the newer 24k version for comparison.
I found a description of the timbre encoder in the MegaTTS paper: https://arxiv.org/pdf/2306.03509.pdf
I think this can serve as a reference:
Does anyone know what the difference is between v1 and v4 of the 'MEGATTS2' paper on arXiv? I am really confused. The structure of MEGATTS2 differs between v1 and v4. In v4, the prompt conditions of the PLLM use only Zc, whereas in v1 they use Hct. Does this mean that timbre information is no longer needed? Additionally, v4 does not mention GE. Does this mean that GE is not important?
v1:
v4: