Don't see GE mentioned in the paper

Open skysbird opened this issue 1 year ago • 3 comments

https://github.com/LSimon95/megatts2/blob/c9ca2a88febf9db2cf4d8da0860efc9948db2b76/modules/mrte.py#L63

image

paper: https://arxiv.org/abs/2307.07218

reference: image

Feb 20 '24 07:02 skysbird

GE was removed from the current version to test the core MRTE's performance, and I can't find the exact structure of GE. I may add it to the newer 24k version for comparison.

Feb 20 '24 08:02 LSimon95

> GE was removed from the current version to test the core MRTE's performance, and I can't find the exact structure of GE. I may add it to the newer 24k version for comparison.

I found a description of the timbre encoder in the Mega-TTS paper: https://arxiv.org/pdf/2306.03509.pdf

I think it can serve as a reference:

image

Feb 20 '24 10:02 skysbird

Does anyone know what changed between version 1 and version 4 of the 'MEGATTS2' paper on arXiv? I am really confused, because the model structure differs between v1 and v4. In v4, the prompt conditions of the PLLM use only Zc, whereas v1 uses Hct. Does this mean that timbre information is no longer needed? Additionally, v4 does not mention GE. Does this mean that GE is not important?

v1: image image

v4: image image

Jun 18 '24 03:06 fighting-zeng