hyzhan
I have tried `use_gst=False`, but the result seems to be the same as Tacotron 1: although `refnet_outputs` changes, the generated audio hardly changes with different reference audio.
At what level of `s_error` can we get understandable audio?
@candlewill What is the dataset size for "e2e_lpcnet_samples_share.zip"? It sounds good. My e2e_demo has some noise, and the loss is about 3.35 with the default parameters.
@m-toman @Rayhane-mamah Fine-tuning by swapping out the data seems to produce a voice that differs somewhat from the fine-tuning data. How can I solve this problem if I have not...
@begeekmyfriend @keithito Actually it's not so complicated. Just like this:

```python
from models.attention import LocationSensitiveAttention

attention_mechanism = LocationSensitiveAttention(
    hp.attention_dim,
    encoder_outputs,
    hparams=hp,
    mask_encoder=hp.mask_encoder,
    memory_sequence_length=input_lengths,
    smoothing=hp.smoothing,
    cumulate_weights=hp.cumulative_weights)
```

Then replace the original code in `AttentionWrapper`, `BahdanauAttention(hp.attention_depth, encoder_outputs)`, with...
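To make the swap concrete, here is a minimal, self-contained sketch of the pattern being described: the decoder's `AttentionWrapper` stays the same, and only the attention-mechanism object handed to it changes. The classes below are stand-ins with the constructor signatures mentioned above (the real `LocationSensitiveAttention` lives in `models/attention.py` of the Tacotron-2 repo and requires TensorFlow), so this is illustrative, not the actual implementation.

```python
class BahdanauAttention:
    """Stand-in for the original content-based attention mechanism."""
    def __init__(self, num_units, memory):
        self.num_units = num_units
        self.memory = memory


class LocationSensitiveAttention:
    """Stand-in mirroring the keyword arguments quoted in the comment above."""
    def __init__(self, num_units, memory, hparams=None, mask_encoder=False,
                 memory_sequence_length=None, smoothing=False,
                 cumulate_weights=True):
        self.num_units = num_units
        self.memory = memory
        self.cumulate_weights = cumulate_weights


class AttentionWrapper:
    """Stand-in wrapper: it only records which mechanism it was given."""
    def __init__(self, cell, attention_mechanism):
        self.cell = cell
        self.attention_mechanism = attention_mechanism


encoder_outputs = "encoder_outputs"  # placeholder for the encoder tensor

# Before: wrapper built with Bahdanau (content-based) attention.
wrapper = AttentionWrapper("decoder_cell",
                           BahdanauAttention(128, encoder_outputs))

# After: same wrapper, location-sensitive mechanism swapped in.
wrapper = AttentionWrapper(
    "decoder_cell",
    LocationSensitiveAttention(128, encoder_outputs,
                               mask_encoder=False,
                               memory_sequence_length=None,
                               smoothing=False,
                               cumulate_weights=True))

print(type(wrapper.attention_mechanism).__name__)
```

The point of the sketch is that the change is local: nothing else in the decoder needs to be touched, because `AttentionWrapper` only depends on the mechanism's interface, not its class.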
@jjery2243542 Is it enough to use only the reconstruction loss of the audio, without a classification loss?
> The current version's prosody is weird and I am trying to fix it. Maybe I will train a better version and release the checkpoint this month.

The prosody of the...
Maybe the audio is too short.