shreeshailgan

Results 4 issues of shreeshailgan

In `Section 3.1`, under `Model Configuration`, the paper states that the decoder consists of 4 FFT Transformer blocks. However, the provided checkpoints (and the `model.yaml` configs) have 6 FFT Transformer...

Hey @ming024, could you specify the MFA version you used to generate the TextGrids provided in your repo? Also, did you generate those TextGrids by just aligning using...

I am running `mfa validate` on the LibriTTS-train-clean-460 dataset using an IPA dictionary I have. The output contains: ``` WARNING 288196total OOV tokens ``` However, in the generated `oov_counts.txt` file...

When training the NS2 model, the CE-RVQ loss is calculated by the `diff_ce_loss` method (https://github.com/open-mmlab/Amphion/blob/d33551476d792e608c13cec1bfa32283c868a2fb/models/tts/naturalspeech2/ns2_loss.py#L65). This function takes the ground-truth indices `gt_indices` and the predicted distribution `pred_dist`. For...