FreeVC
I have a question about your WER/CER results in the paper.
In your paper, you report WER and CER results of about 4.23% and 1.46%. Also, you mentioned that you used https://huggingface.co/facebook/hubert-large-ls960-ft as the ASR model.
However, when I run the same ASR model on the ground-truth VCTK utterances, I get a WER/CER of about 6.43% and 1.95%. So I assume our code for measuring WER/CER differs, for example in text normalization or in how errors are aggregated.
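To make the comparison concrete, here is a minimal sketch of one common way to compute corpus-level WER/CER. It is not the code behind either set of numbers. The normalization (uppercasing, stripping everything except letters, apostrophes, and spaces), counting spaces as characters for CER, and pooling errors over the whole corpus rather than averaging per utterance are all assumptions on my side, and the function names are only illustrative. Any of these choices could shift the result by a point or two.

```python
import re


def normalize(text):
    # Assumed normalization: uppercase, keep only letters, apostrophes and spaces.
    text = text.upper()
    text = re.sub(r"[^A-Z' ]+", " ", text)
    return " ".join(text.split())


def edit_distance(ref, hyp):
    # Levenshtein distance between two sequences (words or characters).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="word"):
    # Corpus-level rate: total edit errors divided by total reference length.
    total_errors = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        ref, hyp = normalize(ref), normalize(hyp)
        if unit == "word":
            r, h = ref.split(), hyp.split()
        else:
            # Assumption: spaces count as characters for CER.
            r, h = list(ref), list(hyp)
        total_errors += edit_distance(r, h)
        total_len += len(r)
    return total_errors / total_len


refs = ["Ask her to bring these things."]
hyps = ["ASK HER TO BRING THIS THING"]
wer = error_rate(refs, hyps, unit="word")
cer = error_rate(refs, hyps, unit="char")
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```

Here `refs` would be the VCTK transcripts and `hyps` the decoded ASR outputs. If your evaluation differs in any of the assumptions above, that alone might explain the gap.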
Could you share the code for evaluating WER/CER? Or at least a code fragment of it? Thank you.