self-supervised-speech-recognition

Speech-to-text with self-supervised learning, based on the wav2vec 2.0 framework

Results 29 self-supervised-speech-recognition issues

I want to know what `tgt_dict` is in the stt file. The `process_predictions` method uses `tgt_words` and `hypo_words` to calculate the edit distance, but there is no label...

Hello, my own kenlm language model is always ineffective. I want to use the official kenlm, but the lexicon format does not match. How can I apply the kenlm...

How can I test the WER of the whole model that I trained?
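
For the WER question above, a minimal sketch of a corpus-level word error rate, assuming you already have reference transcripts and model outputs as plain strings. The function names `edit_distance` and `corpus_wer` are hypothetical and are not part of this repository; this does not reproduce the repository's own scoring code, only the standard word-level edit-distance definition.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (hypothetical helper)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words over all utterances."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words
```

Summing errors and reference words over the whole test set before dividing gives a single corpus-level figure, which weights long utterances more than averaging per-utterance WER would.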

Hi, if you run inference with a `Transcriber` object via `t.transcribe(..)`, it should release any resources related to that inference after returning the result. But it stays in VRAM and...

I run the following command to finetune the model: `python finetune.py --transcript_file ./cv-corpus-6.1-2020-12-11/vi/clips/clips.trans.txt --pretrain_model /content/self-supervised-speech-recognition/outputs/2021-06-25/14-39-00/checkpoints/checkpoint_best.pt --dict_file /content/self-supervised-speech-recognition/save_dir/dict.ltr.txt` and I get the following logs: ``` 2021-06-25 15:31:21 | INFO | fairseq_cli.train...

Below is my code: https://github.com/epona7471/YoonKang.github.io/blob/main/install.ipynb As suggested in other issues, I uninstalled Colab's CUDA 11.0 and reinstalled CUDA 10.1, along with the other suggested fixes, but I still get...

Hello, I have a pretrained model with the extension .pth. How can I convert this pretrained model to .pt so that I can use it for finetuning? Thanks!

Hi... As recommended on GitHub, the best chunk size is 10 to 30 seconds. However, the Librispeech dataset is split into chunks of various sizes, starting from 2 seconds. My question...

Hi all, after following the install instructions and downloading your pre-trained models, I executed this code in Colab: ``` from stt import Transcriber transcriber = Transcriber(pretrain_model = '/content/vietnamese_wav2vec/pretrain.pt', finetune_model = '/content/vietnamese_wav2vec/finetune.pt',...

Hi, I trained the model under Linux and everything went fine. However, I want to use the model under Windows for prediction only, but Kenlm and wav2letter cannot be...