seamless_communication
Finetuning for ASR and dataset preparation
Hi, I want to finetune the ASR model on a custom dataset, and two questions have come up:
1. How can I do the finetuning for ASR? Is it enough to modify only finetune.py, and if so, what changes are needed?
2. What format should the dataset be in? What does manifest.json contain? Can anyone share a concrete example of its contents?
{"source": {"id": 1806, "lang": "eng", "text": "", "audio_local_path": "path to .wav", "waveform": null, "sampling_rate": 16000, "units": null}, "target": {"id": 1806, "lang": "urd", "text": "", "audio_local_path": "path to 491841998166793263.wav", "waveform": null, "sampling_rate": 16000, "units": null}}
thx
Can you share a notebook that would be helpful for custom ASR finetuning?