Mitchell DeHaven
@busishengui Did you ever resolve this issue? Having similar issues on a different dataset.
@Giovani-Merlin Is that repo still active? The link is now dead.
@bzp83 The `--patience` flag specifies how many epochs can elapse without an improvement in the best validation loss before training terminates. I was using a patience of...
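For illustration, a minimal sketch of the early-stopping behaviour that flag controls (the loss values and loop below are made up, not the actual trainer's internals):

```python
# Minimal sketch: training stops once `patience` epochs pass without a new
# best validation loss. The per-epoch losses are invented for illustration.
patience = 3
val_losses = [1.9, 1.5, 1.4, 1.45, 1.43, 1.41, 1.42]

best_val_loss = float("inf")
epochs_since_improvement = 0

for epoch, val_loss in enumerate(val_losses):
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_since_improvement = 0
    else:
        epochs_since_improvement += 1

    if epochs_since_improvement >= patience:
        print(f"No improvement for {patience} epochs; stopping at epoch {epoch}.")
        break
```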
STFT is ONNX exportable; you just need `return_complex=False` in the `torch.stft` call (ONNX supports STFT, but not with complex values).
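As a rough sketch, here's what exporting a module that wraps `torch.stft` with `return_complex=False` can look like (the module, input shape, and `n_fft`/`hop_length` values are illustrative; the ONNX STFT op requires opset 17+):

```python
import torch


class STFTModule(torch.nn.Module):
    """Toy module wrapping torch.stft; parameters are illustrative."""

    def __init__(self, n_fft=400, hop_length=160):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, wav):
        # return_complex=False yields a real tensor of shape (..., freq, frames, 2),
        # which ONNX can represent (it has no complex dtype).
        return torch.stft(
            wav,
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=False,
        )


model = STFTModule().eval()
dummy_wav = torch.rand(1, 16000)
torch.onnx.export(model, dummy_wav, "stft.onnx", opset_version=17)
```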
> Using the following repro:
>
> ```python
> from speechbrain.pretrained import EncoderDecoderASR
> import torch
>
> asr_model= EncoderDecoderASR.from_hparams(source="speechbrain/asr-wav2vec2-transformer-aishell", savedir="pretrained_models/asr-transformer-aishell")
> asr_model.eval()
> wavs = torch.rand(1,34492)
> wav_lens =...
> ```
Has anyone gotten this code working who could share the steps required to get it training?
> `distil-small.en` is released here: https://huggingface.co/distil-whisper/distil-small.en
>
> It's quite hard to compress further than this without losing WER performance: https://huggingface.co/distil-whisper/distil-small.en#why-is-distil-smallen-slower-than-distil-large-v2

Is there any way we can access the small...
> > Is there any way we can access the small 2-layer decoder variant?
>
> Yes, _c.f._ https://huggingface.co/distil-whisper/distil-small.en

@sanchit-gandhi From https://huggingface.co/distil-whisper/distil-small.en:

> While distil-medium.en and distil-large-v2 use two decoder...
If you don't mind editing the code and don't want to change versions for whatever reason, you can simply modify the device for that portion of the model to run...
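Roughly what that looks like, with a toy model and made-up submodule names standing in for the real ones:

```python
import torch

# Toy model for illustration; the submodule names are invented.
model = torch.nn.ModuleDict({
    "encoder": torch.nn.Linear(80, 256),
    "decoder": torch.nn.Linear(256, 32),
})

# Keep most of the model on the GPU (if available) but force the problematic
# portion onto the CPU by moving just that submodule.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model["decoder"].to("cpu")

features = torch.rand(1, 80, device=device)
encoded = model["encoder"](features)
# Move the activations to match the submodule's device before calling it.
output = model["decoder"](encoded.to("cpu"))
```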
The speed improvement from 4 decoder layers to 2 decoder layers would probably be negligible.