wav2vec2-sprint issues

Generate the N-best (top few) hypotheses

Hello Everyone, I'm currently working on a project involving Wav2vec2 Automatic Speech Recognition (ASR) system. Specifically, I'm trying to generate an N-best list of hypotheses using the Wav2vec2 ASR. I've...

cyfer0618

How to get the phoneme?

2

Hi, bro, very appreciate for your work. The method to load the model can directly get the word. But I want the phoneme. Do you know how to get the...

psuu0001

Export model to ONNX

2

Hello, do you have any idea how to export the models to the onnx format? I tried a script from huggingface: https://github.com/huggingface/transformers/blob/master/src/transformers/convert_graph_to_onnx.py but it gives an error: `Error while converting...

marlon-br

YAML Metadata Error: "language[0]" must only contain lowercase characters for zh-CN

I am using common voice dataset, and the language is zh-CN. But when I upload to HF, it will show YAML Metadata Error: "language[0]" must only contain lowercase characters for...

haloha123

wav2vec2-xls-r-1b_cv8_it

Hi Jonatas, I'm trying replicate your performance about to fine-tuned facebook/wav2vec2-xls-r-1b on Italian using the train and validation splits of Common Voice 8.0 but I can't get nearly the same...

ghost

Predictions are always pad_token, logits are always the same distribution for different time_step and speech utterances.

2

Hi, thanks for your wav2vec2 fine-tuning scripts first. Recently, I used this script (run_common_voice.py) to fine-tune the newest **Hubert model** for ASR on Chinese language (zh-CN). Hubert basically has the...

louislau1129

Wav2Vec2-XLSR-German-53

1

Hello! Congrats on the repo! One question, what dataset you used for fine-tuning the Wav2Vec2-XLSR-German-53 and how long was the data? It was only German language, right?

jvel07

I can't run to train model

2

Hello, I am trying to launch run_common_voice.py but I got this error "argument --gradient_checkpointing: conflicting option string: --gradient_checkpointing". I ran stript like finetune.sh

Elvares

how much does the model gain in performance doing audio augmentations ?

1

hey @jonatasgrosman , could you please tell me how much of performance improvements did you see after doing the noise augmentation steps, i have also found additional audio augmentations, would...

StephennFernandes

settings

4

Hi Jonatas, I'm trying to replicate your performance on the dutch language but can't get nearly the same WER. What are the parameters you use to train the final model?...

B3AU

wav2vec2-sprint
wav2vec2-sprint copied to clipboard

Metadata

Generate the N-best (top few) hypotheses

How to get the phoneme?

Export model to ONNX

YAML Metadata Error: "language[0]" must only contain lowercase characters for zh-CN

wav2vec2-xls-r-1b_cv8_it

Predictions are always pad_token, logits are always the same distribution for different time_step and speech utterances.

Wav2Vec2-XLSR-German-53

I can't run to train model

how much does the model gain in performance doing audio augmentations ?

settings

← Metadata

Owner

Metadata

wav2vec2-sprint wav2vec2-sprint copied to clipboard

Metadata

← Metadata

Owner

Metadata

wav2vec2-sprint
wav2vec2-sprint copied to clipboard