icefall
Very minor change in alimeeting recipe (revised)
No need for load_audio. The alimeeting audio data is in wav format, while export_to_webdataset defaults to "flac". If load_audio is left at its default (True), webdataset shows "[Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'" while writing the data, and then fails afterwards.
The flac issue could be related to torchaudio. If you set torchaudio.set_audio_backend("soundfile"), the issue should be gone; you can also change flac to something else, which would also resolve the issue.
Setting audio_format from "flac" to "wav" didn't solve the issue. I didn't try torchaudio.set_audio_backend("soundfile"), but I guess it should work.
Hmmm can you show the full stack trace?
2022-08-31 23:36:05,315 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,319 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,323 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,326 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,329 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,335 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,338 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,341 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,345 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,349 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,352 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
Creating WebDataset tarball(s): 16339it [01:42, 205.42it/s]
2022-08-31 23:36:05,359 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,362 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,367 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,371 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,375 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,379 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,383 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,386 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,389 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,393 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,396 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,399 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,403 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,407 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,410 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,414 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,422 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,427 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,432 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
Creating WebDataset tarball(s): 16358it [01:42, 159.97it/s]
2022-08-31 23:36:05,434 INFO [webdataset.py:164] Exported 0 cuts out of 16358 total into 1 shards (there were 16358 cuts with errors).
2022-08-31 23:36:05,436 INFO [asr_datamodule.py:340] About to create dev dataset
Traceback (most recent call last):
File "./pruned_transducer_stateless2/decode.py", line 621, in
What's your version of lhotse, torch, torchaudio, and webdataset?
lhotse 1.5.0; torch 1.7.1; torchaudio 0.7.2; webdataset 0.2.18
Torchaudio 0.7.2 uses the old "sox" backend by default, and that backend doesn't support the "format" keyword argument. You can fix that by setting e.g. torchaudio.set_audio_backend("soundfile").
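For anyone hitting the same warning, the diagnosis above can be checked before exporting. The following is only a sketch: accepts_keyword, the two stand-in save functions, and ensure_format_capable_backend are hypothetical names introduced here for illustration, and the sketch assumes that torchaudio.save reflects the currently active backend and that set_audio_backend is available (it was removed in later torchaudio releases).

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for a save() without and with a `format` keyword.
# They are NOT the real torchaudio signatures, only an illustration.
def save_without_format(filepath, src, sample_rate):
    pass


def save_with_format(filepath, src, sample_rate, format=None):
    pass


def ensure_format_capable_backend() -> None:
    """Switch to the "soundfile" backend if save() lacks `format`.

    Assumption: old torchaudio (e.g. 0.7.x) with the soundfile package installed.
    """
    import torchaudio  # imported lazily so the helper above works without it

    if not accepts_keyword(torchaudio.save, "format"):
        torchaudio.set_audio_backend("soundfile")


print(accepts_keyword(save_without_format, "format"))  # False
print(accepts_keyword(save_with_format, "format"))     # True
```

Calling ensure_format_capable_backend() once at the top of the decoding or export script, before export_to_webdataset runs, would be the intended usage under those assumptions.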
Thanks. BTW, some transcripts in the dataset are likely incorrect (e.g. the segment duration is around 1 s while the text has almost 40 Chinese characters). This causes training to collapse, because the encoder output length T is shorter than the total number of symbols S, so you need a filter to remove those training utterances. I obtained 29.81 WER on dev and 32.25 on test with 15 epochs + 3 avg using modified_beam_search.
Thanks! Would you be willing to make a PR that fixes it?
Hmm... maybe anyone who sees this can just add something similar to this (remove_frames_less_tokens) to the train_cuts filtering in train.py of their own recipe.
def remove_frames_less_tokens(c: Cut):
    # Keep a cut only if it has at least ~0.0667 s of audio per text symbol.
    return c.duration >= len(c.supervisions[0].text) * 0.06666
    # Alternative: compare the subsampled frame count with the symbol count.
    # return ((c.duration * 100 - 1) // 2 - 1) // 2 >= len(c.supervisions[0].text)

train_cuts = train_cuts.filter(remove_short_and_long_utt)
train_cuts = train_cuts.filter(remove_frames_less_tokens)
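To make the commented-out alternative concrete, here is a small standalone sketch of the frame arithmetic, applied to the kind of segment described above (about 1 s of audio with 40 characters). It assumes 100 feature frames per second, as implied by the factor 100 in the formula, and treats each character as one symbol; num_encoder_frames and keep_utterance are hypothetical helper names, not part of icefall or lhotse.

```python
def num_encoder_frames(duration_s: float, frames_per_second: int = 100) -> int:
    """Frames left after the two subsampling steps in the formula above."""
    num_frames = duration_s * frames_per_second
    return int(((num_frames - 1) // 2 - 1) // 2)


def keep_utterance(duration_s: float, text: str) -> bool:
    """Keep an utterance only if T (encoder frames) >= S (number of symbols)."""
    return num_encoder_frames(duration_s) >= len(text)


text = "字" * 40  # placeholder transcript with 40 characters

print(num_encoder_frames(1.0))    # 24 frames for a 1 s segment
print(keep_utterance(1.0, text))  # False: 24 frames < 40 symbols
print(keep_utterance(3.0, text))  # True: 74 frames >= 40 symbols
```

With these assumptions a 1 s segment yields only 24 encoder frames, so a 40-character transcript cannot be aligned and the cut is dropped. The duration-based version of the filter would require at least 40 * 0.06666, about 2.67 s, for the same text.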
I think it is a common issue with conversational audio, where the transcript can sometimes contain more tokens than the audio has frames. See https://github.com/k2-fsa/icefall/issues/603 for a related issue. In some experiments with AMI, I found that ~0.2% of segments had this problem.
If you are throwing away too much data in this way, I would suggest increasing your number of BPE tokens.