
Very minor change in alimeeting recipe (revised)

Open rickychanhoyin opened this issue 3 years ago • 11 comments

There is no need to load the audio (load_audio). The alimeeting audio is in WAV format, while export_to_webdataset defaults to "flac". With the default load_audio=True, webdataset logs "[Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'" while writing the data, and the recipe fails afterwards.
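A minimal sketch of exporting without decoding the audio, assuming lhotse's export_to_webdataset accepts the load_audio argument named above. The wrapper name export_cuts_without_audio and its parameters are hypothetical, not part of the recipe.

```python
def export_cuts_without_audio(cuts, output_path):
    """Write cuts to a WebDataset tarball without decoding or re-encoding audio.

    `cuts` is assumed to be a lhotse CutSet and `output_path` a tarball path.
    """
    # Imported lazily so this sketch can be defined without lhotse installed.
    from lhotse.dataset.webdataset import export_to_webdataset

    # With load_audio=False only the manifests are serialized, so the
    # audio-saving step that raised the TypeError should not be reached.
    return export_to_webdataset(cuts, output_path=output_path, load_audio=False)
```

With this setting the tarball carries only the cut manifests, so the original WAV files must stay readable at their recorded paths when the data is loaded.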

rickychanhoyin avatar Aug 31 '22 05:08 rickychanhoyin

The flac issue could be related to torchaudio. If you call torchaudio.set_audio_backend("soundfile"), the issue should go away; changing flac to some other format should also resolve it.

pzelasko avatar Aug 31 '22 11:08 pzelasko

Setting audio_format from "flac" to "wav" didn't solve the issue. I haven't tried torchaudio.set_audio_backend("soundfile"), but I guess it should work.

rickychanhoyin avatar Aug 31 '22 15:08 rickychanhoyin

Hmmm can you show the full stack trace?

pzelasko avatar Aug 31 '22 15:08 pzelasko

2022-08-31 23:36:05,315 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,319 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,323 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,326 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,329 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,335 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,338 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,341 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,345 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,349 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,352 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
Creating WebDataset tarball(s): 16339it [01:42, 205.42it/s]
2022-08-31 23:36:05,359 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,362 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,367 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,371 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,375 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,379 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,383 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,386 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,389 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,393 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,396 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,399 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,403 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,407 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,410 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,414 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,422 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,427 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
2022-08-31 23:36:05,432 WARNING [utils.py:735] [Suppressed TypeError] Error message: save() got an unexpected keyword argument 'format'
Creating WebDataset tarball(s): 16358it [01:42, 159.97it/s]
2022-08-31 23:36:05,434 INFO [webdataset.py:164] Exported 0 cuts out of 16358 total into 1 shards (there were 16358 cuts with errors).
2022-08-31 23:36:05,436 INFO [asr_datamodule.py:340] About to create dev dataset
Traceback (most recent call last):
  File "./pruned_transducer_stateless2/decode.py", line 621, in <module>
    main()
  File "/home/ricky/test-icefall/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "./pruned_transducer_stateless2/decode.py", line 597, in main
    dev_dl = alimeeting.valid_dataloaders(cuts_dev_webdataset)
  File "/home/ricky/k2/icefall/egs/alimeeting/ASR/pruned_transducer_stateless2/asr_datamodule.py", line 354, in valid_dataloaders
    valid_sampler = DynamicBucketingSampler(
  File "/home/ricky/test-icefall/lib/python3.8/site-packages/lhotse/dataset/sampling/dynamic_bucketing.py", line 146, in __init__
    self.duration_bins = estimate_duration_buckets(
  File "/home/ricky/test-icefall/lib/python3.8/site-packages/lhotse/dataset/sampling/dynamic_bucketing.py", line 274, in estimate_duration_buckets
    assert num_buckets <= durs.shape[0], (
AssertionError: The number of buckets (10) must be smaller than or equal to the number of cuts (0).

rickychanhoyin avatar Aug 31 '22 15:08 rickychanhoyin

What's your version of lhotse, torch, torchaudio, and webdataset?

pzelasko avatar Aug 31 '22 16:08 pzelasko

lhotse 1.5.0; torch 1.7.1; torchaudio 0.7.2; webdataset 0.2.18

rickychanhoyin avatar Sep 01 '22 04:09 rickychanhoyin

Torchaudio 0.7.2 uses the old "sox" backend by default, and it doesn't support the "format" keyword argument. You can fix that by calling, e.g., torchaudio.set_audio_backend("soundfile").
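A minimal sketch of that workaround, written defensively. The helper name use_soundfile_backend is hypothetical. It assumes an older torchaudio where set_audio_backend is still available, and it simply reports failure on releases that no longer expose a global backend switch or when the soundfile package is missing.

```python
def use_soundfile_backend() -> bool:
    """Try to switch torchaudio to the "soundfile" backend.

    Returns True if the call went through, False otherwise.
    """
    try:
        import torchaudio
    except Exception:
        # torchaudio is not installed (or failed to load).
        return False

    setter = getattr(torchaudio, "set_audio_backend", None)
    if setter is None:
        # This torchaudio release has no global backend switch.
        return False

    try:
        setter("soundfile")
    except Exception:
        # For example, the soundfile package is not installed.
        return False
    return True
```

Calling use_soundfile_backend() once, before the WebDataset export starts, should be enough, since the backend setting is process-wide.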

pzelasko avatar Sep 03 '22 00:09 pzelasko

Thanks. BTW, some transcripts in the dataset are likely incorrect (e.g. a segment lasts around 1 s while its text has almost 40 Chinese characters). This makes training collapse, because the encoder output length T is shorter than the number of symbols S, so you need a filter to remove those training utterances. I obtained 29.81 WER on dev and 32.25 on test with 15 epochs + 3 avg using modified_beam_search.

rickychanhoyin avatar Sep 03 '22 00:09 rickychanhoyin

Thanks! Would you be willing to make a PR that fixes it?

pzelasko avatar Sep 03 '22 00:09 pzelasko

Hmm... maybe anyone who sees this can just add something like the following (remove_frames_less_tokens) to the train_cuts filtering in train.py of their own recipe.

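from lhotse.cut import Cut

def remove_frames_less_tokens(c: Cut) -> bool:
    # Keep a cut only if it has at least ~0.0667 s of audio per transcript character.
    # Frame-based alternative:
    # return ((c.duration * 100 - 1) // 2 - 1) // 2 >= len(c.supervisions[0].text)
    return c.duration >= len(c.supervisions[0].text) * 0.06666

train_cuts = train_cuts.filter(remove_short_and_long_utt)
train_cuts = train_cuts.filter(remove_frames_less_tokens)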
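To see why a 1 s segment with about 40 characters breaks training, the commented-out frame formula can be worked through on its own. This is only a sketch: it assumes 100 feature frames per second (a 10 ms frame shift) followed by two rounds of (n - 1) // 2 subsampling, as that formula implies, and the helper names encoder_frames and has_enough_frames are hypothetical.

```python
def encoder_frames(duration: float) -> int:
    """Approximate encoder output length T for a segment of `duration` seconds.

    Assumes 100 feature frames per second, then two (n - 1) // 2 reductions,
    mirroring the commented-out formula in the filter.
    """
    feature_frames = duration * 100
    return int(((feature_frames - 1) // 2 - 1) // 2)


def has_enough_frames(duration: float, text: str) -> bool:
    """True if the encoder output is at least as long as the transcript (T >= S)."""
    return encoder_frames(duration) >= len(text)


example_text = "字" * 40  # stand-in for a 40-character transcript

print(encoder_frames(1.0))                   # 24
print(has_enough_frames(1.0, example_text))  # False: 24 frames < 40 characters
print(has_enough_frames(2.0, example_text))  # True: 49 frames >= 40 characters
```

The duration-based rule used in the filter is stricter than this frame count: at 0.06666 s per character, a 40-character transcript needs roughly 2.67 s of audio to be kept.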

rickychanhoyin avatar Sep 03 '22 01:09 rickychanhoyin

I think this is a common issue with conversational audio, where the transcript sometimes has more tokens than the audio has frames. See https://github.com/k2-fsa/icefall/issues/603 for a related issue. In some experiments with AMI, I found that ~0.2% of segments had this problem.

If you are throwing away too much data this way, I would suggest increasing your number of BPE tokens, since a larger vocabulary encodes each transcript in fewer tokens.

desh2608 avatar Nov 14 '22 21:11 desh2608