[Fairseq-Librispeech] 'AudioMetaData' object is not iterable
Hi,
When running the script $fairseq_root/examples/speech_recognition/datasets/prepare-librispeech.sh, I got this error: "generated an exception: 'AudioMetaData' object is not iterable"
The error occurs during the step that prepares the train and test JSONs, and is raised by the script $fairseq_root/examples/speech_recognition/datasets/asr_prep_json.py, which is called by prepare-librispeech.sh.
Could you please have a look and see how the error could be fixed?
Many thanks, DL
Hi, I ran into the same problem and solved it.
QUESTION: The error comes from the /fairseq/examples/speech_recognition task, in the file */speech_recognition/datasets/asr_prep_json.py
In this file, the error is located in the function process_sample. The failing line is: si, _ = torchaudio.info(aud_path)
The terminal prints many lines of "generated an exception: 'AudioMetaData' object is not iterable"
SOLUTION: The main cause of this error is the torchaudio version. In torchaudio ~0.7.0, torchaudio.info returns two objects; in 0.8.0 it returns a single AudioMetaData object, so unpacking the result into two names fails.
For torchaudio ~0.7.0:
si, ei = torchaudio.info(filepath)
sample_rate = si.rate
num_frames = si.length
num_channels = si.channels
precision = si.precision
bits_per_sample = ei.bits_per_sample
encoding = ei.encoding
For torchaudio 0.8.0:
metadata = torchaudio.info(filepath)
sample_rate = metadata.sample_rate
num_frames = metadata.num_frames
num_channels = metadata.num_channels
bits_per_sample = metadata.bits_per_sample
encoding = metadata.encoding
(More details can be seen in https://github.com/pytorch/audio/issues/903)
So you can either change the torchaudio version, or change this code in asr_prep_json.py:
si, ei = torchaudio.info(aud_path)
input["length_ms"] = int(
    si.length / si.channels / si.rate / MILLISECONDS_TO_SECONDS
)
to
si = torchaudio.info(aud_path)
input["length_ms"] = int(
    si.num_frames / si.num_channels / si.sample_rate / MILLISECONDS_TO_SECONDS
)
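If you would rather not pin one torchaudio version, the two return shapes described above can be handled in one place. The following is only a sketch: the helper name length_ms_from_info is made up for illustration, and MILLISECONDS_TO_SECONDS = 0.001 is assumed to match the constant in asr_prep_json.py. It also assumes that the old API returns a plain tuple. One caveat: as far as I understand, the new num_frames is already counted per channel (the old si.length counted samples across all channels), so this sketch does not divide by num_channels. For mono LibriSpeech audio the result is the same either way. The demo uses stand-in objects so it runs without torchaudio installed.

```python
from types import SimpleNamespace

# Assumed to match the constant defined in asr_prep_json.py.
MILLISECONDS_TO_SECONDS = 0.001


def length_ms_from_info(info):
    """Return the clip length in ms for either torchaudio.info() return shape.

    Hypothetical helper, not part of fairseq or torchaudio.
    """
    if isinstance(info, tuple):
        # Old API (torchaudio ~0.7.0): (signal_info, encoding_info).
        si, _ = info
        # si.length counts samples over all channels, hence the division.
        return int(si.length / si.channels / si.rate / MILLISECONDS_TO_SECONDS)
    # New API (torchaudio 0.8.0): a single metadata object.
    # Assumption: num_frames is per channel, so no division by num_channels.
    return int(info.num_frames / info.sample_rate / MILLISECONDS_TO_SECONDS)


# Stand-ins that mimic the two return shapes for a 1-second, 16 kHz mono clip.
old_style = (SimpleNamespace(length=16000, channels=1, rate=16000), SimpleNamespace())
new_style = SimpleNamespace(num_frames=16000, num_channels=1, sample_rate=16000)

old_ms = length_ms_from_info(old_style)
new_ms = length_ms_from_info(new_style)
print(old_ms, new_ms)
```

In process_sample this would be used as input["length_ms"] = length_ms_from_info(torchaudio.info(aud_path)).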
Hope this helps, @deeplearner01