[Fairseq-Librispeech] 'AudioMetaData' object is not iterable

Open deeplearner01 opened this issue 3 years ago • 1 comments

Hi,

When running the script $fairseq_root/examples/speech_recognition/datasets/prepare-librispeech.sh, I got this error: "generated an exception: 'AudioMetaData' object is not iterable"

The error occurs in the step that prepares the train and test JSON files. It is raised by the script $fairseq_root/examples/speech_recognition/datasets/asr_prep_json.py, which is called by prepare-librispeech.sh.

Could you please have a look and see how the error could be fixed?

Many thanks, DL

May 23 '22 14:05 deeplearner01

Hi, I ran into the same problem and solved it.

QUESTION: The error comes from the /fairseq/examples/speech_recognition task, in the file */speech_recognition/datasets/asr_prep_json.py

In this file, the error is located in the function process_sample. It is caused by a single line in that function: si, _ = torchaudio.info(aud_path)

The terminal prints many lines of "generated an exception: 'AudioMetaData' object is not iterable"
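To illustrate why that line fails, here is a minimal sketch that does not need torchaudio. The AudioMetaData class below is a hypothetical stand-in for the object that newer torchaudio versions return, and unpack_like_old_code is an illustrative name, not a function from fairseq. The exact wording of the exception message may differ depending on the Python version.

```python
class AudioMetaData:
    """Hypothetical stand-in for the single object returned by newer torchaudio.info()."""

    def __init__(self, sample_rate, num_frames, num_channels):
        self.sample_rate = sample_rate
        self.num_frames = num_frames
        self.num_channels = num_channels


def unpack_like_old_code(info):
    # Mirrors `si, _ = torchaudio.info(aud_path)`: this only works
    # when `info` is an iterable of exactly two items.
    si, _ = info
    return si


meta = AudioMetaData(sample_rate=16000, num_frames=32000, num_channels=1)

try:
    unpack_like_old_code(meta)
    error = None
except TypeError as exc:
    # A plain object cannot be unpacked into two names, so Python raises TypeError.
    error = exc
    print(f"generated an exception: {exc}")
```

A two-item tuple, which is what the older API returned, unpacks without any error, so the same line works there.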

SOLUTION: The cause of this error is the torchaudio version, because the return value of torchaudio.info differs between versions. For torchaudio ~0.7.0:

si, ei = torchaudio.info(filepath)
sample_rate = si.rate
num_frames = si.length
num_channels = si.channels
precision = si.precision
bits_per_sample = ei.bits_per_sample
encoding = ei.encoding

For torchaudio 0.8.0:

metadata = torchaudio.info(filepath)
sample_rate = metadata.sample_rate
num_frames = metadata.num_frames
num_channels = metadata.num_channels
bits_per_sample = metadata.bits_per_sample
encoding = metadata.encoding

(More details can be seen in https://github.com/pytorch/audio/issues/903)

So you can either change the torchaudio version, or change the code in asr_prep_json.py from:

si, ei = torchaudio.info(aud_path)
input["length_ms"] = int(
    si.length / si.channels / si.rate / MILLISECONDS_TO_SECONDS
)

to:

si = torchaudio.info(aud_path)
input["length_ms"] = int(
    si.num_frames / si.num_channels / si.sample_rate / MILLISECONDS_TO_SECONDS
)
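If the script has to run under both torchaudio versions, one option is a small helper that accepts either return shape. This is only a sketch: length_ms_from_info is a hypothetical name, the value 0.001 for MILLISECONDS_TO_SECONDS is an assumption about the constant defined in asr_prep_json.py, and the stand-in objects at the bottom replace real torchaudio.info() results so the example runs without torchaudio installed.

```python
from types import SimpleNamespace

# Assumption: mirrors the constant defined in asr_prep_json.py.
MILLISECONDS_TO_SECONDS = 0.001


def length_ms_from_info(info):
    """Compute the clip length in ms from either torchaudio.info() return shape."""
    if isinstance(info, tuple):
        # Older API: a (signal_info, encoding_info) pair.
        si, _ = info
        length, channels, rate = si.length, si.channels, si.rate
    else:
        # Newer API: a single metadata object.
        length, channels, rate = info.num_frames, info.num_channels, info.sample_rate
    # Same arithmetic as the two snippets above.
    return int(length / channels / rate / MILLISECONDS_TO_SECONDS)


# Stand-ins for the two return shapes (2 seconds of mono audio at 16 kHz).
old_style = (SimpleNamespace(length=32000, channels=1, rate=16000), None)
new_style = SimpleNamespace(num_frames=32000, num_channels=1, sample_rate=16000)

print(length_ms_from_info(old_style), length_ms_from_info(new_style))
```

In asr_prep_json.py this would be used as input["length_ms"] = length_ms_from_info(torchaudio.info(aud_path)). LibriSpeech recordings are single-channel, so the division by the channel count does not change the result for this dataset.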

Hope this helps, @deeplearner01

Oct 15 '22 16:10 jiangjin1999