ReMASC
wave header missing extended part of fmt chunk
In our recent experiment, we found that sox and sox-based toolkits (e.g., torchaudio) output a warning "wave header missing extended part of fmt chunk" when loading the .wav files.
This warning doesn't affect the use of the dataset, so you can safely ignore it. If you want to remove it, use either of the following methods:
- Download the new version of the dataset, in which we've already fixed the problem.
- Use "fox_sox_warn.py" to fix the problem; you only need to change the paths in lines 27 and 28.
The "wave header missing extended part of fmt chunk" warning is caused by a bug in MATLAB 2018, the software used to process and generate the wave files. According to the WAVE file header definition, the Format chunk must have an extended portion for all formats other than PCM. The extension can be of zero length, but its size field (with value 0) must still be present. Many ReMASC recordings use the IEEE_FLOAT format instead of PCM to provide higher precision, but the MATLAB audiowrite function doesn't add the zero-valued extension size field to the wave header, which triggers this warning.
Reference:
- https://github.com/Distrotech/sox/blob/0242e319c894156bf53cd9446c65f7c9d129008b/src/wav.c
- http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
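To make the header issue concrete, here is a minimal Python sketch of how the missing extension could be detected and patched. It is not the script from this repository: the function names (`find_fmt_chunk`, `is_missing_fmt_extension`, `add_fmt_extension`) are hypothetical, and it assumes a standard little-endian RIFF/WAVE file whose Format chunk is exactly 16 bytes long. It only adds the zero-valued extension size field and does not touch any other chunk.

```python
import struct

WAVE_FORMAT_PCM = 0x0001  # format tag for plain PCM; IEEE_FLOAT is 0x0003


def find_fmt_chunk(data: bytes):
    """Return (header_offset, chunk_size, format_tag) of the 'fmt ' chunk."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after 'RIFF' <size> 'WAVE'
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            (format_tag,) = struct.unpack_from("<H", data, pos + 8)
            return pos, size, format_tag
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no fmt chunk found")


def is_missing_fmt_extension(data: bytes) -> bool:
    """True if a non-PCM file has no extension size field (chunk < 18 bytes)."""
    _, size, format_tag = find_fmt_chunk(data)
    return format_tag != WAVE_FORMAT_PCM and size < 18


def add_fmt_extension(data: bytes) -> bytes:
    """Insert a zero-valued extension size field into a 16-byte non-PCM fmt chunk."""
    pos, size, format_tag = find_fmt_chunk(data)
    if format_tag == WAVE_FORMAT_PCM or size != 16:
        return data  # nothing to fix, or a layout this sketch does not handle
    body_end = pos + 8 + 16  # end of the 16-byte fmt body
    (riff_size,) = struct.unpack_from("<I", data, 4)
    out = bytearray(data[:body_end] + b"\x00\x00" + data[body_end:])
    struct.pack_into("<I", out, 4, riff_size + 2)  # RIFF size grows by 2 bytes
    struct.pack_into("<I", out, pos + 4, 18)       # fmt chunk is now 18 bytes
    return bytes(out)
```

To try it on a file, read the bytes with `pathlib.Path(path).read_bytes()`, pass them to `add_fmt_extension`, and write the result to a new file so the original recording stays untouched.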
Where to download the new version of the dataset?
Hi @edjekadetje, thanks for your interest.
It has been years, so I might be wrong, but I think the IEEE DataPort version is the latest version and has this problem solved.
-Yuan