wave header missing extended part of fmt chunk

Open YuanGongND opened this issue 5 years ago • 2 comments

In our recent experiment, we found that sox and sox-based toolkits (e.g., torchaudio) output a warning "wave header missing extended part of fmt chunk" when loading the .wav files.

This warning doesn't affect the use of the dataset, so you can safely ignore it. If you want to get rid of it, use either of the following methods:

  1. Download the new version of the dataset, in which we've already fixed this.
  2. Use "fox_sox_warn.py" to fix the problem; you just need to change the paths on lines 27 and 28.

The "wave header missing extended part of fmt chunk" warning is caused by a bug in MATLAB 2018, the software used to process and generate the wave files. According to the definition of the WAVE file header, for all formats other than PCM, the Format chunk must have an extended portion. The extension can be of zero length, but its size field (with value 0) must be present. Many ReMASC recordings use the IEEE_FLOAT format instead of PCM to provide higher precision, but MATLAB's audiowrite function doesn't automatically add the zero-valued extension size field to the wave header, which leads to this warning.
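For readers who want to see what the fix amounts to at the byte level, here is a minimal sketch in Python. It is not the script from the repository; the function name `add_fmt_extension` is made up for illustration, and it assumes a plain RIFF/WAVE file whose non-PCM fmt chunk is exactly 16 bytes long (i.e. the extension size field is missing). It appends a 2-byte zero size field to the fmt chunk and bumps the fmt and RIFF sizes accordingly.

```python
import struct


def add_fmt_extension(data: bytes) -> bytes:
    """Return WAV bytes with a zero-length fmt extension added if it is missing.

    Illustrative helper (not part of ReMASC): PCM files and files that
    already carry the extension size field are returned unchanged.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    pos = 12  # first chunk starts right after "RIFF", size, "WAVE"
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", data, pos + 4)
        body = pos + 8

        if chunk_id == b"fmt ":
            (format_tag,) = struct.unpack_from("<H", data, body)
            # Format tag 1 is PCM, which needs no extension; a chunk size
            # other than 16 means the header is not the bare 16-byte form.
            if format_tag == 1 or chunk_size != 16:
                return data
            (riff_size,) = struct.unpack_from("<I", data, 4)
            return (
                data[:4]
                + struct.pack("<I", riff_size + 2)  # file grows by 2 bytes
                + data[8:pos + 4]                   # "WAVE", earlier chunks, "fmt "
                + struct.pack("<I", 18)             # new fmt chunk size
                + data[body:body + 16]              # original fmt fields
                + struct.pack("<H", 0)              # extension size field = 0
                + data[body + 16:]                  # remaining chunks
            )

        # Chunks are padded to an even number of bytes.
        pos = body + chunk_size + (chunk_size & 1)

    raise ValueError("no fmt chunk found")
```

To patch a file, read it with `open(path, "rb").read()`, pass the bytes through `add_fmt_extension`, and write the result back in binary mode. Try it on a copy first, since the sketch has only been reasoned through for the simple header layout described above.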

References:
https://github.com/Distrotech/sox/blob/0242e319c894156bf53cd9446c65f7c9d129008b/src/wav.c
http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

YuanGongND avatar Jan 29 '20 05:01 YuanGongND

Where to download the new version of the dataset?

edjekadetje avatar Apr 16 '23 09:04 edjekadetje

Hi @edjekadetje , thanks for your interest.

It has been years, so I might be wrong, but I think the IEEE DataPort version is the latest version and has this problem solved.

-Yuan

YuanGongND avatar Apr 16 '23 09:04 YuanGongND