GoAudio
Handle 'JUNK' chunks when reading in wave files
I was trying to use this module to read in this wave file and was having some trouble.
It looks like before the "fmt" chunk, it's possible to have a chunk that begins with "JUNK" that should be skipped.
I believe it's described on this page about the "RIFF" format: https://www.daubnet.com/en/file-format-riff The fix would go around here: https://github.com/DylanMeeus/GoAudio/blob/master/wave/reader.go#L60
Tangentially, there was a similar issue on the tensorflow project: https://github.com/tensorflow/tensorflow/issues/26247
Hi @ryjose1
That's a good catch, I had no idea! I'll take a look at this, thank you for linking a wave file for testing.
@ryjose1
Do you mind trying against the latest master? Should have a fix in place!
Sorry for the delay, life got a little busy. It looks like this file is weird because it also has 24 bits per sample; I'm no longer crashing when reading in the file, but it looks like I'm erroring out when writing the file back to disk for testing.
For context, I'm taking the file and putting it through the following flow:
    wav, err := wave.ReadWaveFile(file)
    batches := wave.BatchSamples(wav, 1.0)
    waveFmt := wave.NewWaveFmt(1, 2, 44100, 24, []byte{})
    for _, batch := range batches {
        wave.WriteFrames(batch, waveFmt, "file_pt1.wav")
    }
No worries about the delay, thanks for testing! I will look into writing the 24 bit files soonish.
So this file is pretty odd: the "fmt " chunk is larger than the expected size, but it doesn't report that.
// FMT starts here (6d66 2074)
00002d0 0000 6d66 2074 0028 0000 0001 0002 bb80
00002e0 0000 6500 0004 0006 0018 0000 0000 0000
00002f0 0000 0000 0000 0000 0000 0000 0000 0000
0000300 0000 696d 666e 0010 0000 a1c0 b21a 92bf
0000310 01d2 0002 0000 0000 0000 6c65 316d 00d6
0000320 0000 0000 0000 0000 0000 0000 0000 0000
That is the entire content of the fmt chunk. By default it should be 16 bytes wide, unless it reports more. The field that is supposed to report "more data" is set to 0, yet the chunk does contain more than 16 bytes.
Very strange file indeed - I'll have to look a bit more into how I can handle such cases gracefully.
For some context, this file isn't a blocker for me, though it's just my luck that the arbitrary test file I picked has so many quirks. Thanks for your help so far/going forward!