flac.exe on 4 GiB PCM: truncates without warning when channelcount*bytepersample is 1, 3, 5 or 15
Expected when trying to flac a malformed .wav with > 2^32-1 bytes of PCM: a rejection, unless --ignore-chunk-sizes is called Typical flac.exe behaviour: the 2^32-1 limit encountered in a partial sample and therefore that error (but, file written - is that desirable?) But: silent truncation (lossy) if: 2^32-1 is an integer number of samples per channel, like in the attached example file.
I've been pondering how to fix this, but I think there is no clean solution possible.
The extra data that follows could belong to the data chunk, but it could also be the start of a new chunk. For this specific file, it is clear it isn't (it is too short to be a new chunk) but a more general approach isn't possible.
Perhaps checking the length of the master RIFF chunk would help here?
Not sure if I understood correctly ... pretty sure I didn't, but anyway:
- Either the file declares audio to be 2^32-1 bytes long - of maximum length - and then it isn't malformed. At least not this kind of malformed, and we are not in the situation I reported at all?
- Or if it declares something else, and then FLAC (without --ignore-chunk-sizes) would object because audio doesn't match declaration?
Was that ... off mark?
- Or if it declares something else, and then FLAC (without --ignore-chunk-sizes) would object because audio doesn't match declaration?
The problem is that there is no way for FLAC to determine whether the audio doesn't match the declaration, except when there is a partial sample. At first I tried to fix this by simply checking whether there was any left-over data after the data chunk (the audio) but it is very well possible there is another chunk (with metadata) that follows the data chunk.
So, the only way to determine where the audio starts and stops is with the chunk length. --ignore-chunk-sizes is a hack (it does warn to use with caution after all) that assumes there is no other chunk after the data chunk, but this is not necessarily the case.
FLAC should warn that there is data trailing the data chunk (which could be audio or another chunk) and it doesn't do that indeed. That approach would result in a warning, not an error.
The issue was not about having flac --ignore-chunk-sizes tell audio from non-audio, it was about behaviour in the absence of --ignore-chunk-sizes . If I understand the following correctly, it hits spot on to what I tried to raise this issue over:
FLAC should warn that there is data trailing the data chunk (which could be audio or another chunk) and it doesn't do that indeed. That approach would result in a warning, not an error.
- That means: FLAC should report back (as "warning" or "error", that is a policy question) that the specified chunk sizes are wrong, and that the user must consider --ignore-chunk-sizes with the risks that incurs? Yes, that is my point.
- But, observed behaviour: FLAC does not warn, it happily truncates.
And that isn't good, I think.
- That means: FLAC should report back (as "warning" or "error", that is a policy question) that the specified chunk sizes are wrong, and that the user must consider --ignore-chunk-sizes with the risks that incurs? Yes, that is my point.
But the chunk sizes don't have to be wrong for this trailing data to be present. So, FLAC cannot discern that chunk sizes are wrong. It can only warn that there is data trailing the data chunk.
FLAC should be able to realize that this "WAVE file" is too large to be a WAVE file?
Yes, in this instance that is possible, but that is not a general solution. Even so, I won't fix this before the release of 1.4.0, this need thorough testing and I really want to get a release with fuzzing fixes out.
No, there is no general solution for flac --ignore-chunk-sizes to be able to tell audio from non-audio in trailing data. But that is not what I am reporting.
This is about how FLAC without "--ignore-chunk-sizes" still ignores an obviously wrong chunk size, and thereby - without warning - drops samples (and thus, drops losslessness).
I meant that just looking at the filesize wouldn't be of any use for a smaller file with a wrong chunk size.
You still did not migrate to w64? Oh.
@ValZapod Are you sure you didn't post this at the wrong issue?
@H2Swine I've published a fix in #446. There is another problem though: using --keep-foreign-metadata on this file makes FLAC use a lot of memory until it is killed by my OS. It seems it tries to buffer the complete file in search of metadata past the data chunk. The original issue should be fixed with #446. Let me know if you need a compile to test.