The compressed data size is bigger than the original PCM data?

Open yixinuestc opened this issue 5 months ago • 6 comments

Hello everyone. Our product has used the FLAC codec for several years, and it has worked well. We have now noticed that the compressed data is sometimes larger than the original PCM data, which confuses us. Is the compressed data guaranteed to be smaller than the original PCM data? If not, what is the maximum size of the compressed data?

Here are the details:
1. Our FLAC version is 1.3.3.
2. The frame size is 1024 samples, that is, we encode 1024 samples at a time.
3. Other parameters: 2 channels, int16 format, compression level 5, sample rate 48000 Hz.
4. We initialize the encoder with the FLAC__stream_encoder_init_stream API and set the write_callback function. We encode with the FLAC__stream_encoder_process_interleaved API and obtain the compressed data and its size in the write callback.
5. Each block of raw PCM data is 4096 bytes (2 channels, int16 format), but the compressed data is sometimes slightly larger than 4096 bytes.
6. The OS is Kylin OS.
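
The sizes quoted above can be checked with a little arithmetic. The following Python sketch is only an illustration of the measurement, not the product's code: `raw_pcm_bytes` and `oversized_frames` are hypothetical helper names, and `records` stands in for the (bytes, samples) pairs a write callback would log. It assumes each callback call delivers one whole frame, and it relies on the libFLAC convention that a write callback call with `samples == 0` carries metadata rather than audio.

```python
def raw_pcm_bytes(samples_per_channel, channels, bits_per_sample):
    """Size in bytes of the uncompressed interleaved PCM for one block."""
    return samples_per_channel * channels * (bits_per_sample // 8)


def oversized_frames(records, channels=2, bits_per_sample=16):
    """Return (index, excess_bytes) for every audio frame larger than its raw PCM.

    records: iterable of (compressed_bytes, samples) pairs, as logged in a
    write callback (hypothetical logging format). Entries with samples == 0
    are treated as metadata and skipped.
    """
    flagged = []
    for index, (nbytes, samples) in enumerate(records):
        if samples == 0:
            continue  # stream marker / metadata block, not an audio frame
        raw = raw_pcm_bytes(samples, channels, bits_per_sample)
        if nbytes > raw:
            flagged.append((index, nbytes - raw))
    return flagged


if __name__ == "__main__":
    print(raw_pcm_bytes(1024, 2, 16))  # 1024 samples x 2 channels x 2 bytes
    print(oversized_frames([(42, 0), (3100, 1024), (4106, 1024)]))
```

Logging the excess per frame, rather than only noticing that a frame is "bigger", makes it easier to tell a few bytes of framing overhead apart from a genuinely misbehaving encoder.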

Thanks in advance.

yixinuestc avatar Jul 19 '25 03:07 yixinuestc

Does anybody know?

yixinuestc avatar Jul 22 '25 06:07 yixinuestc

You can try a newer flac version.

If the issue you observe happens under Windows, then the following info can be relevant:

The commit, which actually fixed the reported issue concerning compression efficiency depending on the cpu type, is: https://github.com/xiph/flac/commit/aaaaa0deb92654fd684042d402d75338def17364 See also: https://github.com/xiph/flac/commit/94a61241b02064c7d9fe508f72a742f2a90b8492 and https://www.mail-archive.com/[email protected]/msg04176.html

... incorrectly compiles stream_encoder_intrin_*.c files for x86-64 platform. As a result, flac works, but compression ratio is close to 1. This patch disables some compiler optimizations, and compression ratio reverts back to normal values.

c72578 avatar Jul 22 '25 09:07 c72578

Currently, this issue only happens under Kylin OS; it does not happen on other operating systems. Does FLAC's behavior depend on the OS? I would have thought it does not.

yixinuestc avatar Jul 23 '25 08:07 yixinuestc

Please provide an output file if possible. With just the information you provided, I cannot provide any insights.

ktmf01 avatar Jul 23 '25 10:07 ktmf01

Like most lossless compression schemes, flac has a 'verbatim' coding mode to bound the size of the compressed output to the size of the input. However, you say some frames are "a little bigger" than 4096 bytes. Is that consistent with 4096 bytes for the data plus the framing overhead?

The addition of framing and other headers is a way a valid flac file can be slightly larger than raw input data. If you can provide example output, others could examine it and confirm if this is what is happening.
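
To make "data plus framing overhead" concrete, here is a rough Python sketch of how large a frame gets when every channel is stored verbatim. It follows the frame layout in the FLAC format specification as I read it: 4 bytes of fixed header fields, a variable-length coded frame number, a CRC-8, one header byte per subframe, and a CRC-16 footer. The function name and parameters are made up for illustration, and it assumes independent channel coding, no wasted-bits field, and a block size and sample rate that need no extra header bytes (which should hold for 1024 samples at 48000 Hz).

```python
def verbatim_frame_bytes(block_size, channels, bits_per_sample,
                         coded_number_bytes=1, extra_header_bytes=0):
    """Approximate size in bytes of one FLAC frame with all-VERBATIM subframes.

    coded_number_bytes: length of the coded frame number (1 to 6 bytes in a
        fixed-block-size stream; it grows as the frame number grows).
    extra_header_bytes: optional block-size / sample-rate bytes (assumed 0).
    """
    # 4 bytes of fixed fields + coded number + optional fields + CRC-8
    header = 4 + coded_number_bytes + extra_header_bytes + 1
    # each subframe: 1 header byte (8 bits) + unencoded samples
    subframe_bits = channels * (8 + block_size * bits_per_sample)
    subframe_bytes = (subframe_bits + 7) // 8  # zero-padded to a byte boundary
    footer = 2  # CRC-16
    return header + subframe_bytes + footer


if __name__ == "__main__":
    smallest = verbatim_frame_bytes(1024, 2, 16, coded_number_bytes=1)
    largest = verbatim_frame_bytes(1024, 2, 16, coded_number_bytes=6)
    print(smallest, largest)
```

Under these assumptions a 1024-sample, 2-channel, 16-bit verbatim frame comes to between 4106 and 4111 bytes, that is, 10 to 15 bytes more than the 4096 bytes of raw PCM. Frames that exceed 4096 bytes by about that much would be consistent with verbatim coding plus framing.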

None of this explains why the result would be specific to kylin os. There must be something different about that encoder which is producing unnecessarily large output, perhaps by choosing the verbatim coding mode when it is not optimum, or adding extra padding blocks.

rillian avatar Jul 23 '25 12:07 rillian

@yixinuestc : You need to provide more detail. You say each block is 4096 bytes of PCM, so you could attach a couple of them here (I think you must put them in a .zip file to attach them).

It is not the case that FLAC maps a given input X to one fixed bitstream F(X): there are enormously many different valid FLAC files that decode to X, and an encoder has to quickly make up its mind which method to choose. That is done with a quick evaluation algorithm whose result may differ from build to build.

And FLAC has the option to "give up on compression and store VERBATIM". There have been cases where some builds, including builds of reference FLAC, miscalculated what the "reasonably best F(X)" is and then chose VERBATIM (or "even worse") instead of something that actually compresses. There is nothing "lossy" about this: an "I don't think I can improve this, so I pass on the samples as they are" doesn't change the signal, it just doesn't ... compress. And indeed it does a bit worse, because it still has to write headers and footers, including checksums and other fields that we want to have even if they take a little bit of space.
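
One way to check whether a suspicious frame really was stored VERBATIM is to look at its first subframe header. The Python sketch below is a hedged illustration based on my reading of the FLAC format specification, not a validated parser: `first_subframe_type` is a hypothetical helper, it does not verify the CRCs, and it only inspects the first channel. It expects a buffer that starts at a frame boundary, so the stream marker and metadata blocks delivered at the start of the stream will be rejected.

```python
def first_subframe_type(frame: bytes) -> str:
    """Classify the first subframe of a buffer that starts with a FLAC frame."""
    # 15-bit sync code followed by the blocking-strategy bit: 0xFFF8 or 0xFFF9
    if len(frame) < 5 or frame[0] != 0xFF or (frame[1] & 0xFE) != 0xF8:
        raise ValueError("buffer does not start with a FLAC frame sync code")
    block_size_code = frame[2] >> 4
    sample_rate_code = frame[2] & 0x0F

    # The coded frame/sample number starts at byte 4 and uses a UTF-8-like
    # scheme: the count of leading 1 bits gives its length (0 means 1 byte).
    lead = frame[4]
    ones = 0
    while ones < 8 and lead & (0x80 >> ones):
        ones += 1
    if ones == 1 or ones == 8:
        raise ValueError("invalid coded number")
    pos = 4 + (1 if ones == 0 else ones)

    # Optional explicit block size and sample rate fields
    if block_size_code == 6:
        pos += 1
    elif block_size_code == 7:
        pos += 2
    if sample_rate_code == 12:
        pos += 1
    elif sample_rate_code in (13, 14):
        pos += 2

    pos += 1  # skip the header CRC-8 (not verified here)
    if pos >= len(frame):
        raise ValueError("buffer too short to contain a subframe header")

    # Subframe header byte: 1 reserved bit, 6 type bits, 1 wasted-bits flag
    kind = (frame[pos] >> 1) & 0x3F
    if kind == 0:
        return "CONSTANT"
    if kind == 1:
        return "VERBATIM"
    if 8 <= kind <= 12:
        return "FIXED"
    if kind >= 32:
        return "LPC"
    return "RESERVED"


if __name__ == "__main__":
    # Hand-built header: 1024-sample block, 48 kHz, 2 channels, 16 bits,
    # frame number 0, dummy CRC-8, then a VERBATIM subframe header byte.
    sample = bytes([0xFF, 0xF8, 0xAA, 0x18, 0x00, 0x00, 0x02])
    print(first_subframe_type(sample))
```

If the oversized frames report VERBATIM while frames from the same audio encoded on another system report FIXED or LPC, that would point toward the encoder's method selection rather than the data itself.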

But if you have a BIG number of such 4096-byte blocks, every now and then some will be best encoded VERBATIM, and those will be slightly bigger than the raw PCM.

H2Swine avatar Jul 24 '25 09:07 H2Swine