zfp Relative error bound results in out-of-bounds access after zfp_stream_maximum

@treigerm and I are trying to use the relative error bound as suggested in the FAQ:

minbits = 0
maxbits = 0
maxprec = p
minexp = ZFP_MIN_EXP - 1 = -1075

However, in https://github.com/LLNL/zfp/blob/4baa4c7eeae8e0b6a7ace4dde242ac165bcd59d9/src/zfp.c#L743-L785

this configuration results in an invalid capacity, which produces an out-of-bounds access soon after:

maxsize = 0;
maxbits = 9; // float
maxbits += values - 1 + values * p;
maxbits = MIN(maxbits, zfp->maxbits=0);
// maxbits is now 0
maxbits = MAX(maxbits, zfp->minbits);
  
maxsize = ZFP_HEADER_MAX_BITS + blocks * maxbits;
// maxsize is just ZFP_HEADER_MAX_BITS

@lindstro How should we compute the capacity when using the relative error expert mode?

Jul 28 '25 10:07 juntyr

This is sloppy documentation rather than bug in the code. While the "correct" way of invoking this compression mode via the CLI is to set minbits = maxbits = 0 using the -c switch, this is not the correct way when using the high-level C API. Instead, you should set

minbits = ZFP_MIN_BITS
maxbits = ZFP_MAX_BITS
maxprec = p
minexp = ZFP_MIN_EXP - 1

I will fix the documentation. Please let me know if you run into any other issues with this admittedly not-well-tested zfp feature.

I would not expect this mode to give very good compression compared to the other compression modes, though currently it's the only way of ensuring that relative errors are bounded. As the FAQ details, there are other ways of controlling the relative error without hard error bounds.

Jul 28 '25 14:07 lindstro

Thank you very much for your help and your reply!

We’re trying to get this method working at the moment, since we’re looking for strict relative error bounds, but if that doesn’t work out we can also try out the asinh + zfp approach.

Jul 30 '25 17:07 juntyr

Hello, thanks so much for your help!

I'm still having trouble getting the relative error bound functionality to work but I might be using the API wrong. I'm running ZFP through Juniper's https://numcodecs-wasm.readthedocs.io/en/latest/ with the configuration:

minbits = ZFP_MIN_BITS = 1
maxbits = ZFP_MAX_BITS = 16658
maxprec = p = 13
minexp = ZFP_MIN_EXP - 1 = -1075

The observed behaviour I find is that around 50% of the pixels still exceed the error bound for some sample of float32 data.

I just wanted to double check that I am indeed using the API correctly now. I guess the remaining thing to clarify is the exact meaning of the precision p. I took it p to be the remaining bits in the floating point representation that is being truncated, i.e. given a 32-bit float and p = 13 , the trailing 32 - 13 = 19 bits will be set to 0. The remaining bits in the mantissa in this case are 13 - 9 = 4 which should give a relative error bound of 2^-4, is that right?

Aug 01 '25 08:08 treigerm

You are interpreting the precision correctly. That is, the relative error is at most $2^{9-p}$ for floats and $2^{12-p}$ for doubles.

To verify this, I ran an experiment on a couple of data sets: the Miranda density and velocityz fields from SDRBench. While the error bound holds for the density field, it does not for the velocity field. I suspect that this happens because zfp's reversible mode is used only for those blocks of data that the lossy algorithm cannot represent exactly in full precision, and the analysis does not hold for blocks compressed by the lossy algorithm unless full precision is used. That is just a hunch, however.

I need to dig deeper into this to better understand what is going on. If my suspicion is right, then the algorithm needs to be modified to always use the reversible path whenever less than full precision is requested.

Aug 01 '25 14:08 lindstro

After some investigation, it turns out that there are two issues:

As I suspected, the error bound does not hold when the "lossy path" of the compression algorithm is taken when less than full precision is requested. The reason has to do with this path making use of a block-floating-point representation that does not keep values normalized. Thus, quantization errors resulting from reduced precision creep into more significant bits for "subnormal" values in the block (those whose exponent is smaller than that of the largest-in-magnitude value). For the same reason, the relative error bound cannot be guaranteed for subnormal floating-point numbers (even when the "lossy path" is not taken). Such subnormals numbers (other than zero) are in practice extremely rare, however.
A more serious issue is faulty reasoning underlying the error bound. As with the absolute error bound, one has to consider that quantization errors can magnify by up to 8x per dimension when applying the reversible inverse decorrelating transform, because the maximum absolute row sum of the transformation matrix is 1 + 3 + 3 + 1 = 8: https://github.com/LLNL/zfp/blob/4baa4c7eeae8e0b6a7ace4dde242ac165bcd59d9/src/template/revdecode.c#L18-L21 Thus, the correct relative error bound (when the "lossless path" is taken) is $2^{3(d+3)-p}$ for single- and $2^{3(d+4)-p}$ for double-precision data in $d$ dimensions. We need to decide if forcing the lossless path when less than full precision is requested is a desirable change in behavior, though it is doubtful that anyone would want to preserve the current behavior that provides no meaningful guarantees when combining reversible mode with reduced precision.

For now, to verify that the error bound holds, you may replace this line https://github.com/LLNL/zfp/blob/4baa4c7eeae8e0b6a7ace4dde242ac165bcd59d9/src/template/revencodef.c#L13 with return zfp_false; and account for the data dimensionality, $d$, when computing the error bound. Note that this simple code change also impacts the compression ratio in fully lossless mode.

Once we've decided how to move forward, we will update the documentation and code to address this issue.

Aug 02 '25 19:08 lindstro

Relative error bound results in out-of-bounds access after zfp_stream_maximum_size