PWE tolerance is violated when the tolerance is very small
Describe the bug
The bug occurs for data whose range is very small, close to the single-precision underflow threshold, and when the tolerance is smaller than the smallest representable single-precision positive number (FLT_TRUE_MIN).
To Reproduce
Steps to reproduce the behavior:
- Compile the following code to generate test data:
#include <cmath>
#include <cstdio>

int main()
{
  double x = 1. / 3;
  for (int i = 0; i < 16 * 16 * 16; i++) {
    // Scale into the tiny range just below the single-precision underflow threshold.
    double y = std::ldexp(x, -150);
    std::fwrite(&y, sizeof(y), 1, stdout);
    // Next value of the quasi-random sequence (the logistic map; see the discussion below).
    x = 4 * x * (1 - x);
  }
  return 0;
}
- Run this executable and redirect output to data.in. (Downloadable data.tar.gz)
- Execute the following:
  compressor_3d -d --dims 16 16 16 --pwe 1e-50 -o data.sperr data.in
  decompressor_3d -d -o data.out data.sperr
Expected and actual behavior
The maximum error between data.in and data.out is 2.68e-50, which is larger than the tolerance 1e-50. Perhaps some quantity that should be represented in double precision is actually represented in single precision?
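As a quick way to see why that suspicion is plausible (this check is not part of SPERR; it merely probes the requested tolerance): 1e-50 lies far below FLT_TRUE_MIN ≈ 1.4e-45, so it has no positive single-precision representation and converts to exactly 0.0f.

#include <cfloat>
#include <cstdio>

int main()
{
  const double tol = 1e-50;                     // requested PWE tolerance
  const float tol_f = static_cast<float>(tol);  // what a float-typed field would hold
  std::printf("tol = %g, FLT_TRUE_MIN = %g, (float)tol = %g\n",
              tol, static_cast<double>(FLT_TRUE_MIN), static_cast<double>(tol_f));
  // Prints (float)tol = 0: the tolerance underflows to zero in single precision.
  return 0;
}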
Environment (please complete the following information):
- OS: macOS 11.7.2
- Compiler: AppleClang 13.0.0.13000029
- SPERR Version: 0.5.2
NOTE: The version used is from a few weeks ago. The "latest" 0.5.2 (3918121-dirty) appears broken as it no longer recognizes -o yet won't run without -o.
Never mind--it appears that the -o option for the compressor is no longer recognized, but it is required for the decompressor.
No worries Peter. FYI, I've changed the compressor so that it can produce more than one type of output (the compressed bitstream and the double- or single-precision decompressed volume). As a result, one can get the decompressed volume in one step rather than two.
I've noticed this behavior before; I did some initial investigation and concluded that it was a floating-point rounding-error issue. Let me do more investigation now.
Some additional data points. When specifying the tolerance as an exact power of two using hexadecimal literals, e.g., 0x1p-149, then we obtain the following tolerance and observed error:
pwe = 2^-149 ≈ 1.40e-45 > maxerr ≈ 3.52e-46 [OK]
pwe = 2^-150 ≈ 7.01e-46 > maxerr ≈ 5.43e-46 [OK]
pwe = 2^-151 ≈ 3.50e-46 < maxerr ≈ 7.86e-46 [FAIL]
Here FLT_TRUE_MIN = 2^-149. Thus, when 2·pwe is representable as a float, all is well, but when pwe is smaller the tolerance is not respected. I suspect this has something to do with the quantization step being 2·pwe.
I think the issue is here: https://github.com/NCAR/SPERR/blob/3918121d0b192042a425cd3b21ff2f1d151b0952/src/SPERR.cpp#L162-L163
When pwe = 2^-152, max_t ≈ 6.94e-46 < FLT_TRUE_MIN ≈ 1.40e-45. Hence, m_max_threshold_f = m_threshold = 0, which wreaks havoc with the subsequent loop. However, setting m_threshold = max_t is not sufficient, as m_max_threshold_f is stored in the compressed stream and fed to the decompressor.
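To make that concrete, here is a minimal sketch (with hypothetical variable names that echo, but are not copied from, the SPERR source) of the narrowing underflow: a double-precision threshold below FLT_TRUE_MIN becomes exactly zero when stored as a float.

#include <cassert>
#include <cfloat>
#include <cstdio>

int main()
{
  // Roughly the max_t value quoted above for pwe = 2^-152; it lies below FLT_TRUE_MIN.
  const double max_t = 6.94e-46;
  const float max_threshold_f = static_cast<float>(max_t);  // the narrowing in question
  std::printf("max_t = %g, FLT_TRUE_MIN = %g, narrowed = %g\n",
              max_t, static_cast<double>(FLT_TRUE_MIN),
              static_cast<double>(max_threshold_f));
  assert(max_threshold_f == 0.0f);  // the stored threshold underflows to exactly zero
  return 0;
}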
What's the reason for this explicit cast to float, and why should the threshold not be represented in double precision?
It might sound silly now, but I was thinking 1) the maximum threshold is many times bigger than PWE, so a single-precision representation is sufficient, and 2) the maximum threshold is saved in the header, so I can save 4 bytes by using a float.
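For context, a minimal sketch of the trade-off described here (the values are illustrative, not taken from SPERR): a float easily holds a threshold that is many times larger than FLT_TRUE_MIN, and a float header field is 4 bytes smaller than a double one.

#include <cfloat>
#include <cstdio>

int main()
{
  // 1) For thresholds well above FLT_TRUE_MIN, narrowing to float introduces only
  //    a tiny relative rounding error, so single precision looks sufficient.
  const double big_threshold = 1e-3;  // hypothetical "normal-sized" threshold
  std::printf("round-trip through float: %g\n",
              static_cast<double>(static_cast<float>(big_threshold)));

  // 2) The header saving is sizeof(double) - sizeof(float) = 4 bytes.
  std::printf("double: %zu bytes, float: %zu bytes\n", sizeof(double), sizeof(float));
  return 0;
}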
I'll run some tests to see if changing to double precision resolves the issue. (I was studying what happens when operands are close to the machine epsilon...)
Hi Peter, I've tested storing a double in the outlier corrector, and it really solves the problem! (I pushed the fix to the main branch.) It is really amazing to learn the details about floating point numbers, and you're the only one I know who pays attention to things happening in these tiny ranges! Thank you!
Btw, in the code you posted, is that a special sequence you used to generate values between 0 and 1? I'm talking about the variable x there.
Maybe this is another topic, but I also noticed that the percentage of outliers grows way beyond our expected range, to around 87% or 88%, as I keep reducing the PWE tolerance. I'll probably need to borrow your brain to look at it at some point ;)
Sam, thanks for fixing this. For some of the work we're doing at LLNL, we need "small" tolerances below FLT_TRUE_MIN that actually are large in relation to the data being compressed (perhaps on the order of 1-10% of the range), and in those situations everything breaks if the tolerance is violated.
Regarding the test data, this is known as the logistic map, which produces quasi-random sequences in a reasonably portable manner (for a given rounding mode). I find it useful for generating data that is difficult to compress.
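For reference, the recurrence driving the variable x in the generator above is

x_{n+1} = 4 * x_n * (1 - x_n), with x_0 = 1/3,

i.e., the logistic map at its fully chaotic parameter value r = 4; the iterates stay in (0, 1) and behave quasi-randomly, which is what makes the data hard to compress.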
For data that is originally double precision, what compression ratio do you get when using tolerances below FLT_TRUE_MIN? I imagine that the cost-benefit calculation is going to shift at that point.
We were seeing lots of outliers before the fix, which really hurt compression. Once we have some new results, I'll get back to you. Either way, it shouldn't matter if the tolerance is 1 or 1e-100; what matters is the tolerance relative to the data range.
The reported issue is resolved, so I'm closing it.