Bug: quantize_mode="GranularBitRound" replaces nans by "-0".

Open garciampred opened this issue 1 year ago • 5 comments

This is likely an upstream bug, but I don't know how to reproduce it with netcdf-c. BitRound and BitGroom are not affected.

import numpy as np
import netCDF4 as ncdf

data = np.array([5.3, 6.2, 7.3, np.nan])
nc = ncdf.Dataset("test.nc", "w")
nc.createDimension("x", 4)
# GranularBitRound quantization to 4 significant digits, with zlib compression
v = nc.createVariable("tas", "float32", "x", shuffle=True, compression="zlib", significant_digits=4, quantize_mode="GranularBitRound")
v[:] = data  # the NaN in the last element is what ends up as -0
nc.close()

Then ncdump returns

netcdf test {
dimensions:
	x = 4 ;
variables:
	float tas(x) ;
		tas:_QuantizeGranularBitRoundNumberOfSignificantDigits = 4 ;
		tas:_Storage = "chunked" ;
		tas:_ChunkSizes = 4 ;
		tas:_DeflateLevel = 4 ;
		tas:_Shuffle = "true" ;
		tas:_Endianness = "little" ;

// global attributes:
		:_NCProperties = "version=2,netcdf=4.9.2,hdf5=1.14.3" ;
		:_SuperblockVersion = 2 ;
		:_IsNetcdf4 = 1 ;
		:_Format = "netCDF-4" ;
data:

 tas = 5.299805, 6.200195, 7.299805, -0 ;
}

I am using the following versions:

hdf5        1.14.3   nompi_h4f84152_100        conda-forge
libnetcdf   4.9.2    nompi_h9612171_113        conda-forge
netcdf4     1.6.5    nompi_py312h26027e0_100   conda-forge

garciampred avatar Mar 20 '24 09:03 garciampred

Not sure you should expect the quantization to leave NaNs alone, unless the _FillValue or missing_value is set to NaN.

jswhit avatar Mar 21 '24 01:03 jswhit

It turns out that even if the _FillValue is set to NaN, the same problem occurs.

However, you can mask the invalid values before writing:

data = np.ma.masked_invalid(np.array([5.3, 6.2, 7.3, np.nan]))

and you end up with the NaN stored as the fill value (shown as _ by ncdump):

netcdf test {
dimensions:
	x = 4 ;
variables:
	float tas(x) ;
		tas:_QuantizeGranularBitRoundNumberOfSignificantDigits = 4 ;
data:

 tas = 5.299805, 6.200195, 7.299805, _ ;
}
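
For reference, the masking workaround above can be wrapped into a small script. This is a minimal sketch rather than a prescribed method: the helper names `mask_nans` and `write_quantized` are made up for illustration, and `write_quantized` assumes a netCDF4-python build with quantization support, such as the versions listed in this issue.

```python
import numpy as np


def mask_nans(values):
    """Return a float32 masked array with NaN/inf entries masked out."""
    return np.ma.masked_invalid(np.asarray(values, dtype=np.float32))


def write_quantized(path, values, digits=4):
    """Write `values` to `path` with GranularBitRound, masking NaNs first.

    Hypothetical helper for illustration; netCDF4 is imported here so that
    mask_nans stays usable without it.
    """
    import netCDF4

    data = mask_nans(values)
    with netCDF4.Dataset(path, "w") as nc:
        nc.createDimension("x", data.size)
        var = nc.createVariable(
            "tas", "f4", ("x",),
            compression="zlib", shuffle=True,
            significant_digits=digits, quantize_mode="GranularBitRound",
        )
        # Masked entries are written as the variable's fill value
        # (ncdump shows them as "_") instead of being quantized.
        var[:] = data


masked = mask_nans([5.3, 6.2, 7.3, np.nan])
print(masked.mask)  # which entries will be stored as fill values
```

Calling `write_quantized("test.nc", [5.3, 6.2, 7.3, np.nan])` and reading the variable back with the default auto-masking should give a masked array whose last element is masked rather than -0.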

jswhit2 avatar Mar 22 '24 16:03 jswhit2

did you report this upstream?

jswhit avatar Apr 10 '24 22:04 jswhit

I first heard about this bug report in February 2025. A patch to fix it was merged into netCDF a week later (https://github.com/Unidata/netcdf-c/pull/3093) and is expected to be in the netCDF 4.10.0 release. With the patch, all quantization methods, including Granular BitRound, preserve NaNs and -0.0s in the input:

zender@spectral:~/data/rvw$ ncks --lbr
Linked to netCDF library version 4.10.0-development compiled Feb 15 2025 09:12:46
zender@spectral:~/data/rvw$ ncks -C -v ppc_zro_ngt_nan_flt ~/nco/data/in.nc
netcdf in {
  dimensions:
    time = UNLIMITED ; // (10 currently)

  variables:
    float ppc_zro_ngt_nan_flt(time) ;
      ppc_zro_ngt_nan_flt:long_name = "array of single precision floating point NaNs and negative zeros" ;

  data:
    ppc_zro_ngt_nan_flt = -0, 0, NaNf, -0, 0, NaNf, -0, -0, NaNf, 3.141593 ;

} // group /
zender@spectral:~/data/rvw$ ncks -O -7 -L 1 -C -v ppc_zro_ngt_nan_flt --qnt_alg=gbr --qnt default=3 --cmp='shf|zst' /Users/zender/nco/data/in.nc /Users/zender/foo.nc
zender@spectral:~/data/rvw$ ncks -C -v ppc_zro_ngt_nan_flt ~/foo.nc
netcdf foo {
  dimensions:
    time = UNLIMITED ; // (10 currently)

  variables:
    float ppc_zro_ngt_nan_flt(time) ;
      ppc_zro_ngt_nan_flt:long_name = "array of single precision floating point NaNs and negative zeros" ;
      ppc_zro_ngt_nan_flt:quantization = "quantization_info" ;
      ppc_zro_ngt_nan_flt:quantization_nsd = 3 ;

  data:
    ppc_zro_ngt_nan_flt = -0, 0, NaNf, -0, 0, NaNf, -0, -0, NaNf, 3.140625 ;

} // group /
zender@spectral:~/data/rvw$ 

czender avatar Mar 22 '25 23:03 czender

@jswhit2 and @garciampred, please let me know if you experience any quantization issues in the patched upstream netCDF library, now known as libnetcdf version 4.10.0-development.
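
One way to check a given installation from Python is a write/read round trip. The sketch below is an illustration under assumptions, not an official test: `special_values_preserved` and `roundtrip` are hypothetical helper names, and `roundtrip` assumes a netCDF4-python build linked against the libnetcdf you want to test.

```python
import numpy as np


def special_values_preserved(original, roundtripped):
    """True if NaNs and signed zeros sit at the same positions in both arrays."""
    a = np.asarray(original, dtype=np.float32)
    b = np.asarray(roundtripped, dtype=np.float32)
    if a.shape != b.shape:
        return False
    nans_ok = np.array_equal(np.isnan(a), np.isnan(b))
    zeros = a == 0  # True for both 0.0 and -0.0, False for NaN
    zeros_ok = bool(np.all(b[zeros] == 0)) and np.array_equal(
        np.signbit(a[zeros]), np.signbit(b[zeros])
    )
    return bool(nans_ok and zeros_ok)


def roundtrip(path, values, mode="GranularBitRound", digits=3):
    """Write `values` with the given quantize_mode and read the raw data back."""
    import netCDF4  # imported lazily so the checker above needs only NumPy

    data = np.asarray(values, dtype=np.float32)
    with netCDF4.Dataset(path, "w") as nc:
        nc.createDimension("x", data.size)
        var = nc.createVariable(
            "v", "f4", ("x",),
            compression="zlib", shuffle=True,
            significant_digits=digits, quantize_mode=mode,
        )
        var[:] = data
    with netCDF4.Dataset(path) as nc:
        var = nc.variables["v"]
        var.set_auto_mask(False)  # return a plain ndarray, no fill-value masking
        return var[:]


# The behaviour reported in this issue (NaN turned into -0) fails the check.
print(special_values_preserved([5.3, np.nan], [5.299805, -0.0]))  # False
```

With netCDF4 installed, evaluating `special_values_preserved(x, roundtrip("check.nc", x, mode))` for each of "BitGroom", "BitRound" and "GranularBitRound" would show whether the linked library keeps the NaNs and negative zeros in `x`.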

czender avatar Mar 24 '25 15:03 czender