netcdf4-python

2D string variable update causes HDF Error when rereading file

Open • robin-cls opened this issue 1 year ago • 6 comments

Hello,

I encountered a problem when trying to save a string variable. The goal is to update part of a dataset, so I use a slice to select the relevant part of a 2D string table and then assign the new values. While this works well for integer and floating-point variables, the partial update of a string variable fails: rereading the file raises an HDF error (see the image after the steps to reproduce).

The problem might be linked to the combination of an extensible (unlimited) dimension and the 2D case, because:

  • Replacing dim_0=None with dim_0=5 --> OK
  • Partial update of a 1D string variable --> OK

Here are the steps to reproduce:

import netCDF4
import numpy as np

with netCDF4.Dataset('broken.nc', mode='w') as handler:
    handler.createDimension("dim_0", None)  # unlimited (extensible) dimension
    handler.createDimension("dim_1", 5)
    handler.createVariable('var_str', str, ('dim_0', 'dim_1'), fill_value='no_data')

    # Partial update: write only a 3x3 block of the 2D string variable
    handler["var_str"][2:5, 1:4] = np.full((3, 3), fill_value='foo', dtype=object)

# The error appears when a netCDF close is triggered. Something might be getting corrupted somewhere.
with netCDF4.Dataset('broken.nc', mode='r') as handler:
    print(handler["var_str"][...])

[image: screenshot of the HDF error raised when rereading the file]

I work in a Conda environment installed on RHEL8 with: python=3.11, h5netcdf=1.2.0, libnetcdf=4.9.2, netcdf4=1.7.1.

robin-cls • Sep 25 '24 16:09

Since it works if you use fixed dimensions, it's likely a bug in the netcdf-c lib.

jswhit • Sep 26 '24 01:09

I recently got similar errors, too. They appeared when I attempted to upgrade the netcdf4 version to 1.7.1. Before, my code was running on version 1.5.8 without any errors.

pjpetersik • Sep 26 '24 07:09

The netcdf4-python 1.5.8 wheels used an earlier version of the C lib (nothing in the Python interface for vlen str variables has changed).
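Since the suspected difference is the bundled C library rather than the Python interface, it may help to record which library versions an environment actually loads. A minimal sketch, assuming the version attributes exposed by the `netCDF4` module; the helper name `library_versions` is hypothetical:

```python
import importlib


def library_versions():
    """Return netCDF4-python, netcdf-c and HDF5 versions, or None if netCDF4 is missing.

    (Hypothetical helper, named here for illustration only.)
    """
    try:
        nc = importlib.import_module("netCDF4")
    except ImportError:
        return None
    return {
        "netCDF4-python": nc.__version__,
        "netcdf-c": nc.__netcdf4libversion__,
        "hdf5": nc.__hdf5libversion__,
    }


print(library_versions())
```

Running this in both the working and the failing environment and including the output in a bug report makes it easier to tell whether the C library version changed between them.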

jswhit • Sep 26 '24 14:09

Should I reopen this issue in the netcdf-c repository instead?

robin-cls • Sep 26 '24 16:09

I think that would be a good idea, especially if you could translate your example into C and include that in the GitHub issue.

jswhit • Sep 26 '24 19:09

You don't have to close this issue; just link it to the one in the netcdf-c repo.

jswhit • Sep 26 '24 19:09