netcdf4-python
accessing unset entries in VLEN variable causes crash
I created a VLEN variable of numpy.float64 whose first (and only) dimension is unlimited:
<type 'netCDF4._netCDF4.Variable'>
vlen test(mydim1)
var_type: numpy.float64
vlen data type: float64
unlimited dimensions: mydim1
current shape = (3,)
Currently the dimension mydim1 has length 3, although I have only set values for entries 0 and 1, like this:
var = ncfile.variables['test']
var[0] = np.array([1.0, ...])
var[1] = np.array([0.1, ...])
When I try to load the content of var[3], I get the expected KeyError, since the variable holds only 3 elements, not 4. However, doing this
print var[2]
>>> Crash
will crash my ipython kernel. The same is true for slicing
print var[:]
>>> Crash
I am not sure what the intended behavior is for reading unset values in VLEN variables, or whether this is also a problem in the wrapped netCDF library.
You definitely shouldn't get a crash. Does this simple example work for you?
from netCDF4 import Dataset
import numpy as np
nc = Dataset('test.nc','w')
vlen_type = nc.createVLType(np.float,'vltest')
nc.createDimension('x',None)
v = nc.createVariable('vl',vlen_type,'x')
v[0]=np.arange(2,dtype=np.float)
v[1]=np.arange(3,dtype=np.float)
print v[:]
print v[2]
nc.close()
You should get
[array([ 0., 1.]) array([ 0., 1., 2.])]
Traceback (most recent call last):
File "test_vlen.py", line 10, in <module>
print v[2]
File "_netCDF4.pyx", line 3646, in netCDF4._netCDF4.Variable.__getitem__ (netCDF4/_netCDF4.c:32443)
File "_netCDF4.pyx", line 4398, in netCDF4._netCDF4.Variable._get (netCDF4/_netCDF4.c:40825)
IndexError
Yes, thanks. This works fine. I played around a little to find a simple example that fails.
This does not work. I created two variables that share the same dimension. I first wrote to the fixed-length variable to grow the dimension to length 3, and then wrote to the VLEN variable. I installed netcdf with conda. The current version is
netcdf4 1.2.2 np110py27_0 defaults
netcdf
from netCDF4 import Dataset
import numpy as np
nc = Dataset('test.nc','w')
vlen_type = nc.createVLType(np.float64,'vltest')
nc.createDimension('x', None)
v = nc.createVariable('vl', vlen_type, 'x')
w = nc.createVariable('vl2', np.float64, 'x')
w[0:3] = np.arange(3,dtype=np.float64)
v[0]=np.arange(200000,dtype=np.float64)
v[1]=np.arange(3000000,dtype=np.float64)
print v[2]
print v[:]
nc.close()
Confirmed. Seems like a C library issue though. If you comment out the print statements in your example, it does not crash. However, running ncdump on the resulting file does segfault. Seems like what we need is a C version of this code that triggers the segfault. Once we have that, a netcdf-c issue can be opened.
Perhaps it would be sufficient to attach the netcdf file generated by the example code above to a netcdf-c issue.
cc: @WardF
Thanks, @dopplershift. This had slipped beneath my ra.. my notice. The file that causes the crash would be sufficient for tracking down the issue and I can probably craft a C program if need be.
As you can see in the timeline above, I've opened an issue to track this on the netcdf-c end. I'll report back in here when the issue is fixed, or you can track progress over at https://github.com/Unidata/netcdf-c/issues/221.
Ok, there is/was a logic flaw when using an NC_UNLIMITED dimension with a VLEN data type. More testing is needed to ensure I haven't added a different bug, but I'm somewhat optimistic at this point (the other tests are passing). I need to:
- [ ] Wire new tests into cmake and autotools.
- [ ] Expand the test beyond the basic case.
I won't jinx things by forecasting a timeframe for the fix, but I want to close this ASAP, this week if at all possible.
The fix for the related netcdf-c issue has been merged into netcdf-c:master; see https://github.com/Unidata/netcdf-c/pull/224 for details. The issue was in reading the file, not writing it, so the same test file can be used with the new version. I've closed the related netcdf-c issue, but will reopen if need be; it's plausible I've missed an edge case. Thanks again for letting me know about the issue!
It's been a while, but thank you guys for taking care of this!
@jhprinz Does that mean this issue is fixed for you now?
@dopplershift A quick update, I just rechecked: with conda install netcdf4=1.2.2 and libnetcdf=4.3.3.1 it already fails at the first print statement, print v[2].
After updating to conda install netcdf4=1.2.4 and libnetcdf=4.4.1 it still fails, but only at the second print statement, print v[:].
Currently I do not need this feature, so I consider this closed from my side, but I guess this is still not the desired behaviour. I am happy to help sort this out if I can, though.
I currently still use 1.2.2 because of some chunking related problems, but there is already an issue on that one.
Ok, now I'm running into this, though slightly differently:
from netCDF4 import Dataset
import numpy as np
nc = Dataset('test.nc', 'w')
vlen_type = nc.createVLType(np.float64, 'vltest')
nc.createDimension('x', None)
v = nc.createVariable('vl', vlen_type, 'x')
w = nc.createVariable('vl2', np.float64, 'x')
w[0:3] = np.arange(3, dtype=np.float64)
print(v[0]) # prints '[]' or sometimes crashes
print(v[0].tolist()) # prints '[]' or sometimes crashes
print(v[0].size) # BOOM!
nc.close()
At the terminal:
python(5287,0x7fffa21c83c0) malloc: *** error for object 0xf000000010dc9c88: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Here's what I'm running:
libnetcdf 4.4.1 0 conda-forge
netcdf4 1.2.4 np111py35_2 conda-forge
I'm open to suggestions here. My use case is to be able to append to a vlen variable, so I kind of need to be able to access what's there already. 😀
Seems like there may be some lingering issues in the netcdf-c vlen code. It would be nice to have a C program that triggers the crash.
Just a reminder that via netcdf-c, one cannot modify the length of a vlen without rewriting the top-level variable.
@dopplershift Please see the pull request #605 for a possible fix. (I noticed that vldata allocated by Variable._get(...) may still be uninitialized when it's passed to nc_free_vlens(), which may result in calling free with a random argument.)
Pull request #605 fixes @dopplershift's test script for me. Thanks @ckhroulev!