
Double Free or Corruption with large datasets and slicing

Open hellkite500 opened this issue 7 years ago • 4 comments

I have a large dataset with a variable I'm trying to read.

import netCDF4

# state holds the path to the file, flow the name of the variable
dataset = netCDF4.Dataset(state)
var = dataset[flow]
print(var)

Shows the variable correctly:

<type 'netCDF4._netCDF4.Variable'>
float64 channelSurfacewaterChannelNeighborsFlowRate(instances, channelElements, channelChannelNeighborsSize)
    units: meters^3/second
    comment: Instantaneous flow rate of surfacewater between a channel element and its channel neighbor.  Positive means flow out of the element into the neighbor.  Negative means flow into the element out of the neighbor.
unlimited dimensions: instances, channelElements
current shape = (105121, 17587, 8)
filling on, default _FillValue of 9.96920996839e+36 used

But then I try to read in a slice of the data, and glibc complains about a double free or corruption error.

flows = var[0:100, 100, [0,1]]

gives the following error:

*** glibc detected *** python: double free or corruption (!prev): 0x0000000003c9d400 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e65275f3e]
/lib64/libc.so.6[0x3e65278dd0]
/project/CI-WATER/tools/CI-WATER-tools/lib/python2.7/site-packages/numpy/core/multiarray.so(+0x1dd5f)[0x2ad3694ffd5f]
/project/CI-WATER/tools/CI-WATER-tools/lib/python2.7/site-packages/numpy/core/multiarray.so(+0x20c6e)[0x2ad369502c6e]
/project/CI-WATER/tools/CI-WATER-tools/lib/python2.7/site-packages/netCDF4/_netCDF4.so(+0xc0a8c)[0x2ad3648a6a8c]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2d8b)[0x2ad35dd1438b]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x88e)[0x2ad35dd18a9e]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5652)[0x2ad35dd16c52]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x88e)[0x2ad35dd18a9e]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x2ad35dd18bb2]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyRun_FileExFlags+0xb0)[0x2ad35dd38850]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0xef)[0x2ad35dd38a2f]
/project/CI-WATER/tools/CI-WATER-tools/lib/libpython2.7.so.1.0(Py_Main+0xc74)[0x2ad35dd4e194]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3e6521ed1d]
python[0x400649]
======= Memory map: ========
00400000-00401000 r-xp 00000000 00:1a 282244272                          /pfs/bh/proj/CI-WATER/tools/CI-WATER-tools/bin/python2.7
00600000-00601000 rw-p 00000000 00:1a 282244272                          /pfs/bh/proj/CI-WATER/tools/CI-WATER-tools/bin/python2.7
01ef4000-03cc3000 rw-p 00000000 00:00 0                                  [heap]
32bec00000-32bec34000 r-xp 00000000 08:03 36179744                       /usr/lib64/libfontconfig.so.1.4.4
32bec34000-32bee34000 ---p 00034000 08:03 36179744                       /usr/lib64/libfontconfig.so.1.4.4
32bee34000-32bee36000 rw-p 00034000 08:03 36179744                       /usr/lib64/libfontconfig.so.1.4.4
389d400000-389d5ba000 r-xp 00000000 08:03 36181325                       /usr/lib64/libcrypto.so.1.0.1e
389d5ba000-389d7b9000 ---p 001ba000 08:03 36181325                       /usr/lib64/libcrypto.so.1.0.1e
389d7b9000-389d7d4000 r--p 001b9000 08:03 36181325                       /usr/lib64/libcrypto.so.1.0.1e
389d7d4000-389d7e0000 rw-p 001d4000 08:03 36181325                       /usr/lib64/libcrypto.so.1.0.1e
389d7e0000-389d7e4000 rw-p 00000000 00:00 0 
39a2a00000-39a2a04000 r-xp 00000000 08:03 10616895                       /lib64/libuuid.so.1.3.0
39a2a04000-39a2c03000 ---p 00004000 08:03 10616895                       /lib64/libuuid.so.1.3.0
39a2c03000-39a2c04000 rw-p 00003000 08:03 10616895                       /lib64/libuuid.so.1.3.0
39a2e00000-39a2e07000 r-xp 00000000 08:03 36175907                       /usr/lib64/libSM.so.6.0.1
39a2e07000-39a3007000 ---p 00007000 08:03 36175907                       /usr/lib64/libSM.so.6.0.1
39a3007000-39a3008000 rw-p 00007000 08:03 36175907                       /usr/lib64/libSM.so.6.0.1
3b3c600000-3b3c737000 r-xp 00000000 08:03 36179896                       /usr/lib64/libX11.so.6.3.0
3b3c737000-3b3c937000 ---p 00137000 08:03 36179896                       /usr/lib64/libX11.so.6.3.0
3b3c937000-3b3c93d000 rw-p 00137000 08:03 36179896                       /usr/lib64/libX11.so.6.3.0
3b3ca00000-3b3ca24000 r-xp 00000000 08:03 36177120                       /usr/lib64/libxcb.so.1.1.0
3b3ca24000-3b3cc24000 ---p 00024000 08:03 36177120                       /usr/lib64/libxcb.so.1.1.0
3b3cc24000-3b3cc25000 rw-p 00024000 08:03 36177120                       /usr/lib64/libxcb.so.1.1.0
3b3ce00000-3b3ce11000 r-xp 00000000 08:03 36176422                       /usr/lib64/libXext.so.6.4.0
3b3ce11000-3b3d011000 ---p 00011000 08:03 36176422                       /usr/lib64/libXext.so.6.4.0
3b3d011000-3b3d012000 rw-p 00011000 08:03 36176422                       /usr/lib64/libXext.so.6.4.0
3b3d200000-3b3d209000 r-xp 00000000 08:03 36176623                       /usr/lib64/libXrender.so.1.3.0
3b3d209000-3b3d408000 ---p 00009000 08:03 36176623                       /usr/lib64/libXrender.so.1.3.0
3b3d408000-3b3d409000 rw-p 00008000 08:03 36176623                       /usr/lib64/libXrender.so.1.3.0
3b43600000-3b436db000 r-xp 00000000 08:03 10616925                       /lib64/libkrb5.so.3.3
3b436db000-3b438db000 ---p 000db000 08:03 10616925                       /lib64/libkrb5.so.3.3
3b438db000-3b438e5000 r--p 000db000 08:03 10616925                       /lib64/libkrb5.so.3.3
3b438e5000-3b438e7000 rw-p 000e5000 08:03 10616925                       /lib64/libkrb5.so.3.3
3b43a00000-3b43a41000 r-xp 00000000 08:03 10616965                       /lib64/libgssapi_krb5.so.2.2
3b43a41000-3b43c41000 ---p 00041000 08:03 10616965                       /lib64/libgssapi_krb5.so.2.2
3b43c41000-3b43c42000 r--p 00041000 08:03 10616965                       /lib64/libgssapi_krb5.so.2.2
3b43c42000-3b43c44000 rw-p 00042000 08:03 10616965                       /lib64/libgssapi_krb5.so.2.2
3b43e00000-3b43e62000 r-xp 00000000 08:03 36186339                       /usr/lib64/libssl.so.1.0.1e
3b43e62000-3b44062000 ---p 00062000 08:03 36186339                       /usr/lib64/libssl.so.1.0.1e
3b44062000-3b44066000 r--p 00062000 08:03 36186339                       /usr/lib64/libssl.so.1.0.1e
3b44066000-3b4406d000 rw-p 00066000 08:03 36186339                       /usr/lib64/libssl.so.1.0.1e
3e64e00000-3e64e20000 r-xp 00000000 08:03 10616839                       /lib64/ld-2.12.so
3e6501f000-3e65021000 r--p 0001f000 08:03 10616839                       /lib64/ld-2.12.so
3e65021000-3e65022000 rw-p 00021000 08:03 10616839                       /lib64/ld-2.12.so
3e65022000-3e65023000 rw-p 00000000 00:00 0 
3e65200000-3e6538a000 r-xp 00000000 08:03 10616849                       /lib64/libc-2.12.so
3e6538a000-3e6558a000 ---p 0018a000 08:03 10616849                       /lib64/libc-2.12.so
3e6558a000-3e6558e000 r--p 0018a000 08:03 10616849                       /lib64/libc-2.12.so
3e6558e000-3e65590000 rw-p 0018e000 08:03 10616849                       /lib64/libc-2.12.so
3e65590000-3e65594000 rw-p 00000000 00:00 0 
3e65600000-3e65602000 r-xp 00000000 08:03 10616876                       /lib64/libdl-2.12.so
3e65602000-3e65802000 ---p 00002000 08:03 10616876                       /lib64/libdl-2.12.so
Aborted (core dumped)

I also tried some other index slices, and concluded that

flows = var[0:X ,100, [0, 1]]

doesn't throw this error for X < 64, but for 64 or more, it leads to this double free error. Also, if I change the second index, e.g.

flows = var[0:100, 0, [0,1]]

I also don't get an error.
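Since the crash aborts the whole interpreter, one way to narrow down the threshold is to run each read in a child process and bisect on X. The following is only a sketch: `PATH`, `VARNAME`, `read_ok` and `first_failing` are hypothetical names, the file path is a placeholder, and the bisection assumes that once a value of X fails, all larger values fail too (which matches the behaviour described above but has not been verified).

```python
import subprocess
import sys

# Hypothetical placeholders: substitute the real file path and variable name.
PATH = "state.nc"
VARNAME = "channelSurfacewaterChannelNeighborsFlowRate"

# Code executed in the child interpreter for a single read.
TEMPLATE = (
    "import netCDF4\n"
    "var = netCDF4.Dataset({path!r})[{name!r}]\n"
    "var[0:{x}, {j}, [0, 1]]\n"
)


def read_ok(x, j=100, path=PATH, name=VARNAME):
    """Run one read in a child process; True if the child exits cleanly."""
    code = TEMPLATE.format(path=path, name=name, x=x, j=j)
    # An abort in the child yields a non-zero return code instead of
    # killing this script.
    return subprocess.call([sys.executable, "-c", code]) == 0


def first_failing(ok, lo, hi):
    """Return the smallest x in (lo, hi] for which ok(x) is False.

    Assumes ok(lo) is True, ok(hi) is False, and that failures are
    monotonic in x.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

Calling `first_failing(read_ok, 1, 100)` on the affected machine would then report the smallest failing X for the second index 100; passing a different `j` through a wrapper repeats the search for other elements.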

I first encountered this with netCDF4 1.2.4, then upgraded to netCDF4 1.2.7 (with numpy 1.12.1); the error occurs with both.

Additional information from the netCDF4 module:

__has_nc_inq_format_extended__ = 1
__has_nc_inq_path__ = 1
__has_rename_grp__ = 1
__hdf5libversion__ = '1.8.18'
__netcdf4libversion__ = u'4.4.1'
__version__ = '1.2.7'

hellkite500 avatar May 16 '17 22:05 hellkite500

What platform? I believe there was a bug with big datasets on Windows in the netCDF C library version 4.4.1, which was fixed in 4.4.1.1.

dopplershift avatar May 16 '17 23:05 dopplershift
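A quick way to check whether an installation predates a given fix is to compare the reported C library version against the fixed release. This is a minimal sketch: `version_tuple` and `older_than` are hypothetical helpers, and the string "4.4.1" is the `__netcdf4libversion__` value reported above. Whether the Windows bug mentioned here is related to this crash is not established.

```python
import re


def version_tuple(text):
    """Parse the leading dotted integers of a version string into a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no numeric version in %r" % (text,))
    return tuple(int(part) for part in match.group(0).split("."))


def older_than(version, fixed_in):
    """True if version sorts strictly before fixed_in."""
    return version_tuple(version) < version_tuple(fixed_in)


# "4.4.1" is the library version reported in this issue;
# in a live session it would come from netCDF4.__netcdf4libversion__.
print(older_than("4.4.1", "4.4.1.1"))
```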

This is on a RHEL 6.8 machine.

hellkite500 avatar May 16 '17 23:05 hellkite500

Can you post the offending file somewhere, or at least a subset of it that is enough to trigger the error?

jswhit avatar May 17 '17 23:05 jswhit

I'll see what I can do to whittle down the file size and still reproduce the error. When I get that accomplished, I'll find some way to get it to you, though it could still be fairly large (current file sizes range from several hundred GB to several TB).

hellkite500 avatar May 18 '17 00:05 hellkite500
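For a sense of scale, the size of the variable and of a candidate subset can be estimated from the shape printed in the original report. This is a rough sketch: `nbytes` is a hypothetical helper, it counts uncompressed float64 values only, and the subset shape (100, 101, 8) is just an assumption chosen to cover the indices used in the failing read. Whether such a subset would still trigger the error is unknown.

```python
def nbytes(shape, itemsize=8):
    """Uncompressed size in bytes of an array with the given shape."""
    total = itemsize  # float64 values are 8 bytes each
    for dim in shape:
        total *= dim
    return total


# Full variable, shape taken from the printed "current shape".
full = nbytes((105121, 17587, 8))

# Hypothetical subset covering instances 0..99, channelElements 0..100
# and all 8 neighbors, i.e. everything the failing read touches.
subset = nbytes((100, 101, 8))

print(full / 1e9)    # size in GB
print(subset / 1e3)  # size in kB
```

The full variable comes to roughly 118 GB uncompressed, while the hypothetical subset is under 1 MB.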