segfault after update: H5D__btree_decode_key: Assertion `0 == (tmp_offset % layout->dim[u])' failed

Open marcosrdac opened this issue 4 years ago • 7 comments

NetCDF4: v1.5.5.1
Python: v3.8.7
OS: Arch Linux

I am working with 2D satellite data in the form of NetCDF4 files. With version 1.5.3 I was able to open every image product I had (all of these images were exported using the same process in the ESA SNAP toolboxes).

After upgrading the NetCDF4 library to version 1.5.5.1, I can no longer read many of the data files (roughly half of them). In those cases the interpreter aborts:

python: H5Dbtree.c:695: H5D__btree_decode_key: Assertion `0 == (tmp_offset % layout->dim[u])' failed.
zsh: abort (core dumped)  python segmentation.py

A simple script like this triggers the error on many files.

from os import listdir
from os.path import join
import netCDF4 as nc
import numpy as np


folder = 'original'
ls = [join(folder, f) for f in listdir(folder) if f.endswith('.nc')]

band_choices = {
    'Sigma0_IW1_VV_db',
    'Sigma0_IW2_VV_db',
    'Intensity_IW2_VV_db',
    'Sigma0_VV_db',
    'Sigma0_db',
}

for f in ls:
    ncd = nc.Dataset(f)
    for band in band_choices:
        # only touch bands that actually exist in this file
        if band in ncd.variables:
            img = ncd.variables[band]
            print(img.shape)
            print(img.size)
            print(img[:])  # this line causes the error on v1.5.5.1

Reinstalling the old version (v1.5.3) solves my problem for now, but I really want to understand what changed so that I can fix this properly.
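Because a failed C-level assertion aborts the whole interpreter, one way to find out exactly which files are affected is to read each file in a child process and inspect its exit status. The following is only a sketch under that assumption; `run_isolated`, `read_ok`, and `READ_SNIPPET` are hypothetical names for illustration, not part of netCDF4.

```python
import subprocess
import sys

# Child script: open one file and fully read every variable in it.
READ_SNIPPET = """
import sys
import netCDF4 as nc
with nc.Dataset(sys.argv[1]) as ncd:
    for var in ncd.variables.values():
        var[...]
"""


def run_isolated(snippet, *args):
    """Run a Python snippet in a child interpreter and return its exit code.

    On POSIX a negative code means the child was killed by a signal,
    for example SIGABRT raised by a failed C assertion.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet, *args],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


def read_ok(path):
    """Return True if the file at `path` can be read without the child crashing."""
    return run_isolated(READ_SNIPPET, path) == 0
```

Calling `read_ok(f)` for each entry of `ls` would give a list of failing files without taking down the parent process.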

Thanks in advance.

marcosrdac avatar Jan 21 '21 12:01 marcosrdac

Nothing in the Python interface changed that would have caused this; it must be due to the HDF5 version that is linked. Can you run the 'checkversion.py' script in the source directory and post what it reports for v1.5.3 and v1.5.5.1 on your system?

jswhit avatar Jan 21 '21 14:01 jswhit

Also, you could try closing the Dataset at the end of the loop (ncd.close()).
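One way to guarantee the close happens, even if reading raises a Python exception, is to open the Dataset in a `with` block. A minimal sketch, assuming the band names from the script above; `first_matching_band` is a hypothetical helper name, not a netCDF4 function.

```python
def first_matching_band(path, band_choices):
    """Return (name, data) for the first band in `band_choices` found in the file.

    The Dataset is closed automatically when the `with` block exits.
    Returns (None, None) if none of the bands is present.
    """
    import netCDF4 as nc  # imported lazily so the helper can be defined without it

    with nc.Dataset(path) as ncd:
        for band in band_choices:
            if band in ncd.variables:
                # read the data into memory before the file is closed
                return band, ncd.variables[band][:]
    return None, None
```

This does not protect against an abort inside the HDF5 library itself; it only ensures the file handle is released on normal exits and Python-level errors.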

jswhit avatar Jan 21 '21 14:01 jswhit

> Nothing in the Python interface changed that would have caused this; it must be due to the HDF5 version that is linked. Can you run the 'checkversion.py' script in the source directory and post what it reports for v1.5.3 and v1.5.5.1 on your system?

The two outputs are:

netcdf4-python version: 1.5.3
HDF5 lib version:       1.10.4
netcdf lib version:     4.6.3
numpy version           1.19.5

netcdf4-python version: 1.5.5.1
HDF5 lib version:       1.12.0
netcdf lib version:     4.7.4
numpy version           1.19.5

Indeed, the HDF5 library versions differ. I did suspect the error might come from there, since the latest changelogs did not mention any related changes.
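The report above can also be produced without the source tree by reading the version attributes that the netCDF4 module exposes. A minimal sketch, assuming `netCDF4` and `numpy` are importable; `linked_versions` is a hypothetical helper name, and it returns `None` when either package is missing.

```python
def linked_versions():
    """Return version strings for netCDF4, the C libraries it links, and numpy."""
    try:
        import netCDF4
        import numpy
    except ImportError:
        return None
    return {
        "netcdf4-python": netCDF4.__version__,
        "HDF5 lib": netCDF4.__hdf5libversion__,
        "netcdf lib": netCDF4.__netcdf4libversion__,
        "numpy": numpy.__version__,
    }


if __name__ == "__main__":
    versions = linked_versions()
    if versions is None:
        print("netCDF4 or numpy is not installed")
    else:
        for name, value in versions.items():
            print(f"{name} version: {value}")
```

Running this in each environment makes it easy to compare which HDF5 and netcdf-c builds a given install is actually linked against.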

Also...

> Also, you could try closing the Dataset at the end of the loop (ncd.close()).

Thanks. It did not make my code work in 1.5.5.1, but it looks like good practice to do so.

marcosrdac avatar Jan 21 '21 14:01 marcosrdac

Did you build from source, install a binary wheel with pip, or are you using conda?

jswhit avatar Jan 21 '21 15:01 jswhit

I installed it with pip (pip install netcdf4).

marcosrdac avatar Jan 21 '21 15:01 marcosrdac

OK, then I guess the HDF5 library bundled in that binary wheel doesn't play nice with your system. The only thing I can suggest at this point is to install HDF5 and netcdf-c separately and then build netcdf4-python from source against those installed libs, or use conda.

jswhit avatar Jan 21 '21 16:01 jswhit

There also seems to be a netcdf4-python package for your flavor of Linux:

https://archlinux.org/packages/community/x86_64/python-netcdf4/

jswhit avatar Jan 21 '21 16:01 jswhit