netcdf4-python
HDF5 error infinite loop closing library
Hi,
Using the latest versions of netcdf4-python/NetCDF/HDF5, creating a NetCDF file with the code below works, but when I try to ncdump it I get this error:
ncdump: toto.nc: NetCDF: HDF error HDF5: infinite loop closing library D,G,A,S,T,F,FD,P,FD,P,FD,P,E,E,SL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL
(with the latest versions of the HDF5 and NetCDF libraries I get only "ncdump: toto.nc: NetCDF: HDF error")
It seems very weird because if I comment out one of the lines creating dim1, attr1 or attr3, or if I change the content/length of the attributes, then ncdump works...
from netCDF4 import Dataset
dst = Dataset('toto.nc', 'w', clobber=True, format='NETCDF4_CLASSIC')
dst.createDimension('dim1', 1)
dst_var = dst.createVariable('var', 'float32')
dst_var.setncattr('attr1', 'a' * 199)
dst_var.setncattr('attr2', ','.join(['123456'] * 4328 + ['12345678'] * 3908))
dst_var.setncattr('attr3', ','.join(['b' * 28] * 8236))
dst.close()
In fact I hit this error with a program which reformats several thousand NetCDF files, and only 3 of them are affected; the code above is the minimal example I managed to extract to reproduce the bug.
The problem is also reproduced with the official Ubuntu 14.04.5 LTS NetCDF/HDF5 packages, as well as with some more recent locally built versions.
I also tried to reproduce the problem using only C code, but the program below works, so I think the problem is on the Python side:
#include <string.h>
#include "netcdf.h"

int main(int argc, char **argv) {
    int i, j, ncid, dimid, varid;

    nc_create("toto2.nc", NC_CLASSIC_MODEL | NC_NETCDF4 | NC_CLOBBER, &ncid);
    nc_def_dim(ncid, "dim1", 1, &dimid);
    nc_def_var(ncid, "var", NC_FLOAT, 0, NULL, &varid);

    char s[(int) 1e6];

    /* attr1: 199 'a' characters */
    for (i = 0; i < 199; i++)
        s[i] = 'a';
    s[i] = '\0';
    nc_put_att_text(ncid, varid, "attr1", strlen(s), s);

    /* attr2: 4328 "123456" fields followed by 3908 "12345678" fields, comma-separated */
    i = 0;
    for (j = 0; j < 4328; j++) {
        strcpy(s + i, "123456,");
        i += strlen("123456,");
    }
    for (j = 0; j < 3908; j++) {
        strcpy(s + i, "12345678,");
        i += strlen("12345678,");
    }
    s[i - 1] = '\0';  /* drop the trailing comma */
    nc_put_att_text(ncid, varid, "attr2", strlen(s), s);

    /* attr3: 8236 fields of 28 'b' characters, comma-separated */
    i = 0;
    for (j = 0; j < 8236; j++) {
        strcpy(s + i, "bbbbbbbbbbbbbbbbbbbbbbbbbbbb,");
        i += strlen("bbbbbbbbbbbbbbbbbbbbbbbbbbbb,");
    }
    s[i - 1] = '\0';
    nc_put_att_text(ncid, varid, "attr3", strlen(s), s);

    nc_enddef(ncid);
    nc_close(ncid);
    return 0;
}
Since Python uses the C API, and only the C API, to write the file, there should not be a way to create a file that does this. (At least that's my thinking.) @WardF ?
If setncattr_string is used (so that nc_put_att_string is used), and the format is changed to NETCDF4, it appears to work. So, it looks to be related to nc_put_att_text.
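For reference, a rough sketch of that variant, keeping the attribute contents from the original report (the file name toto_string.nc and the 'NETCDF4' format string are my choices, not from the original snippet):
from netCDF4 import Dataset

# Same structure as the original repro, but writing the attributes with
# setncattr_string (which goes through nc_put_att_string) and a non-classic format.
dst = Dataset('toto_string.nc', 'w', clobber=True, format='NETCDF4')
dst.createDimension('dim1', 1)
dst_var = dst.createVariable('var', 'float32')
dst_var.setncattr_string('attr1', 'a' * 199)
dst_var.setncattr_string('attr2', ','.join(['123456'] * 4328 + ['12345678'] * 3908))
dst_var.setncattr_string('attr3', ','.join(['b' * 28] * 8236))
dst.close()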
I'm not sure this proves that the problem is related to nc_put_att_text, because if I comment out the dimension creation, or if I change the variable to use the dimension instead of being a scalar, it also works. This erratic behavior makes me think it looks like memory corruption. Maybe memory checking with valgrind could help. Another idea: maybe the file is not corrupted, but ncdump has a bug handling it?
h5dump also gives the 'infinite loop' error, so I don't think it's specific to ncdump.
Interesting that it also works if you change the format to 'NETCDF3' or 'NETCDF4'; ncdump only chokes if the format is 'NETCDF4_CLASSIC'.
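A quick sketch of that experiment (assuming 'NETCDF3' means netcdf4-python's 'NETCDF3_CLASSIC' format string, and with hypothetical file names; the attribute contents are the ones from the original report):
from netCDF4 import Dataset

attrs = {
    'attr1': 'a' * 199,
    'attr2': ','.join(['123456'] * 4328 + ['12345678'] * 3908),
    'attr3': ','.join(['b' * 28] * 8236),
}
# Reportedly only the 'NETCDF4_CLASSIC' file trips up ncdump; the other two are fine.
for fmt in ('NETCDF3_CLASSIC', 'NETCDF4', 'NETCDF4_CLASSIC'):
    dst = Dataset('toto_' + fmt + '.nc', 'w', clobber=True, format=fmt)
    dst.createDimension('dim1', 1)
    var = dst.createVariable('var', 'float32')
    for name, value in attrs.items():
        var.setncattr(name, value)
    dst.close()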