netcdf4-python
Cannot create nested compound types when creating netCDF file in memory
I've run into a problem with the relatively new ability to create a netCDF4 file in memory and then write that memory to a file when the dataset is closed. It does not seem possible to use nested compound types, i.e. a compound type that contains another compound type.
The code below illustrates the problem.
The first block of the code produces an invalid netCDF file. Running ncdump on that file gives the error:
ncdump: test_raw_nc.nc: test_raw_nc.nc: NetCDF: HDF error
from netCDF4 import Dataset
import numpy as np
res_deg = 0.5
shape = [365, 60, int(180 / res_deg), int(360 / res_deg)]
################################################################################
# this block of code does not create a valid netCDF file
# writing the nested compound types to a netCDF file in memory, then writing the
# memory to disk
ncd = Dataset(
"inmemory.nc", mode='w',
format="NETCDF4", memory=0
)
# create a Subarray datatype
Subarray = np.dtype([("ncvar", "b", 256),
("file", "S2048"),
("format", "S16"),
("shape", "i4", (4, ))])
# create a partition datatype
Partition = np.dtype([("index", "i4", (4, )),
("location", "i4", (4, 2)),
("subarray", Subarray)
])
subarray_type = ncd.createCompoundType(Subarray, "Subarray")
part_type = ncd.createCompoundType(Partition, "Partition")
ncd.history = "Test of CFA-0.5 format"
byt = ncd.close()
with open("test_raw_nc.nc", "wb") as fh:
    fh.write(byt)
################################################################################
# this block of code does create a valid netCDF file
# writing the nested compound types straight to the disk
ncd = Dataset(
"test_file_nc.nc", mode='w',
format="NETCDF4"
)
# create a Subarray datatype
Subarray = np.dtype([("ncvar", "b", 256),
("file", "S2048"),
("format", "S16"),
("shape", "i4", (4, ))])
# create a partition datatype
Partition = np.dtype([("index", "i4", (4, )),
("location", "i4", (4, 2)),
("subarray", Subarray)
])
subarray_type = ncd.createCompoundType(Subarray, "Subarray")
part_type = ncd.createCompoundType(Partition, "Partition")
ncd.history = "Test of CFA-0.5 format"
ncd.close()
################################################################################
# as a compromise, this block of code does create a valid netCDF file
# writing non-nested compound types out to memory, then writing the memory to
# disk
ncd = Dataset(
"inmemory.nc", mode='w',
format="NETCDF4", memory=0
)
# create a partition and subarray datatype all in one
Partition = np.dtype([("index", "i4", (4, )),
("location", "i4", (4, 2)),
("ncvar", "b", 256),
("file", "S2048"),
("format", "S16"),
("shape", "i4", (4, ))
])
part_type = ncd.createCompoundType(Partition, "Partition")
ncd.history = "Test of CFA-0.5 format"
byt = ncd.close()
with open("test_raw_2_nc.nc", "wb") as fh:
    fh.write(byt)
################################################################################
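The flattening used in the third block can also be done programmatically rather than by hand. The following is a minimal sketch, assuming only numpy. `flatten_compound` is a hypothetical helper, not part of netCDF4 or numpy, and it keeps the leaf field names unchanged, so it only works when those names do not collide.

```python
import numpy as np


def flatten_compound(dtype):
    """Return a flat structured dtype holding the leaf fields of a nested one.

    Hypothetical helper for illustration; not part of netCDF4 or numpy.
    """
    flat_fields = []

    def collect(dt):
        for name in dt.names:
            field = dt.fields[name][0]
            # Array-valued fields expose (base dtype, shape) via .subdtype.
            base, shape = field.subdtype if field.subdtype else (field, ())
            if base.names is not None:
                if shape:
                    raise ValueError(f"array of compounds not handled: {name}")
                collect(base)  # nested compound: pull its fields up a level
            elif shape:
                flat_fields.append((name, base, shape))
            else:
                flat_fields.append((name, base))

    collect(dtype)
    # np.dtype raises ValueError if two leaf fields share a name.
    return np.dtype(flat_fields)


# Same types as in the reproduction above.
Subarray = np.dtype([("ncvar", "b", 256),
                     ("file", "S2048"),
                     ("format", "S16"),
                     ("shape", "i4", (4, ))])
Partition = np.dtype([("index", "i4", (4, )),
                      ("location", "i4", (4, 2)),
                      ("subarray", Subarray)])

flat = flatten_compound(Partition)
```

The resulting `flat` dtype has the same fields as the hand-written `Partition` in the third block, so it could be passed to `ncd.createCompoundType(flat, "Partition")` in the same way.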
This and #971 are almost certainly issues with the C library. The Python interface for in-memory datasets is just a very thin wrapper around the C library calls.
Thanks @jswhit. I've recreated the errors in C and submitted issues for both of these to the Unidata netcdf-c repo.
https://github.com/Unidata/netcdf-c/issues/1489