netcdf4-python

Cannot create nested compound types when creating netCDF file in memory

Open nmassey001 opened this issue 6 years ago • 3 comments

I've run into a problem with the relatively new feature of creating a netCDF4 file in memory and then writing that memory to a file when the dataset is closed. Nested compound types, i.e. a compound type that contains another compound type, do not seem to work in this mode.

The code below illustrates the problem.

The first block of the code produces an invalid netCDF file. Running ncdump on that file gives the error: ncdump: test_raw_nc.nc: test_raw_nc.nc: NetCDF: HDF error

from netCDF4 import Dataset
import numpy as np

res_deg = 0.5

# dimension sizes must be integers (shape is not used further in this example)
shape = [365, 60, int(180 / res_deg), int(360 / res_deg)]

################################################################################
# this block of code does not create a valid netCDF file
# writing the nested compound types to a netCDF file in memory, then writing the
# memory to disk

ncd = Dataset(
    "inmemory.nc", mode='w',
    format="NETCDF4", memory=0
)

# create a Subarray datatype
Subarray = np.dtype([("ncvar", "b", 256),
                     ("file", "S2048"),
                     ("format", "S16"),
                     ("shape", "i4", (4, ))])

# create a partition datatype
Partition = np.dtype([("index", "i4", (4, )),
                      ("location", "i4", (4, 2)),
                      ("subarray", Subarray)
                     ])

subarray_type = ncd.createCompoundType(Subarray, "Subarray")
part_type = ncd.createCompoundType(Partition, "Partition")

ncd.history = "Test of CFA-0.5 format"
byt = ncd.close()

with open("test_raw_nc.nc", "wb") as fh:
    fh.write(byt)

################################################################################
# this block of code does create a valid netCDF file
# writing the nested compound types straight to the disk

ncd = Dataset(
    "test_file_nc.nc", mode='w',
    format="NETCDF4"
)

# create a Subarray datatype
Subarray = np.dtype([("ncvar", "b", 256),
                     ("file", "S2048"),
                     ("format", "S16"),
                     ("shape", "i4", (4, ))])

# create a partition datatype
Partition = np.dtype([("index", "i4", (4, )),
                      ("location", "i4", (4, 2)),
                      ("subarray", Subarray)
                     ])

subarray_type = ncd.createCompoundType(Subarray, "Subarray")
part_type = ncd.createCompoundType(Partition, "Partition")

ncd.history = "Test of CFA-0.5 format"
ncd.close()

################################################################################
# as a compromise, this block of code does create a valid netCDF file
# writing non-nested compound types out to memory, then writing the memory to
# disk

ncd = Dataset(
    "inmemory.nc", mode='w',
    format="NETCDF4", memory=0
)

# create a partition and subarray datatype all in one
Partition = np.dtype([("index", "i4", (4, )),
                      ("location", "i4", (4, 2)),
                      ("ncvar", "b", 256),
                      ("file", "S2048"),
                      ("format", "S16"),
                      ("shape", "i4", (4, ))
                     ])

part_type = ncd.createCompoundType(Partition, "Partition")

ncd.history = "Test of CFA-0.5 format"
byt = ncd.close()

with open("test_raw_2_nc.nc", "wb") as fh:
    fh.write(byt)

################################################################################
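
The flattening done by hand in the third block could be automated. The following is a minimal sketch, assuming a helper named `flatten_compound` (a hypothetical name, not part of netCDF4-python or numpy) that inlines nested compound fields into their parent. It only handles plain nested compounds: a field that is an array of compounds is left as it is, and field names are assumed not to collide once inlined.

```python
import numpy as np


def _field_spec(name, field_dtype):
    """Build a (name, base[, shape]) tuple accepted by np.dtype."""
    if field_dtype.shape:
        # Array-valued field, e.g. ("index", "i4", (4,))
        return (name, field_dtype.base, field_dtype.shape)
    return (name, field_dtype)


def flatten_compound(dtype):
    """Inline nested compound fields into one flat compound dtype.

    Hypothetical helper for illustration only.
    """
    flat_fields = []
    for name in dtype.names:
        field_dtype = dtype.fields[name][0]
        if field_dtype.names is not None:
            # Nested compound: pull its (already flattened) fields up a level.
            inner = flatten_compound(field_dtype)
            for inner_name in inner.names:
                flat_fields.append(
                    _field_spec(inner_name, inner.fields[inner_name][0])
                )
        else:
            flat_fields.append(_field_spec(name, field_dtype))
    # np.dtype raises ValueError if two inlined fields share a name.
    return np.dtype(flat_fields)


# The nested types from the example above.
Subarray = np.dtype([("ncvar", "b", 256),
                     ("file", "S2048"),
                     ("format", "S16"),
                     ("shape", "i4", (4, ))])
Partition = np.dtype([("index", "i4", (4, )),
                      ("location", "i4", (4, 2)),
                      ("subarray", Subarray)])

flat_partition = flatten_compound(Partition)

# Both dtypes are packed and have the same field order, so existing nested
# records can be reinterpreted as flat records without copying.
records = np.zeros(2, dtype=Partition)
records["subarray"]["format"] = b"netCDF"
flat_records = records.view(flat_partition)
```

The resulting `flat_partition` has the same fields as the hand-written Partition type in the third block, so it could be passed to createCompoundType in the same way. I have not checked this against the in-memory mode beyond what the third block already shows.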

nmassey001, Sep 16 '19 13:09

This and #971 are almost certainly issues with the C library. The Python interface for in-memory datasets is just a very thin wrapper around the C library calls.

jswhit, Sep 16 '19 16:09

Thanks @jswhit. I've recreated the errors in C and submitted issues for both of these to the Unidata netcdf-c repo.

nmassey001, Sep 17 '19 12:09

https://github.com/Unidata/netcdf-c/issues/1489

jswhit, Sep 17 '19 12:09