netcdf-c

ncdump DAP4 response fails to handle the attribute name that contains special characters

Open kyang2014 opened this issue 9 months ago • 4 comments

When using "ncdump -h" to access the following file via DAP4 with dmrpp: http://test.opendap.org/opendap/GESDISC/S5P_NRTI_L2__O3_TCL_20250320T235705_20250324T222325_38546_03_020701_20250325T000000.nc

It issues the following error (note the .dmrpp suffix after .nc):

./ncdump -h dap4://test.opendap.org/opendap/GESDISC/S5P_NRTI_L2__O3_TCL_20250320T235705_20250324T222325_38546_03_020701_20250325T000000.nc.dmrpp
(d4meta.c:410) ./ncdump: dap4://test.opendap.org/opendap/GESDISC/S5P_NRTI_L2__O3_TCL_20250320T235705_20250324T222325_38546_03_020701_20250325T000000.nc.dmrpp: NetCDF: Name contains illegal characters

However, ncdump can access the netCDF-4 file in the local file system. Further investigation shows that the attribute name ":gmd:title" causes the issue.

Note: netCDF-C DAP4 doesn't like the ":" before gmd:title.

After renaming this attribute from ":gmd:title" to "_gmd:title", I can use netCDF-C DAP4 to dump the header of this file. See this:

 ./ncdump -h dap4://test.opendap.org/opendap/GESDISC/S5P_NRTI_L2__O3_TCL_20250320T235705_20250324T222325_38546_03_020701_20250325T000000.nc.mod_attr.dmrpp
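
The rename workaround can be sketched in C as follows. This is only an illustration: the issue does not say how the attribute was actually renamed, and `replace_leading_colon` is a hypothetical helper, not part of netCDF-C.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a netCDF-C function).
 * If `name` starts with ':', overwrite that first character with '_'
 * in place. Returns true if the name was changed, false otherwise. */
bool replace_leading_colon(char *name)
{
    if (name != NULL && name[0] == ':') {
        name[0] = '_';   /* ":gmd:title" becomes "_gmd:title" */
        return true;
    }
    return false;
}
```

Only the leading character is touched; the colon inside "gmd:title" is left alone, matching the rename described above.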

I am using netCDF-4.9.0 and also the latest checkout from GitHub (4.10.0-development).

kyang2014 avatar Mar 31 '25 16:03 kyang2014

OK, the test for this is in libdispatch/dstring.c. It checks that names begin with a-z | A-Z | 0-9 | _. However, I cannot find where this is documented. A Google search for "legal netcf attribute names" does show this rule in the AI-generated answer, so I assume it is documented somewhere. I am experimenting with lifting this restriction, but I suspect that doing so will break ncgen parsing.
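
A minimal sketch of the first-character rule described above, assuming only the ASCII classes listed in this comment. It is not the actual code in libdispatch/dstring.c, and `first_char_is_legal` is a hypothetical name chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the real netCDF-C name check.
 * Returns true if the first byte of `name` is a-z, A-Z, 0-9, or '_'.
 * Assumption: only ASCII is considered; the UTF-8 rules in the
 * library are outside the scope of this sketch. */
bool first_char_is_legal(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return false;   /* empty names are rejected in this sketch */

    unsigned char c = (unsigned char)name[0];
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}
```

Under this rule ":gmd:title" is rejected because of its leading colon, while "_gmd:title" passes.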

DennisHeimbigner avatar Mar 31 '25 21:03 DennisHeimbigner

OK, as I suspected, it fails on ncgen. I need to make sure the limitation is documented, but changing it is probably not worth it.

DennisHeimbigner avatar Mar 31 '25 22:03 DennisHeimbigner

Hi Dennis @DennisHeimbigner - The restriction to netCDF object names is documented in the NUG in the netCDF Classic Format section and elsewhere (though I don't think the other locations are up-to-date with the BNF). The restriction was greatly relaxed when support for UTF-8 was added with netCDF-C 3.6.3 and 4.0. However, it doesn't appear that the relaxed rules support a colon (':') as an initial character.

On the other hand, the relaxed rules allow for huge numbers of new Unicode punctuation characters, so I'm not sure the restriction on an initial colon (and other ASCII punctuation) still makes sense. But that is probably something for a separate discussion.

ethanrd avatar Apr 01 '25 19:04 ethanrd

It appears that a leading colon would cause problems for ncgen parsing. But of course we could precede it with a backslash escape character.
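
The backslash idea could look roughly like the sketch below. `escape_leading_colon` is a hypothetical helper written for illustration; it is not an existing netCDF-C or ncgen function, and whether ncgen would accept the escaped form is exactly the open question here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of netCDF-C).
 * Copies `name` into `out`, prefixing a backslash if the name
 * starts with ':'. Returns 0 on success, -1 if an argument is
 * NULL or `out` is too small for the result. */
int escape_leading_colon(const char *name, char *out, size_t outsize)
{
    if (name == NULL || out == NULL)
        return -1;

    size_t len = strlen(name);
    size_t extra = (name[0] == ':') ? 1 : 0;   /* room for the backslash */

    if (len + extra + 1 > outsize)             /* +1 for the NUL terminator */
        return -1;

    if (extra)
        out[0] = '\\';
    memcpy(out + extra, name, len + 1);        /* copies the NUL as well */
    return 0;
}
```

With this sketch, ":gmd:title" would be written out as "\:gmd:title", and names without a leading colon are copied unchanged.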

DennisHeimbigner avatar Apr 01 '25 20:04 DennisHeimbigner