netcdf-c

Issue with ncdump and DAP4?

Open ndp-opendap opened this issue 9 months ago • 17 comments

netcdf library version 4.9.0 of Mar 18 2025 16:42:02 $

Dataset URL: http://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5

Here's the ncdump interaction:

ncdump -h "dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5"
(d4meta.c:499) Error:Inferred type name conflict
ncdump: dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5: NetCDF: String match to name in use

Downloading either the native granule or asking for the service to rewrite the granule as netcdf-4 produces files that ncdump can traverse.

ndp-opendap avatar Mar 19 '25 17:03 ndp-opendap

I will test this. Does anything leap out at you, @DennisHeimbigner?

WardF avatar Mar 19 '25 23:03 WardF

@lesserwhirls tested this file with the TDS and (I think) got the same result. Is that the case, @lesserwhirls?

ndp-opendap avatar Mar 20 '25 00:03 ndp-opendap

Unfortunately, the TDS does not even generate a DMR for this file (dataset link), so netCDF-C does not have a chance through no fault of its own.

That said, netCDF-Java is able to open the test.opendap.org dap4 endpoint, but fails when trying to read data from any of the variables. I will open separate tickets on the appropriate repos for both of these issues.

lesserwhirls avatar Mar 20 '25 01:03 lesserwhirls

On initial investigation, I am seeing the following:

  1. ncdump is giving the error NC_NAMEINUSE ("NetCDF: String match to name in use")
  2. It appears that the name causing the problem is "measurement_to_detector_row_table_t"
  3. The metadata for the file (http://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5) does not contain the name "measurement_to_detector_row_table_t". I checked with both h5dump and ncdump.

Can anyone else verify?

So it appears that a name was changed. But I do not know what or where.

DennisHeimbigner avatar Mar 20 '25 04:03 DennisHeimbigner

I see that the DMR has two instances of something close:

curl -s "http://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5.dmr" > foo.dmr

cat foo.dmr | grep -A 6 measurement_to_detector_row_table
                <Structure name="measurement_to_detector_row_table">
                    <Int16 name="det_start_row"/>
                    <Int16 name="det_end_row"/>
                    <Dim name="/BAND7_IRRADIANCE/STANDARD_MODE/time"/>
                    <Dim name="/BAND7_IRRADIANCE/STANDARD_MODE/scanline"/>
                    <Dim name="/BAND7_IRRADIANCE/STANDARD_MODE/pixel"/>
                </Structure>
--
                <Structure name="measurement_to_detector_row_table">
                    <Int16 name="det_start_row"/>
                    <Int16 name="det_end_row"/>
                    <Dim name="/BAND8_IRRADIANCE/STANDARD_MODE/time"/>
                    <Dim name="/BAND8_IRRADIANCE/STANDARD_MODE/scanline"/>
                    <Dim name="/BAND8_IRRADIANCE/STANDARD_MODE/pixel"/>
                </Structure>

And that chimes with what I see with h5dump:

h5dump -n 1 S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc | grep "measurement_to_detector_row_table" 
 dataset    /BAND7_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table
 attribute  /BAND7_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/DIMENSION_LIST
 attribute  /BAND7_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/_FillValue
 attribute  /BAND7_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/comment
 attribute  /BAND7_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/units
 dataset    /BAND8_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table
 attribute  /BAND8_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/DIMENSION_LIST
 attribute  /BAND8_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/_FillValue
 attribute  /BAND8_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/comment
 attribute  /BAND8_IRRADIANCE/STANDARD_MODE/INSTRUMENT/measurement_to_detector_row_table/units

But sadly, nothing about exactly what @DennisHeimbigner saw: measurement_to_detector_row_table_t
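
The grep above can also be done structurally. The following is a minimal sketch that walks a DMR-like document and reports every Structure name that occurs under more than one group. The embedded XML is a simplified, hypothetical fragment that only mirrors the shape of the output above (a real DMR carries a namespace and many more elements), and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, hypothetical DMR-like fragment; not the real DMR of the granule.
SAMPLE_DMR = """
<Dataset name="sample">
  <Group name="BAND7_IRRADIANCE">
    <Group name="STANDARD_MODE">
      <Group name="INSTRUMENT">
        <Structure name="measurement_to_detector_row_table">
          <Int16 name="det_start_row"/>
          <Int16 name="det_end_row"/>
        </Structure>
      </Group>
    </Group>
  </Group>
  <Group name="BAND8_IRRADIANCE">
    <Group name="STANDARD_MODE">
      <Group name="INSTRUMENT">
        <Structure name="measurement_to_detector_row_table">
          <Int16 name="det_start_row"/>
          <Int16 name="det_end_row"/>
        </Structure>
      </Group>
    </Group>
  </Group>
</Dataset>
"""


def local_tag(elem):
    """Return the tag without any '{namespace}' prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def structure_paths(elem, path=""):
    """Yield (name, full_path) for each Structure directly inside a group."""
    for child in elem:
        name = child.get("name", "")
        if local_tag(child) == "Group":
            yield from structure_paths(child, f"{path}/{name}")
        elif local_tag(child) == "Structure":
            yield name, f"{path}/{name}"


def repeated_structures(dmr_text):
    """Map each Structure name used in more than one place to its paths."""
    by_name = defaultdict(list)
    for name, full_path in structure_paths(ET.fromstring(dmr_text.strip())):
        by_name[name].append(full_path)
    return {n: p for n, p in by_name.items() if len(p) > 1}


dups = repeated_structures(SAMPLE_DMR)
for name, paths in dups.items():
    print(name, "->", paths)
```

Pointing `repeated_structures` at the contents of foo.dmr should list the same two locations that h5dump shows, assuming the real DMR nests its groups the way this fragment does.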

ndp-opendap avatar Mar 20 '25 14:03 ndp-opendap

@kyang2014 said: For the issue:

ncdump -h "dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5"
(d4meta.c:499) Error:Inferred type name conflict
ncdump: dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5: NetCDF: String match to name in use

After checking the dmrpp file and the original file, I found that the original file contains the same structure (same structure name and field names) under different groups. netCDF-C apparently does not support this, which needs to be fixed. I reproduced the issue with a simple example, described in the next comment: http://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_same_names.h5

ndp-opendap avatar Mar 26 '25 17:03 ndp-opendap

@kyang2014 The h5dump output of the original file is:

h5dump compound_group_same_names.h5
HDF5 "compound_group_same_names.h5" {
GROUP "/" {
   DATASET "DSC_memb_array" {
      DATATYPE  H5T_COMPOUND {
         H5T_STD_I32LE "Orbit";
         H5T_ARRAY { [3] H5T_IEEE_F32LE } "Temperature";
      }
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
      (0): {
            1153,
            [ 53.23, 53.87, 54.12 ]
         },
      (1): {
            1184,
            [ 55.12, 55.95, 56.25 ]
         }
      }
   }
   GROUP "g" {
      DATASET "DSC_memb_array" {
         DATATYPE  H5T_COMPOUND {
            H5T_STD_I32LE "Orbit";
            H5T_ARRAY { [3] H5T_IEEE_F32LE } "Temperature";
         }
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): {
               2153,
               [ 63.23, 63.87, 64.12 ]
            },
         (1): {
               2184,
               [ 65.12, 65.95, 66.25 ]
            }
         }
      }
   }
}

Note that there are two structure (compound datatype) variables with the same name, DSC_memb_array: one under the root group and one under the group /g.

The dmr output of this file can be found under:

http://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_same_names.h5.dmr.xml

Running ncdump on this file generates the same error message.

ncdump -h dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_same_names.h5
(d4meta.c:499) Error:Inferred type name conflict
ncdump: dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_same_names.h5: NetCDF: String match to name in use

However, for a different example in which the structure names differ between groups, ncdump -h works.

ncdump -h dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_simple.h5

netcdf compound_group_simple {
types:
  compound DSC_memb_array_t {
    int Orbit ;
    float Temperature(3) ;
  }; // DSC_memb_array_t
dimensions:
	_Anonymous3 = 3 ;
	_Anonymous2 = 2 ;
variables:
	DSC_memb_array_t DSC_memb_array(_Anonymous2) ;

group: g {
  types:
    compound DSC_t {
      int Orbit ;
      float Temperature ;
    }; // DSC_t
  variables:
  	DSC_t DSC(_Anonymous2) ;
  } // group g
}
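
The two cases above can be reproduced with a toy model. In the ncdump output, each structure variable gets a compound type named after the variable plus "_t" (DSC becomes DSC_t). The sketch below assumes, purely for illustration, that those inferred names are checked against a single flat table; this is a guess that reproduces the symptom, not a description of what d4meta.c actually does. `NameInUse`, `inferred_type_name`, and `define_types` are hypothetical names.

```python
class NameInUse(Exception):
    """Stands in for NC_NAMEINUSE in this toy model."""


def inferred_type_name(var_name):
    # The "_t" suffix matches the type names printed by ncdump (DSC -> DSC_t).
    return var_name + "_t"


def define_types(structures, scope_by_group):
    """Register one inferred type per (group, variable) pair.

    With scope_by_group=False all type names share one flat table;
    with scope_by_group=True each group has its own table.
    """
    defined = {}
    for group, var in structures:
        tname = inferred_type_name(var)
        key = (group, tname) if scope_by_group else tname
        if key in defined:
            raise NameInUse(f"{tname} already defined for {defined[key]}")
        defined[key] = f"{group.rstrip('/')}/{var}"
    return defined


# compound_group_simple.h5: different structure names in / and /g
simple = [("/", "DSC_memb_array"), ("/g", "DSC")]
# compound_group_same_names.h5: the same structure name in / and /g
same_names = [("/", "DSC_memb_array"), ("/g", "DSC_memb_array")]

print(sorted(define_types(simple, scope_by_group=False)))
try:
    define_types(same_names, scope_by_group=False)
except NameInUse as err:
    print("conflict:", err)
print(len(define_types(same_names, scope_by_group=True)), "types when scoped by group")
```

In this model the conflict disappears once the lookup key includes the group, which is how the netCDF-4 data model scopes user-defined types.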

ndp-opendap avatar Mar 26 '25 17:03 ndp-opendap

@kyang2014 I did another test to see whether ncdump can read the data of the simple structure file:

ncdump dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_simple.h5

but I got the following error:
```
checksumhack=0
Error:Checksum mismatch: DSC_memb_array

NetCDF: DAP failure
Location: file vardata.c; line 478

```

ndp-opendap avatar Mar 26 '25 17:03 ndp-opendap

After some debugging, I found there may be an issue in netCDF's checksum verification for structure types, and possibly for string types (tested separately). In libdap4/d4data.c, the following code generates the checksum error:

    if(!meta->ignorechecksums) {
        for(i=0;i<nclistlength(toplevel);i++) {
            NCD4node* var = (NCD4node*)nclistget(toplevel,i);
            if(var->data.remotechecksummed) {
                if(var->data.localchecksum != var->data.remotechecksum) {
                    nclog(NCLOGERR,"Checksum mismatch: %s\n",var->name);
                    ret = NC_EDAP;
                    goto done;
                }
                /* Also verify checksum attribute */
                if(var->data.checksumattr) {
                    if(var->data.attrchecksum != var->data.remotechecksum) {
                        nclog(NCLOGERR,"Attribute Checksum mismatch: %s\n",var->name);
                        ret = NC_EDAP;
                        goto done;
                    }
                }
            }
        }
    }

If I disable this block of code (essentially telling netCDF not to verify the checksum), I get the correct output:

./ncdump dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_simple2.h5
netcdf compound_group_simple2 {
dimensions:
	_Anonymous2 = 2 ;

group: g {
  types:
    compound DSC_t {
      int Orbit ;
      float Temperature ;
    }; // DSC_t
  variables:
  	DSC_t DSC(_Anonymous2) ;
  data:

checksumhack=0
   DSC = {1213, 33.56}, {1234, 34.78} ;
  } // group g
}

The h5dump output of the original file is:

h5dump compound_group_simple2.h5
HDF5 "compound_group_simple2.h5" {
GROUP "/" {
   GROUP "g" {
      DATASET "DSC" {
         DATATYPE  H5T_COMPOUND {
            H5T_STD_I32LE "Orbit";
            H5T_IEEE_F32LE "Temperature";
         }
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): {
               1213,
               33.56
            },
         (1): {
               1234,
               34.78
            }
         }
      }
   }
}
}

So if we ignore the checksum feature, ncdump can correctly dump the data of this simple HDF5 file.
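
The logic of the C excerpt quoted from libdap4/d4data.c can be paraphrased as follows. This is a sketch under assumptions: it computes the local checksum with CRC32 via `zlib.crc32`, on the assumption that this is the algorithm DAP4 uses, and the `VarData` fields are hypothetical stand-ins for the `var->data` members in the excerpt, not netCDF-C's real data structures.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


class DapChecksumError(Exception):
    """Stands in for the NC_EDAP return code in the C excerpt."""


@dataclass
class VarData:
    name: str
    payload: bytes                         # serialized bytes received for the variable
    remote_checksum: Optional[int] = None  # checksum sent alongside the data, if any
    attr_checksum: Optional[int] = None    # checksum carried as an attribute, if any


def verify_checksums(variables, ignore_checksums=False):
    """Raise DapChecksumError on the first mismatch, unless checks are disabled."""
    if ignore_checksums:
        return
    for var in variables:
        if var.remote_checksum is None:
            continue  # the server sent no checksum, so there is nothing to compare
        local = zlib.crc32(var.payload) & 0xFFFFFFFF
        if local != var.remote_checksum:
            raise DapChecksumError(f"Checksum mismatch: {var.name}")
        if var.attr_checksum is not None and var.attr_checksum != var.remote_checksum:
            raise DapChecksumError(f"Attribute Checksum mismatch: {var.name}")


payload = b"\x00\x01\x02\x03"
good = VarData("DSC", payload, remote_checksum=zlib.crc32(payload))
bad = VarData("DSC_memb_array", payload, remote_checksum=0)  # deliberately wrong

verify_checksums([good])                        # passes silently
verify_checksums([bad], ignore_checksums=True)  # skipped, like disabling the C block
try:
    verify_checksums([good, bad])
except DapChecksumError as err:
    print(err)
```

A mismatch in this model means only that the client and the server disagree about the checksum; as the experiment above shows, the data may still decode correctly, for example if the two sides checksum different byte ranges.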

ndp-opendap avatar Mar 26 '25 17:03 ndp-opendap

The checksum also fails for string data. Check this one:

ncdump -v "/PROCESSOR/job_configuration" "dap4://test.opendap.org/opendap/GESDISC/S5P_OFFL_L1B_RA_BD8_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.dmrpp"

The error message:

checksumhack=0
Error:Checksum mismatch: job_configuration
NetCDF: DAP failure
Location: file vardata.c; line 478
   job_configuration = % 

If I ignore the checksum feature, I can see the data after a long wait (minutes):

job_configuration = "Processing_Mode = OFFL", "Threads = 9", 
      "algo.DataDictionaryCheck.file = /mnt/sw/IPF_S5P_L01b/current/cfg/trl01b.ddcheck.xml", 
      "application.argc = 1", ......

kyang2014 avatar Mar 26 '25 20:03 kyang2014

I checked the URL you sent, and it seems to work for me using the current main branch of netcdf-c. What version of netcdf-c are you using?

DennisHeimbigner avatar Mar 26 '25 21:03 DennisHeimbigner

But there is a problem with deciding checksums: see this at the spec repository: https://github.com/OPENDAP/dap4-specification/discussions/6

DennisHeimbigner avatar Mar 26 '25 21:03 DennisHeimbigner

> I checked the URL you sent, and it seems to work for me using the current main branch of netcdf-c. What version of netcdf-c are you using?

I am using the 4.9.0 release.

kyang2014 avatar Mar 27 '25 11:03 kyang2014

I think this issue is now resolved.

It appears that this was fixed by a combination of making DAP4 checksums truly optional in Hyrax and a change to the netcdf-c library in version 4.9.3.

Our current deployment to test.opendap.org provides (correctly) optional DAP4 checksums, and when tested against 4.9.3 we see (via Sean Arms):

Interesting. I get that checksum error with 4.9.2, but here is what I get with 4.9.3:

$ ncdump "dap4://test.opendap.org/opendap/nasa-ngap/structure_tests/compound_group_simple.h5?dap4.checksum=true"
netcdf compound_group_simple {
types:
  compound DSC_memb_array_t {
    int Orbit ;
    float Temperature(3) ;
  }; // DSC_memb_array_t
dimensions:
	_Anonymous_Dim_3 = 3 ;
	_Anonymous_Dim_2 = 2 ;
variables:
	DSC_memb_array_t DSC_memb_array(_Anonymous_Dim_2) ;
data:

 DSC_memb_array = {1153, {53.23, 0, 0}}, {1113029345, {54.12, 0, 0}} ;

group: g {
  types:
    compound DSC_t {
      int Orbit ;
      float Temperature ;
    }; // DSC_t
  variables:
  DSC_t DSC(_Anonymous_Dim_2) ;
  data:

   DSC = {1213, 33.56}, {1234, 34.78} ;
  } // group g
}

So perhaps a bug that was successfully wrangled in the C library?

Cheers,

Sean
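
One observation on the output above: the root-group values for DSC_memb_array do not match the h5dump listing earlier in the thread (1153 with [53.23, 53.87, 54.12], and 1184 with [55.12, 55.95, 56.25]). A reading that is consistent with the printed numbers, offered as a guess and not as a diagnosis of netCDF-C, is that the records were decoded as if Temperature were a scalar, so each record is read as 8 bytes instead of 16. The sketch below packs the two records with Python's `struct` module (little-endian is an arbitrary choice here) and decodes them both ways.

```python
import struct

# The two records as h5dump printed them: int32 Orbit, float32 Temperature[3].
records = [(1153, (53.23, 53.87, 54.12)), (1184, (55.12, 55.95, 56.25))]
buf = b"".join(struct.pack("<i3f", orbit, *temps) for orbit, temps in records)


def decode(data, fmt, count):
    """Decode `count` consecutive records laid out according to `fmt`."""
    size = struct.calcsize(fmt)
    return [struct.unpack_from(fmt, data, i * size) for i in range(count)]


right = decode(buf, "<i3f", 2)  # 16-byte records: Orbit plus three floats
wrong = decode(buf, "<if", 2)   # 8-byte records: Temperature treated as a scalar

for orbit, temp in wrong:
    print(orbit, round(temp, 2))
```

With the short record layout, the second "Orbit" is the bit pattern of the float32 value 53.87 reinterpreted as an int32, which is 1113029345, and the first Temperature elements come out as 53.23 and 54.12. Those are the values ncdump printed, so the structure with an array member may still be decoded incorrectly in 4.9.3 even though the checksum error is gone.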

ndp-opendap avatar Oct 07 '25 17:10 ndp-opendap

Great, thank you!

WardF avatar Oct 07 '25 19:10 WardF

Hi folks,

I work at the GES DISC, and we are trying to promote that granule (S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc) to cloud OPeNDAP.

On my end, when I call ncdump -h with dap4, I still get that error. I am using netcdf library version 4.9.3 of Sep 15 2025 23:01:51 $.

ncdump -h "dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5"

Returns

ncdump: dap4://test.opendap.org:8080/opendap/GESDISC/S5P_OFFL_L1B_IR_SIR_20180430T001950_20180430T020120_02818_01_010000_20180430T035011.nc.h5: NetCDF: String match to name in use

eni-awowale avatar Oct 08 '25 15:10 eni-awowale

I assume there is an attempt to define two things with the same name. Let me see if I can find it.

DennisHeimbigner avatar Oct 08 '25 22:10 DennisHeimbigner