erddap
Complex ncml test case EDDGridFromNcFilesTests.testNcml failing
Describe the bug
Getting ahead of things a little bit since #142 is not yet merged, but wanted a dedicated issue to track EDDGridFromNcFilesTests.testNcml issues.
Currently, with netcdf-java 5.5.3 dependencies, the following error results when loading a complex union ncML file in EDDGridFromNcFilesTests.testNcml:
java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because "location" is null
at thredds.inventory.zarr.MFileZip$Provider.canProvide(MFileZip.java:200)
at thredds.inventory.MFiles.create(MFiles.java:37)
at ucar.nc2.internal.ncml.AggDataset.<init>(AggDataset.java:74)
at ucar.nc2.internal.ncml.Aggregation.makeDataset(Aggregation.java:453)
at ucar.nc2.internal.ncml.Aggregation.addExplicitDataset(Aggregation.java:136)
at ucar.nc2.internal.ncml.NcmlReader.readAgg(NcmlReader.java:1476)
at ucar.nc2.internal.ncml.NcmlReader.readNetcdf(NcmlReader.java:521)
at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:478)
at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:397)
at ucar.nc2.internal.ncml.NcmlNetcdfFileProvider.open(NcmlNetcdfFileProvider.java:24)
at ucar.nc2.dataset.NetcdfDatasets.openProtocolOrFile(NetcdfDatasets.java:431)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:152)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:135)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:118)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:104)
at gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testNcml(EDDGridFromNcFilesTests.java:155)
This was originally reported to the netcdf-java mailing list by Bob Simons in July 2022.
The error stems from the loading of MFileProvider implementations via Java service loading. The canProvide(String location) method is called on each implementation, and in 5.5.3 one particular provider, MFileZip, doesn't check location for null before using it (https://github.com/Unidata/netcdf-java/blob/v5.5.3/cdm/zarr/src/main/java/thredds/inventory/zarr/MFileZip.java#L200).
This bug was fixed in October 2022 with this commit.
https://github.com/Unidata/netcdf-java/commit/19f9476ed8e605e04ab6013a90ba59dbbb2d17d3#diff-05b863736a1a2b21b57d0a498f731991e82c8b824dc2235d54f3f9d5f257eb80R200
However, this change hasn't yet been included in any release. I asked about the possibility of a 5.5.4 release here: https://github.com/Unidata/netcdf-java/discussions/1332
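To make the failure mode concrete, here is a minimal, self-contained sketch of a provider lookup where one provider dereferences a null location. The `Provider` interface, the `".zip"` argument, and the class and field names are hypothetical stand-ins chosen for illustration; they are not the actual netcdf-java types or the exact code in MFileZip. Only the call to `String.contains` on a null `location` is taken from the stack trace above.

```java
import java.util.List;

public class CanProvideSketch {
    // Hypothetical stand-in for a provider interface; not the real netcdf-java type.
    interface Provider {
        boolean canProvide(String location);
    }

    // Mirrors the reported failure mode: calls contains() on location without a null check.
    // The ".zip" argument is an assumption made for this sketch.
    static final Provider UNGUARDED = location -> location.contains(".zip");

    // Null-guarded variant, in the spirit of the fix (the real commit may differ in detail).
    static final Provider GUARDED = location -> location != null && location.contains(".zip");

    // Asks each provider in turn, roughly as a service-loading lookup would.
    static boolean anyCanProvide(List<Provider> providers, String location) {
        for (Provider p : providers) {
            if (p.canProvide(location)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The guarded provider simply declines a null location.
        System.out.println(anyCanProvide(List.of(GUARDED), null));
        try {
            anyCanProvide(List.of(UNGUARDED), null);
        } catch (NullPointerException e) {
            System.out.println("unguarded provider threw NullPointerException");
        }
    }
}
```

The point of the sketch is that a single unguarded provider is enough to break the lookup for every caller that passes a null location, regardless of which provider would eventually have handled it.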
However, even with this fix (testing with netcdf-java 5.5.4-SNAPSHOT), this test produces another error:
java.lang.IllegalStateException: Shared Dimension fakeDim0 = 4320; does not exist in a parent group
at ucar.nc2.Variable.<init>(Variable.java:1847)
at ucar.nc2.dataset.VariableDS.<init>(VariableDS.java:879)
at ucar.nc2.dataset.VariableDS$Builder.build(VariableDS.java:1134)
at ucar.nc2.dataset.VariableDS$Builder.build(VariableDS.java:985)
at ucar.nc2.Group.<init>(Group.java:924)
at ucar.nc2.Group.<init>(Group.java:44)
at ucar.nc2.Group$Builder.build(Group.java:1410)
at ucar.nc2.Group$Builder.build(Group.java:1402)
at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:2576)
at ucar.nc2.dataset.NetcdfDataset.<init>(NetcdfDataset.java:1611)
at ucar.nc2.dataset.NetcdfDataset.<init>(NetcdfDataset.java:88)
at ucar.nc2.dataset.NetcdfDataset$Builder.build(NetcdfDataset.java:1812)
at ucar.nc2.dataset.NetcdfDataset$Builder.build(NetcdfDataset.java:1687)
at ucar.nc2.internal.ncml.NcmlReader$NcmlElementReader.open(NcmlReader.java:1605)
at ucar.nc2.internal.ncml.NcmlReader$NcmlElementReader.open(NcmlReader.java:1586)
at ucar.nc2.dataset.NetcdfDatasets.acquireFile(NetcdfDatasets.java:383)
at ucar.nc2.internal.ncml.AggDataset.acquireFile(AggDataset.java:114)
at ucar.nc2.internal.ncml.AggregationUnion.buildNetcdfDataset(AggregationUnion.java:30)
at ucar.nc2.internal.ncml.Aggregation.build(Aggregation.java:349)
at ucar.nc2.internal.ncml.NcmlReader.readNetcdf(NcmlReader.java:528)
at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:483)
at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:385)
at ucar.nc2.internal.ncml.NcmlNetcdfFileProvider.open(NcmlNetcdfFileProvider.java:24)
at ucar.nc2.dataset.NetcdfDatasets.openProtocolOrFile(NetcdfDatasets.java:431)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:152)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:135)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:118)
at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:104)
at gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testNcml(EDDGridFromNcFilesTests.java:155)
This is related to the attempted renaming of the fakeDim dimensions in two of the three aggregated files:
$ grep dimension src/test/resources/largeFiles/viirs/MappedMonthly4km/m4.ncml
<dimension name="latitude" orgName="fakeDim0" />
<dimension name="longitude" orgName="fakeDim1" />
I haven't thoroughly checked whether src/test/resources/largeFiles/viirs/MappedMonthly4km/m4.ncml is fully legal ncml, but the fake dimensions are indeed in the aggregated data files, and the target dimensions already exist in the LatLon.nc file:
$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/LatLon.nc | jq .dimensions
{
"latitude": 4320,
"longitude": 8640
}
$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/V20120012012031.L3m_MO_NPP_CHL_chlor_a_4km | jq .dimensions
{
"fakeDim0": 4320,
"fakeDim1": 8640
}
$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/V20120322012060.L3m_MO_NPP_CHL_chlor_a_4km | jq .dimensions
{
"fakeDim0": 4320,
"fakeDim1": 8640
}
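The comparison made by eye in the listings above can be written down as a small check. This is only a sketch: the class and method names are hypothetical, the dimension sizes are copied from the ncks output, and the rename pairs come from the two `<dimension>` elements in m4.ncml. It confirms that the names and sizes line up, and says nothing about whether netcdf-java accepts the NcML.

```java
import java.util.Map;

public class DimensionRenameCheck {
    // Returns true when every original dimension exists in the data file, every
    // target dimension exists in the reference file, and the sizes agree.
    static boolean renameIsConsistent(Map<String, Integer> dataDims,
                                      Map<String, Integer> targetDims,
                                      Map<String, String> renames) {
        for (Map.Entry<String, String> rename : renames.entrySet()) {
            Integer originalSize = dataDims.get(rename.getKey());
            Integer targetSize = targetDims.get(rename.getValue());
            if (originalSize == null || targetSize == null || !originalSize.equals(targetSize)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sizes as reported by ncks for LatLon.nc and the two data files.
        Map<String, Integer> latLon = Map.of("latitude", 4320, "longitude", 8640);
        Map<String, Integer> dataFile = Map.of("fakeDim0", 4320, "fakeDim1", 8640);
        // orgName -> name, as in the <dimension> elements of m4.ncml.
        Map<String, String> renames = Map.of("fakeDim0", "latitude", "fakeDim1", "longitude");
        System.out.println(renameIsConsistent(dataFile, latLon, renames));
    }
}
```

Since the sizes match, the rename looks plausible on its face, which suggests the IllegalStateException comes from how the renamed shared dimensions are resolved during the build rather than from a size mismatch in the data.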
To Reproduce
Steps to reproduce the behavior:
Run test case EDDGridFromNcFilesTests.testNcml (example: mvn test -Dtest=EDDGridFromNcFilesTests#testNcml)
Expected behavior
Test passes
Desktop (please complete the following information):
- OS: Linux (Debian 12)