[5.5.3+]: Failure to build netcdfAll with optional zarr

Open rschmunk opened this issue 3 years ago • 5 comments

Versions impacted by the bug

v5.x

What went wrong?

A user developing a data product asked whether Panoply could be modified to access data in zarr directories or zarr zip files. From the documentation, the first step appeared to be building a netcdfAll jar with the config modified to include cdm:zarr.

Using a netcdf-5.5.3 SNAPSHOT from January 2022, I succeeded in doing so by adding

netcdfAll project(':cdm:cdm-zarr')

at line 54 of fatJars.gradle. And when using the resulting netcdfAll jar I have been able to successfully open an example zarr zip file.

However, if I try to start from a later netcdf-5.5.3 SNAPSHOT (e.g., one from April), or the 5.5.3 release, or the current 5.5.4 SNAPSHOT, the build fails with a complaint about missing dependencies. As follows...

* What went wrong:
Could not determine the dependencies of task ':buildNetcdfAll'.
> Could not resolve all dependencies for configuration ':netcdfAll'.
   > Could not find com.fasterxml.jackson.core:jackson-core:.
     Required by:
         project : > project :cdm:cdm-zarr
   > Could not find com.fasterxml.jackson.core:jackson-databind:.
     Required by:
         project : > project :cdm:cdm-zarr

Is this due to a bug in the gradle config for cdm:zarr? Is there something else in the config that I need to edit in order to include cdm:zarr in the netcdfAll build? My knowledge of gradle remains sketchy enough that I wouldn't be surprised if it is the latter.

Also, while looking into this, I ran into the issue that ZarrHeader does not bother to apply dimension names that are provided via the _ARRAY_DIMENSIONS variable attribute. I'll try to upload a pull request to fix that soon, but I'd like to figure out the build problem first.

Relevant stack trace

No response

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed? Note: the test datasets are publicly accessible without restriction.

No

Code of Conduct

  • [X] I agree to follow the UCAR/Unidata Code of Conduct

rschmunk avatar Jul 07 '22 00:07 rschmunk

@rschmunk really glad people are starting to try out the Zarr package! I can reproduce the error you're getting. It seems to be related to transitive dependencies (or the loss of them) from upgrading the AWS SDK, and I'm troubleshooting it now.

As for the _ARRAY_DIMENSIONS attribute, it is not read because it's not part of the core Zarr spec (to my knowledge, it was introduced as an extension by XArray). Currently we only support pure Zarr v2, though we do have plans to add support for NCZarr and Zarr v3. We hadn't discussed other Zarr extension attributes, but we'd definitely welcome contributions!
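
For illustration only, a minimal sketch of how dimension names from an _ARRAY_DIMENSIONS attribute might be matched to an array's shape. It assumes the attribute has already been parsed into a list of strings; the class name, the method name, and the `dim_N` fallback naming are hypothetical and are not taken from netcdf-java.

```java
import java.util.List;

// Hypothetical helper, not part of netcdf-java.
final class ArrayDimensionsSketch {

  /**
   * Returns one dimension name per axis of the array.
   * Names from the attribute are used only when there is exactly one
   * name per axis; otherwise placeholder names are generated.
   */
  static String[] dimensionNames(List<String> arrayDimensions, int[] shape) {
    boolean usable = arrayDimensions != null && arrayDimensions.size() == shape.length;
    String[] names = new String[shape.length];
    for (int i = 0; i < shape.length; i++) {
      // "dim_" + i is an assumed placeholder, not the library's actual default
      names[i] = usable ? arrayDimensions.get(i) : "dim_" + i;
    }
    return names;
  }
}
```

With an attribute value of `["lat", "lon"]` and a 2D shape, this yields `lat` and `lon`; with no attribute, it falls back to the placeholder names.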

haileyajohnson avatar Jul 07 '22 18:07 haileyajohnson

@haileyajohnson, No idea where you might be in figuring this out, but as an FYI, even when using the older NJ snapshot with Zarr included, I am experiencing various errors trying to read/plot data. This happens even when the sample data are uncompressed.

rschmunk avatar Jul 14 '22 04:07 rschmunk

What kind of errors?

haileyajohnson avatar Jul 14 '22 19:07 haileyajohnson

In a couple of sample zarr cases of what is supposed to be lon-lat gridded data, I can plot the 2D data, albeit not geo-referenced because the default dimension naming prevents creation of a coordinate system.

But... I cannot make a line plot of either the lon or lat coordinate vars. In one case, there is an NPE percolating up from RandomAccessFile. In the other, there is a ClassCastException trying to cast a String to a Number that is being thrown in IospHelper.makePrimitiveArray.

rschmunk avatar Jul 15 '22 01:07 rschmunk

Comparing .zarray attributes in the 2 sample uncompressed files, I believe the String to Number ClassCastException case is occurring because the variable has a fill_value attribute of "NAN" rather than something that looks numeric.

Confirmed. Way down in IospHelper, it is trying to cast the passed fill value object to a Number of the appropriate data type. Does not work with a fill value that is a String "NAN", nor the other acceptable String choices of "Infinity" or "-Infinity".
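
A minimal sketch of the kind of normalization that would avoid the cast, assuming the fill value arrives as either a Number or one of the String tokens mentioned above. The class and method names are hypothetical and not part of netcdf-java, and matching is case-insensitive so that both "NaN" and "NAN" are accepted.

```java
// Hypothetical helper, not part of netcdf-java.
final class FillValueSketch {

  /** Converts a parsed fill_value object to a Number, or null if absent. */
  static Number toNumber(Object fillValue) {
    if (fillValue == null) {
      return null; // no fill value defined
    }
    if (fillValue instanceof Number) {
      return (Number) fillValue; // already numeric, nothing to do
    }
    if (fillValue instanceof String) {
      String s = ((String) fillValue).trim();
      if (s.equalsIgnoreCase("NaN")) {
        return Double.NaN;
      }
      if (s.equalsIgnoreCase("Infinity")) {
        return Double.POSITIVE_INFINITY;
      }
      if (s.equalsIgnoreCase("-Infinity")) {
        return Double.NEGATIVE_INFINITY;
      }
      // Any other string must look numeric; otherwise NumberFormatException is thrown
      return Double.valueOf(s);
    }
    throw new IllegalArgumentException("Unsupported fill value type: " + fillValue.getClass());
  }
}
```

The caller would still need to narrow the returned value to the variable's actual data type, for example with `floatValue()` for a 32-bit float array.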

rschmunk avatar Jul 23 '22 03:07 rschmunk

I'm going to close this issue given #1098, but definitely encourage you to open new issues as you inevitably encounter more issues with Zarr.

haileyajohnson avatar Oct 17 '22 16:10 haileyajohnson

@haileyajohnson, I referred to a couple problems reading Zarr variables above, but I was thinking of submitting those as separate issues. One of them I am trying to write a patch for.

rschmunk avatar Oct 18 '22 02:10 rschmunk

Separate issues (and patches) would be really helpful, thank you!

haileyajohnson avatar Oct 18 '22 14:10 haileyajohnson