netcdf-java icon indicating copy to clipboard operation
netcdf-java copied to clipboard

GDAL_NoData geotiff tag should be a string, not float

Open WeatherGod opened this issue 1 year ago • 6 comments

  • See https://gdal.org/drivers/raster/gtiff.html#nodata-value

Description of Changes

When greyscale is false, the data is scaled to the range of 1-255 before being written to the geotiff. The encoding value of 0 is reserved for missing/nodata values. The missing value is encoded in the GDAL_NoData geotiff tag, but we've been encoding it incorrectly all this time. This PR corrects that, as well as expands the geotiff tests to exercise both cases of greyscale being True and False.

PR Checklist

  • [ ] Link to any issues that the PR addresses
  • [ ] Add labels
  • [ ] Open as a draft PR until ready for review
  • [ ] Make sure GitHub tests pass
  • [x] Mark PR as "Ready for Review"

WeatherGod avatar Apr 19 '24 17:04 WeatherGod

Failing test is unrelated as it pertains to zarr, and it might be transient?

WeatherGod avatar Apr 19 '24 18:04 WeatherGod

Hi @WeatherGod - What versions of the GeoTiff Specification (1.0* or 1.1) are you targeting? (I'm not certain what netCDF-java currently supports but suspect it originally targeted 1.0.) It looks like GDAL supports both GeoTIFF 1.0 and 1.1, defaulting to 1.0 (see GEOTIFF_VERSION in GDAL docs Configuration Options section).

I'm not familiar with the differences. From the GDAL docs, maybe just "clear[ing up] ambiguities and fix[ing] inconsistencies". Either way, it's probably best to reference the specification documents than the GDAL documentation to ensure broad interoperability.

[*] FYI There's an illegible TOC button/image at the top of the v1.0 page that, embarrassingly, took me awhile to find.

ethanrd avatar Apr 19 '24 19:04 ethanrd

Actually, neither. GDAL_NoData is an extended TIFF tag that has been widely adopted. So much so that it is recognized by the Library of Congress: https://www.loc.gov/preservation/digital/formats/content/tiff_tags.shtml. It says it is ASCII encoded.

WeatherGod avatar Apr 19 '24 20:04 WeatherGod

By the way, I've also come across another set of bugs related to the encoding of the IFDEntry tags. It looks like it doesn't handle unsigned bytes correctly. None of that is tested (until my other PR exposed it).

WeatherGod avatar Apr 19 '24 20:04 WeatherGod

As for the IFDEntry bug, I can't exercise it explicitly with the current API, so I'm just going to have to punt that and have it get fixed by my other PR that exercises it by saving unsigned bytes data. Let me know if you need anything else for this PR.

WeatherGod avatar Apr 19 '24 21:04 WeatherGod

Sorry, I was trying to ask a more general question beyond the GDAL_NoData attribute/tag but didn't find a more general issue to comment on.

Support for "official" standards is a high-level goal for netCDF-Java (and Unidata more generally). Of course, full support for standards isn't always possible and community/implementation conventions are often needed (in addition to standards or as the main focus) for the broadest level of interperability (or simply because that's what gets implemented).

My question was, I guess, mainly about what specification you are using as a reference in your work on the netCDF-Java GeoTIFF code. It sounds like you are using GDAL and LoC GeoTIFF documentation as your main specification. So, I think I have my answer.

Thanks for all your work and contributions!

ethanrd avatar Apr 19 '24 23:04 ethanrd