gdal icon indicating copy to clipboard operation
gdal copied to clipboard

Add NSIDCbin (National Snow and Ice Data Centre Sea Ice Concentrations) raster driver

Open mdsumner opened this issue 3 years ago • 10 comments

Supported by GDAL for read access. This format is a raw binary format for the Nimbus-7 SMMR and DMSP SSM/I-SSMIS Passive Microwave Data sea ice concentrations. There are daily and monthly maps in the north and south hemispheres supported by this driver.

This is very early, I wanted to submit to get visibility and as a place to answer remaining questions.

A simple test and example file is included.

Original discussion query was made here: https://lists.osgeo.org/pipermail/gdal-dev/2015-July/042280.html

Tasklist

  • [x] resolve issue of which auth - change in time, no use non-deprecated: https://nsidc.org/data/user-resources/help-center/guide-nsidcs-polar-stereographic-projection
  • [x] Add test case(s)
  • [x] Add documentation (include examples to download file/s, hit ftp with vsi:// etc)
  • [x] Review
  • [x] Adjust for comments
  • [x] All CI builds and checks have passed
  • [ ] check idiom/usage or need for members p, osSRS, sHeader, eRasterDataType I just don't entirely understand this yet

optional extras, future work

  • [ ] resolve scaling options and missing/zero (https://lists.osgeo.org/pipermail/gdal-dev/2022-August/056144.html - undecided whether Byte, Float32 default, or more complex dataset to allow config)
  • [ ] explore and possibly include other NSIDC binary files (e.g. https://nsidc.org/data/nsidc-0046/versions/4)

mdsumner avatar Aug 11 '22 04:08 mdsumner

for the cppcheck warnings about unused private members (https://github.com/OSGeo/gdal/runs/7781642042?check_suite_focus=true), you can use patterns like

CPL_IGNORE_RET_VAL(poDS->sHeader.foo);

rouault avatar Aug 11 '22 08:08 rouault

thank you @rouault ! very much appreciated feedback

mdsumner avatar Aug 12 '22 00:08 mdsumner

GetSpatialRef (gdalinfo crashes though)

The crash is completely unrelated:

$ valgrind gdalinfo nt_20220409_f18_nrt_s.bin 
==701859== Invalid write of size 4
==701859==    at 0x5387E0C: NSIDCbinRasterBand::GetNoDataValue(int*) (nsidcbindataset.cpp:185)
==701859==    by 0x5A98616: GDALNoDataMaskBand::GDALNoDataMaskBand(GDALRasterBand*) (gdalnodatamaskband.cpp:68)
==701859==    by 0x5A4993E: GDALRasterBand::GetMaskBand() (gdalrasterband.cpp:6828)
==701859==    by 0x5A4A0C2: GDALRasterBand::GetMaskFlags() (gdalrasterband.cpp:7003)
==701859==    by 0x5A4A145: GDALGetMaskFlags (gdalrasterband.cpp:7024)
==701859==    by 0x5B86690: GDALInfo (gdalinfo_lib.cpp:1241)
==701859==    by 0x10990C: main (gdalinfo_bin.cpp:231)
==701859==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

The issue is that you dereference pbSuccess without checking it for != nullptr before (callers may pass a nullptr pbSuccess pointer)

rouault avatar Aug 16 '22 11:08 rouault

Thanks @rouault your guidance is very very good, much appreciated.

For now I'm leaving as Byte with full range of values (special values > 250). I really want this for use with the warper I will likely have to do what was suggested on gdal-dev:

NSIDCbinRasterBand should probably have a RawRasterBand member variable instead, and implement IReadBlock() t

https://lists.osgeo.org/pipermail/gdal-dev/2022-August/056144.html

That post and the feedback really cleared up for me how things work. 🙏

mdsumner avatar Aug 16 '22 12:08 mdsumner

https://github.com/OSGeo/gdal/runs/7857751581?check_suite_focus=true:

 /__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:55: WARNING: Literal block ends without a blank line; unexpected unindent.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:60: ERROR: Unexpected indentation.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:66: WARNING: Definition list ends without a blank line; unexpected unindent.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:89: WARNING: Definition list ends without a blank line; unexpected unindent.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:90: WARNING: Block quote ends without a blank line; unexpected unindent.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:94: ERROR: Unexpected indentation.
/__w/gdal/gdal/doc/source/drivers/raster/nsidcbin.rst:101: WARNING: Block quote ends without a blank line; unexpected unindent.

rouault avatar Aug 17 '22 09:08 rouault

We now have linting rules of autotest python files You should install & setup the pre-commit utility with (once for all for the repository):

python -m pip install pre-commit
pre-commit install

And to make it run then on modified files: pre-commit run --all-files.

rouault avatar Aug 22 '22 18:08 rouault

excellent thanks again 🙏

mdsumner avatar Aug 23 '22 10:08 mdsumner

this format has been replaced by a (decent afaics) NetCDF upgrade:

https://nsidc.org/data/nsidc-0051/versions/2

I'd still like this legacy driver to be included, and I'm prepared to continue maintenance - we use this extensively and it will be helpful for me to have it as alternative pathway during various upgrades going on - but I can understand if this changes the decision on inclusion.

mdsumner avatar Sep 05 '22 10:09 mdsumner

I'd still like this legacy driver to be included

if there are significant amount of past data using the legacy format and likely not being converted and published into the new netCDF format, that probably justifies keeping the driver

rouault avatar Sep 05 '22 17:09 rouault

Thanks Even, I think they'll all be covered by the new form - will take a few days for our sync to catch up but I'll check a few things and report back more fully before I shut this down.

mdsumner avatar Sep 06 '22 04:09 mdsumner

@mdsumner any update ?

rouault avatar Sep 23 '22 15:09 rouault

still doing checks and things I'd like to compare - there are particular netcdf oddities - for example they seem to store days without data as a shell netcdf, and there's a few of these things to investigate - but overall it would be very handy to have this for ongoing work, it will be some time until we can confidently move completely to the new files, and there will be many copies of these older version files around, so - I'd still like it to be included in a future GDAL release. 🙏

mdsumner avatar Sep 23 '22 20:09 mdsumner

The main problem I foresee is emails from folks telling us there's a netcdf version now 😄 - I would like this to go in, but the fact is I can do my work in a dev fork locally so I'm happy if you decide not to include it. I'm committed to maintaining it, it's a good lever for me to learn the next level of RasterBand customization. 🙏

mdsumner avatar Sep 24 '22 02:09 mdsumner

Another pitch, I really want to have it available for ease of comparing virtual mdim approaches - is it really better with netcdf ... etc - and what does Zarr (or similar) creation look like for identical data from different formats, an active area of investigation for us coming up.

mdsumner avatar Sep 24 '22 02:09 mdsumner

I'd still like it to be included in a future GDAL release

ok, makes sense

rouault avatar Sep 24 '22 10:09 rouault

hey I'd really like to get this in to 3.7, sorry for leaving this note for so long, but is there style issues or real problems left over? my rationale has changed: it's essentially, we use these binary files every day, and we have to migrate to a new netdf form of them (the binaries aren't being created for new daily data ), but having them side by side in gdal will make the transition much easier, and all the ensuing discussions ...

apologies I had meant to do more homework (I have been learning a lot), but if there's no huge blocker I'd love to see this get into 3.7 , I don't care about the band scaling stuff

mdsumner avatar Feb 15 '23 14:02 mdsumner

  • There were test failures: https://github.com/OSGeo/gdal/actions/runs/3739616110/jobs/6347049360
  • This needs to be rebased on top of latest HEAD "git fetch origin master; git rebase origin/master" (assuming "origin" is the name of the git remove pointing to OSGeo/gdal)

rouault avatar Feb 15 '23 15:02 rouault

I struggled with the rebase, feel like I haven't done it right, I was incrementally merging in master ...

Not given up I just can't finish with the checks rn 🙏

mdsumner avatar Feb 16 '23 10:02 mdsumner

your PR is in a wrong state with hundreds of files appearing modified. You probably git merge instead of git rebase. If you can't find the right git incantation to have a single git commit on top of latest master, it would probably be best that you copy the files you added/edited somewhere, and recreate a clean branch and issue a new PR.

rouault avatar Feb 16 '23 12:02 rouault

your PR is in a wrong state with hundreds of files appearing modified. You probably git merge instead of git rebase.

Even a git merge (with the right branch) wouldn't indicate so many files on Github. After the merge, most files should be identical. Did you pull the latest HEAD of master from this repo? Did you perhaps rebase on an outdated local master brauch?

dg0yt avatar Feb 16 '23 12:02 dg0yt

yes sorry, I really messed it up, I never use rebase and don't really understand it

I'll start again

mdsumner avatar Feb 16 '23 12:02 mdsumner