xarray icon indicating copy to clipboard operation
xarray copied to clipboard

correctly encode/decode _FillValues/missing_values/dtypes for packed data

Open kmuehlbauer opened this issue 1 year ago • 5 comments

  • [x] Closes #7691
  • [x] Closes #5597
  • [x] Closes #2304
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

This resurrects some of #7654. It takes special care of correctly handling dtypes when encoding/decoding packed data

kmuehlbauer avatar Feb 06 '24 08:02 kmuehlbauer

Still an issue with some example in the docs. Probably non-conforming NaN _FillValue for packed data in that one example.

kmuehlbauer avatar Feb 06 '24 15:02 kmuehlbauer

@mankoff if you have time, can you check your workload against this PR, or review it please?

dcherian avatar Feb 06 '24 15:02 dcherian

@mankoff if you have time, can you check your workload against this PR, or review it please?

It would be great if Ken could have a look here. At least I tried to follow the changes to CF which followed after the discussion in https://github.com/cf-convention/cf-conventions/issues/374.

kmuehlbauer avatar Feb 07 '24 21:02 kmuehlbauer

On 2024-02-07 at 04:54 +13, Deepak Cherian @.***> wrote...

@mankoff if you have time, can you check your workload against this PR, or review it please?

I'm reading this via slow satellite in Antarctica. I'm 'offline'(ish) for another week, then have 2 months of emails and work to catch up on. Reviewing this PR will be low priority.

mankoff avatar Feb 07 '24 21:02 mankoff

I'm reading this via slow satellite in Antarctica. I'm 'offline'(ish) for another week, then have 2 months of emails and work to catch up on. Reviewing this PR will be low priority.

Thanks Ken, for letting us know. Greetings to Antarctica, hope time is good there! I'll try to ping some folks from the linked issues to get more input here.

kmuehlbauer avatar Feb 07 '24 21:02 kmuehlbauer

@kmuehlbauer shall we merge? A number of numpy warnings (and presumably numpy 2 failures) are from this type of dtype maniputation. 🤞🏾 this PR fixes them all! :)

dcherian avatar Mar 15 '24 04:03 dcherian

xarray/tests/test_conventions.py::test_decode_cf_with_conflicting_fill_missing_value
  /home/runner/work/xarray/xarray/xarray/conventions.py:286: SerializationWarning: variable 't' has non-conforming '_FillValue' nan defined, dropping '_FillValue' entirely.
    var = coder.decode(var, name=name)

should silence this warning

dcherian avatar Mar 15 '24 05:03 dcherian

@dcherian I'm good to merge this, we still can iterate later if complaints are coming in.

kmuehlbauer avatar Mar 15 '24 06:03 kmuehlbauer

@dcherian There are still some warnings which could be fixed/silenced with this PR. All try to get behind it now.

Update: so some warnings (cast RuntimeWarnings) are not directly connected to this PR but to some ill-defined test setup (mixing int/uint) with resulting casting issues. Best resolved in separate PR.

kmuehlbauer avatar Mar 15 '24 06:03 kmuehlbauer

LGTM. Thanks!

dcherian avatar Mar 15 '24 16:03 dcherian