satpy Fix data type when getting a line offset for a segmented hrit

Fixes the following uint8 overflow when loading Himawari HRIT data distributed via EUMETCast:

Traceback (most recent call last):
  File "/Work/meteo/satpy-tests/./test5.py", line 18, in <module>
    scn = Scene(filenames=filenames, reader='ahi_hrit')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/scene.py", line 155, in __init__
    self._readers = self._create_reader_instances(filenames=filenames,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/scene.py", line 176, in _create_reader_instances
    return load_readers(filenames=filenames,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/__init__.py", line 580, in load_readers
    reader_instance.create_storage_items(
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 617, in create_storage_items
    return self.create_filehandlers(files, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 1180, in create_filehandlers
    created_fhs = super(GEOSegmentYAMLReader, self).create_filehandlers(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 629, in create_filehandlers
    filehandlers = self._new_filehandlers_for_filetype(filetype_info,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 612, in _new_filehandlers_for_filetype
    return list(filtered_iter)
           ^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 594, in filter_fh_by_metadata
    for filehandler in filehandlers:
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/yaml_reader.py", line 590, in _new_filehandler_instances
    yield filetype_cls(filename, filename_info, filetype_info, *req_fh, **fh_kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/hrit_jma.py", line 278, in __init__
    self.area = self._get_area_def()
                ^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/hrit_jma.py", line 365, in _get_area_def
    "loff": self._get_line_offset(),
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Work/meteo/venv/lib/python3.12/site-packages/satpy/readers/hrit_jma.py", line 349, in _get_line_offset
    loff -= (self.mda["total_no_image_segm"] - segment_number - 1) * nlines
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
OverflowError: Python integer 550 out of bounds for uint8
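
For illustration, here is a minimal sketch of how this kind of overflow can arise and how casting avoids it. The function names, the argument values, and the use of `int()` are assumptions made for this example and are not copied from the satpy code. Under NumPy 2 promotion rules, a Python integer combined with a `uint8` scalar must fit into `uint8`, so multiplying by 550 raises; older NumPy versions may upcast silently instead.

```python
import numpy as np


def line_offset_unsafe(loff, total_segments, segment_number, nlines):
    """Mirror the shape of the failing expression (hypothetical helper)."""
    # With uint8 header values, the product can exceed the uint8 range.
    return loff - (total_segments - segment_number - 1) * nlines


def line_offset_safe(loff, total_segments, segment_number, nlines):
    """Same arithmetic, with header values cast to Python ints first."""
    # Python ints have arbitrary precision, so the product cannot overflow.
    remaining = int(total_segments) - int(segment_number) - 1
    return loff - remaining * int(nlines)


# Assumed example values: 10 segments, first segment, 550 lines per segment.
total = np.uint8(10)
segment = np.uint8(0)

try:
    print("unsafe:", line_offset_unsafe(2750.0, total, segment, 550))
except OverflowError as err:
    # Expected with NumPy 2; NumPy 1.x may not raise here.
    print("unsafe raised:", err)

print("safe:", line_offset_safe(2750.0, total, segment, 550))
```

The safe variant behaves the same regardless of the NumPy version, which is why the cast is applied before the arithmetic rather than after it.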

Oct 15 '24 14:10 k3a

@k3a thanks for your contribution! Do you think you can add a little test for this (to ensure that the bug doesn't come back in the future), and add your name to the Authors files?

Oct 15 '24 16:10 mraspaud

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.08%. Comparing base (8df630d) to head (f36a096). Report is 273 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2930      +/-   ##
==========================================
- Coverage   96.08%   96.08%   -0.01%     
==========================================
  Files         377      377              
  Lines       55134    55125       -9     
==========================================
- Hits        52976    52967       -9     
  Misses       2158     2158

Flag	Coverage Δ
behaviourtests	`3.94% <0.00%> (+<0.01%)`	:arrow_up:
unittests	`96.18% <100.00%> (-0.01%)`	:arrow_down:

Oct 17 '24 07:10 codecov[bot]

Pull Request Test Coverage Report for Build 11632483694

Details

3 of 3 (100.0%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage remained the same at 96.191%

Totals
Change from base Build 11628253798:	0.0%
Covered Lines:	53211
Relevant Lines:	55318

💛 - Coveralls

Oct 17 '24 07:10 coveralls

No new tests seem necessary. I have just updated the _get_mda helper function to convert its values into the proper numpy types according to the dtype specs.

Oct 19 '24 10:10 k3a

If the existing tests didn't fail before your fix, then we need new tests that fail without it and pass with it, unless I'm misunderstanding something.

Oct 19 '24 14:10 djhoese

I don't have an up-to-date environment at hand, but I think the modified test data more closely represents the actual data, so the tests might fail without the changes in the code. I'll see if I have time to create a new env to test with.

Oct 19 '24 14:10 pnuu

I have, of course, checked that the current tests fail without the fix. There are already tests covering the segmented HRIT case, but _get_mda needs to cast its arguments to the proper numpy types to trigger this bug, because real data is parsed into those concrete types, not generic Python ints:

FAILED test_ahi_hrit.py::TestHRITJMAFileHandler::test_get_area_def - OverflowError: Python integer 275 out of bounds for uint8
FAILED test_ahi_hrit.py::TestHRITJMAFileHandler::test_get_dataset - OverflowError: Python integer 275 out of bounds for uint8
FAILED test_ahi_hrit.py::TestHRITJMAFileHandler::test_init - OverflowError: Python integer 11000 out of bounds for uint8
FAILED test_ahi_hrit.py::TestHRITJMAFileHandler::test_mask_space - OverflowError: Python integer 275 out of bounds for uint8
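
As a rough sketch of the idea, a test helper can cast plain Python values to NumPy scalar types according to a dtype spec, so that the test metadata behaves like parsed header data. The `MDA_DTYPES` mapping, its dtype strings, and the `cast_mda` function are hypothetical and only illustrate the approach; they are not the actual `_get_mda` implementation.

```python
import numpy as np

# Hypothetical dtype spec; the real header dtypes may differ.
MDA_DTYPES = {
    "total_no_image_segm": "u1",
    "segment_sequence_number": "u1",
    "number_of_lines": "u2",
    "loff": "f4",
}


def cast_mda(mda, dtypes=MDA_DTYPES):
    """Return a copy of mda with values cast to the NumPy types in dtypes."""
    typed = {}
    for key, value in mda.items():
        spec = dtypes.get(key)
        # Keys without a spec (e.g. strings) are passed through unchanged.
        typed[key] = np.dtype(spec).type(value) if spec is not None else value
    return typed


mda = cast_mda({"total_no_image_segm": 10, "number_of_lines": 275, "platform": "Himawari-8"})
print({key: type(value).__name__ for key, value in mda.items()})
```

With metadata typed this way, arithmetic in the code under test goes through NumPy scalar promotion just as it does with real files, so a regression in the cast would show up in the existing tests.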

Oct 19 '24 14:10 k3a

Closes #2963

Nov 01 '24 16:11 pnuu

Could you merge the Pytroll main to fix the failing tests?

And maybe next time create the PR from a new branch; I can't seem to push the update to your main.

Nov 01 '24 16:11 pnuu

Fix data type when getting a line offset for a segmented hrit_jma