satpy icon indicating copy to clipboard operation
satpy copied to clipboard

Native resampler to coarsest_area fails for HRV channel due to indivisible chunk sizes

Open ameraner opened this issue 3 years ago • 6 comments

Describe the bug Resampling the SEVIRI HRV channel (1km) into a VISIR channel area (3km) by using Scene.coarsest_area fails.

To Reproduce

from satpy import Scene
from satpy.utils import debug_on
debug_on()
import glob
files = glob.glob("/data/pytroll/testdata/seviri/20200828/H-000-MSG4__-MSG4*202008271200-__")
global_scene = Scene(reader="seviri_l1b_hrit", filenames=files)
global_scene.load(['hrv_clouds'], upper_right_corner='NE')
local_scene = global_scene.resample(global_scene.coarsest_area(), resampler='native')

or, using native data

test_image = '/test/data/path/MSG4-SEVI-MSG15-0100-NA-20210311121243.929000000Z-NA.nat'
scn = Scene(reader="seviri_l1b_native", filenames=[test_image], reader_kwargs={'fill_disk': True})
scn.load(['realistic_colors'], upper_right_corner='NE')
scn_r = scn.resample(scn.coarsest_area(), resampler='native')

Expected behavior Resample the composite (i.e. the HRV channel) to the 3km grid.

Actual results

The following datasets were not created and may require resampling to be generated: DataID(name='realistic_colors')
Traceback (most recent call last):
  File "/tcenas/home/andream/code/pycharm_install_pro/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/tcenas/home/andream/code/pycharm_install_pro/pycharm-2020.2/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/tcenas/home/andream/PycharmProjects/dev_src/dev_src/run_EUM_satpy_readers.py", line 310, in <module>
    scn_r = scn.resample(scn.coarsest_area(), resampler='native')
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/scene.py", line 817, in resample
    reduce_data=reduce_data, **resample_kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/scene.py", line 772, in _resampled_scene
    res = resample_dataset(dataset, destination_area, **kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1378, in resample_dataset
    new_data = resample(source_area, dataset, destination_area, fill_value=fill_value, **kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1341, in resample
    res = resampler_instance.resample(data, **kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 961, in resample
    **kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 433, in resample
    return self.compute(data, cache_id=cache_id, **kwargs)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1048, in compute
    d_arr = self.expand_reduce(data.data, repeats)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1014, in expand_reduce
    return cls.aggregate(d_arr, y_size, x_size)
  File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 976, in aggregate
    raise ValueError("Aggregation requires arrays with "
ValueError: Aggregation requires arrays with shapes and chunks divisible by the factor

Additional context

There's currently a bug in Satpy master that reverts the coarsest/finest logic for the original upside-down left-right flipped SEVIRI data. Using the upper_right_corner='NE' load kwarg, the image is flipped upright and the bug is avoided. Without the kwarg, this issue is replicated by using finest_area instead of coarsest_area (wrong behaviour). See https://github.com/pytroll/satpy/pull/1596

ameraner avatar Mar 12 '21 11:03 ameraner

The error comes from here: https://github.com/pytroll/satpy/blob/c74c396d402d56a6fdf29e141584554a531f9bc6/satpy/resample.py#L969-L974

The chunk sizes are for HRIT:

>>> global_scene['HRV'].chunks
((464, 464, 464, 464, 464, 464, 225, 239, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464), (1472, 1235, 2861, 2707, 2861))
>>> global_scene['IR_108'].chunks
((464, 464, 464, 464, 464, 464, 464, 464), (3712,))

for native:

scn['HRV'].chunks
Out[2]: ((3711, 385, 3326, 770, 64, 2880), (2863, 2705, 2863, 1233, 1472))

and some are not divisible by 3 (agg_size).

@djhoese @TAlonglong FYI

ameraner avatar Mar 12 '21 11:03 ameraner

Thank you for investigating on this @ameraner ! For the hrit, iirc it is implemented so that a chunks should be exactly one file in principle, and for hrv the file containing the upper-lower window split is a bit different. So maybe it's as easy as rechunking to make things compatible between hrv and non-hrv? eg rechunking the disk-filled hrv to (4643, 37123)?

mraspaud avatar Mar 12 '21 12:03 mraspaud

OK, I did some playing with this with HRIT data and I got it to work with this diff:

diff --git a/satpy/resample.py b/satpy/resample.py
index 6cf7d009..15e20172 100644
--- a/satpy/resample.py
+++ b/satpy/resample.py
@@ -968,6 +968,9 @@ class NativeResampler(BaseResampler):
             # it should be reshaped to do the averaging.
             raise ValueError("Can't aggregrate (reduce) data arrays with "
                              "more than 2 dimensions.")
+        if d.shape == (11136, 11136):
+            print("Is hrv")
+            d = d.rechunk((5568, 5568))
         if not (x_size.is_integer() and y_size.is_integer()):
             raise ValueError("Aggregation factors are not integers")
         for agg_size, chunks in zip([y_size, x_size], d.chunks):

So indeed a hack. Not sure if my output is correct, need to check further. So I'm not sure this will solve our problem.

TAlonglong avatar Mar 13 '21 16:03 TAlonglong

Is it possible to know the chunk size of the non-HRV channels inside the file handler (or reader) and rechunk the data relative to that?

djhoese avatar Mar 13 '21 18:03 djhoese

in the reader maybe. Given that non-HRV channels are loaded at the same time as the HRV...

mraspaud avatar Mar 15 '21 10:03 mraspaud

This also affects FCI:

import hdf5plugin
import pathlib
import os
from satpy import Scene
from glob import glob
bd = pathlib.Path("/media/nas/x21308/MTG_test_data/2022_05_MTG_Testdata/RC0040/")
sc = Scene(filenames=[os.fspath(f) for f in bd.glob("*BODY*.nc")],reader="fci_l1c_nc")
sc.load(["vis_04", "ir_133", "true_color_raw", "convection"], upper_right_corner="NE")
ls = sc.resample(sc.coarsest_area(), resampler="native")

fails with

Traceback (most recent call last):
  File "/data/gholl/checkouts/protocode/fci-resample.py", line 9, in <module>
    ls = sc.resample(sc.coarsest_area(), resampler="native")
  File "/data/gholl/checkouts/satpy/satpy/scene.py", line 955, in resample
    self._resampled_scene(new_scn, destination, resampler=resampler,
  File "/data/gholl/checkouts/satpy/satpy/scene.py", line 871, in _resampled_scene
    res = resample_dataset(dataset, destination_area, **kwargs)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1418, in resample_dataset
    new_data = resample(source_area, dataset, destination_area, fill_value=fill_value, **kwargs)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1381, in resample
    res = resampler_instance.resample(data, **kwargs)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 981, in resample
    return super(NativeResampler, self).resample(data,
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 439, in resample
    return self.compute(data, cache_id=cache_id, **kwargs)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1084, in compute
    d_arr = self._expand_reduce(data.data, repeats)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1051, in _expand_reduce
    return cls._aggregate(d_arr, y_size, x_size)
  File "/data/gholl/checkouts/satpy/satpy/resample.py", line 999, in _aggregate
    raise ValueError("Aggregation requires arrays with "
ValueError: Aggregation requires arrays with shapes and chunks divisible by the factor

I have not tested if it also affects ABI.

gerritholl avatar Jul 29 '22 16:07 gerritholl

I just fixed some of these issues in #2291. If the math works for the overall array shapes then the arrays are rechunked as necessary. I think that means this is resolved and should work as expected if I'm understanding the issue completely.

I'm going to close this as I think it is fixed. Please reopen if it is still an issue.

djhoese avatar Dec 05 '22 14:12 djhoese