satpy
satpy copied to clipboard
Native resampler to coarsest_area fails for HRV channel due to indivisible chunk sizes
Describe the bug Resampling the SEVIRI HRV channel (1km) into a VISIR channel area (3km) by using Scene.coarsest_area fails.
To Reproduce
from satpy import Scene
from satpy.utils import debug_on
debug_on()
import glob
files = glob.glob("/data/pytroll/testdata/seviri/20200828/H-000-MSG4__-MSG4*202008271200-__")
global_scene = Scene(reader="seviri_l1b_hrit", filenames=files)
global_scene.load(['hrv_clouds'], upper_right_corner='NE')
local_scene = global_scene.resample(global_scene.coarsest_area(), resampler='native')
or, using native data
test_image = '/test/data/path/MSG4-SEVI-MSG15-0100-NA-20210311121243.929000000Z-NA.nat'
scn = Scene(reader="seviri_l1b_native", filenames=[test_image], reader_kwargs={'fill_disk': True})
scn.load(['realistic_colors'], upper_right_corner='NE')
scn_r = scn.resample(scn.coarsest_area(), resampler='native')
Expected behavior Resample the composite (i.e. the HRV channel) to the 3km grid.
Actual results
The following datasets were not created and may require resampling to be generated: DataID(name='realistic_colors')
Traceback (most recent call last):
File "/tcenas/home/andream/code/pycharm_install_pro/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/tcenas/home/andream/code/pycharm_install_pro/pycharm-2020.2/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/tcenas/home/andream/PycharmProjects/dev_src/dev_src/run_EUM_satpy_readers.py", line 310, in <module>
scn_r = scn.resample(scn.coarsest_area(), resampler='native')
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/scene.py", line 817, in resample
reduce_data=reduce_data, **resample_kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/scene.py", line 772, in _resampled_scene
res = resample_dataset(dataset, destination_area, **kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1378, in resample_dataset
new_data = resample(source_area, dataset, destination_area, fill_value=fill_value, **kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1341, in resample
res = resampler_instance.resample(data, **kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 961, in resample
**kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 433, in resample
return self.compute(data, cache_id=cache_id, **kwargs)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1048, in compute
d_arr = self.expand_reduce(data.data, repeats)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 1014, in expand_reduce
return cls.aggregate(d_arr, y_size, x_size)
File "/tcenas/home/andream/code/satpy_latest/satpy/satpy/resample.py", line 976, in aggregate
raise ValueError("Aggregation requires arrays with "
ValueError: Aggregation requires arrays with shapes and chunks divisible by the factor
Additional context
There's currently a bug in Satpy master that reverts the coarsest/finest logic for the original upside-down left-right flipped SEVIRI data. Using the upper_right_corner='NE'
load kwarg, the image is flipped upright and the bug is avoided. Without the kwarg, this issue is replicated by using finest_area
instead of coarsest_area
(wrong behaviour). See https://github.com/pytroll/satpy/pull/1596
The error comes from here: https://github.com/pytroll/satpy/blob/c74c396d402d56a6fdf29e141584554a531f9bc6/satpy/resample.py#L969-L974
The chunk sizes are for HRIT:
>>> global_scene['HRV'].chunks
((464, 464, 464, 464, 464, 464, 225, 239, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464, 464), (1472, 1235, 2861, 2707, 2861))
>>> global_scene['IR_108'].chunks
((464, 464, 464, 464, 464, 464, 464, 464), (3712,))
for native:
scn['HRV'].chunks
Out[2]: ((3711, 385, 3326, 770, 64, 2880), (2863, 2705, 2863, 1233, 1472))
and some are not divisible by 3 (agg_size
).
@djhoese @TAlonglong FYI
Thank you for investigating on this @ameraner ! For the hrit, iirc it is implemented so that a chunks should be exactly one file in principle, and for hrv the file containing the upper-lower window split is a bit different. So maybe it's as easy as rechunking to make things compatible between hrv and non-hrv? eg rechunking the disk-filled hrv to (4643, 37123)?
OK, I did some playing with this with HRIT data and I got it to work with this diff:
diff --git a/satpy/resample.py b/satpy/resample.py
index 6cf7d009..15e20172 100644
--- a/satpy/resample.py
+++ b/satpy/resample.py
@@ -968,6 +968,9 @@ class NativeResampler(BaseResampler):
# it should be reshaped to do the averaging.
raise ValueError("Can't aggregrate (reduce) data arrays with "
"more than 2 dimensions.")
+ if d.shape == (11136, 11136):
+ print("Is hrv")
+ d = d.rechunk((5568, 5568))
if not (x_size.is_integer() and y_size.is_integer()):
raise ValueError("Aggregation factors are not integers")
for agg_size, chunks in zip([y_size, x_size], d.chunks):
So indeed a hack. Not sure if my output is correct, need to check further. So I'm not sure this will solve our problem.
Is it possible to know the chunk size of the non-HRV channels inside the file handler (or reader) and rechunk the data relative to that?
in the reader maybe. Given that non-HRV channels are loaded at the same time as the HRV...
This also affects FCI:
import hdf5plugin
import pathlib
import os
from satpy import Scene
from glob import glob
bd = pathlib.Path("/media/nas/x21308/MTG_test_data/2022_05_MTG_Testdata/RC0040/")
sc = Scene(filenames=[os.fspath(f) for f in bd.glob("*BODY*.nc")],reader="fci_l1c_nc")
sc.load(["vis_04", "ir_133", "true_color_raw", "convection"], upper_right_corner="NE")
ls = sc.resample(sc.coarsest_area(), resampler="native")
fails with
Traceback (most recent call last):
File "/data/gholl/checkouts/protocode/fci-resample.py", line 9, in <module>
ls = sc.resample(sc.coarsest_area(), resampler="native")
File "/data/gholl/checkouts/satpy/satpy/scene.py", line 955, in resample
self._resampled_scene(new_scn, destination, resampler=resampler,
File "/data/gholl/checkouts/satpy/satpy/scene.py", line 871, in _resampled_scene
res = resample_dataset(dataset, destination_area, **kwargs)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1418, in resample_dataset
new_data = resample(source_area, dataset, destination_area, fill_value=fill_value, **kwargs)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1381, in resample
res = resampler_instance.resample(data, **kwargs)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 981, in resample
return super(NativeResampler, self).resample(data,
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 439, in resample
return self.compute(data, cache_id=cache_id, **kwargs)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1084, in compute
d_arr = self._expand_reduce(data.data, repeats)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 1051, in _expand_reduce
return cls._aggregate(d_arr, y_size, x_size)
File "/data/gholl/checkouts/satpy/satpy/resample.py", line 999, in _aggregate
raise ValueError("Aggregation requires arrays with "
ValueError: Aggregation requires arrays with shapes and chunks divisible by the factor
I have not tested if it also affects ABI.
I just fixed some of these issues in #2291. If the math works for the overall array shapes then the arrays are rechunked as necessary. I think that means this is resolved and should work as expected if I'm understanding the issue completely.
I'm going to close this as I think it is fixed. Please reopen if it is still an issue.