early compute in Filler and co if coordinates are daskified

Open gerritholl opened this issue 2 months ago • 3 comments

Describe the bug

If we pass DataArrays with daskified (dask-backed) coordinates to Filler, MultiFiller, FillingCompositor and similar compositors, this triggers a dask compute during the compositor call.

To Reproduce

import xarray as xr
import dask.config
import dask.array as da
from satpy.composites.fill import Filler
from satpy import DataQuery
from satpy.area import get_area_def
from satpy.tests.utils import CustomScheduler

comp = Filler("mfc", ["prim", "sec"])
ar = get_area_def("eurol")
(lons, lats) = ar.get_lonlats(chunks=ar.shape)
bb = xr.DataArray(da.zeros(shape=ar.shape), dims=("y", "x"), attrs={"area": ar}, coords={"longitude": (("y", "x"), lons), "latitude": (("y", "x"), lats), "crs": ar.crs})
(lons, lats) = ar.get_lonlats(chunks=(4096, 4096))
aa = xr.DataArray(da.ones(shape=ar.shape), dims=("y", "x"), attrs={"area": ar}, coords={"longitude": (("y", "x"), lons), "latitude": (("y", "x"), lats), "crs": ar.crs})
with dask.config.set(scheduler=CustomScheduler(max_computes=0)):
    comp([aa, bb])

Expected behavior

I expect no compute.

Actual results

Failure with RuntimeError: Too many dask computations were scheduled: 1:

Traceback (most recent call last):
  File "/home/gholl/checkouts/protocode/mwe/multifiller-compute.py", line 16, in <module>
    comp([aa, bb])
    ~~~~^^^^^^^^^^
  File "/home/gholl/checkouts/satpy/satpy/composites/fill.py", line 58, in __call__
    filled_projectable = projectables[0].fillna(projectables[1])
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/core/dataarray.py", line 3552, in fillna
    out = ops.fillna(self, value)
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/computation/ops.py", line 152, in fillna
    return apply_ufunc(
        duck_array_ops.fillna,
    ...<6 lines>...
        keep_attrs=True,
    )
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 1267, in apply_ufunc
    return apply_dataarray_vfunc(
        variables_vfunc,
    ...<4 lines>...
        keep_attrs=keep_attrs,
    )
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 305, in apply_dataarray_vfunc
    result_coords, result_indexes = build_output_coords_and_indexes(
                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        args, signature, exclude_dims, combine_attrs=keep_attrs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 250, in build_output_coords_and_indexes
    merged_vars, merged_indexes = merge_coordinates_without_align(
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        coords_list, exclude_dims=exclude_dims, combine_attrs=combine_attrs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/structure/merge.py", line 458, in merge_coordinates_without_align
    merged_coords, merged_indexes = merge_collected(
                                    ~~~~~~~~~~~~~~~^
        filtered, prioritized, combine_attrs=combine_attrs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/structure/merge.py", line 315, in merge_collected
    equals_this_var, merged_vars[name] = unique_variable(
                                         ~~~~~~~~~~~~~~~^
        name, variables, compat, equals.get(name)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/structure/merge.py", line 161, in unique_variable
    out = out.compute()
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/core/variable.py", line 1077, in compute
    return new.load(**kwargs)
           ~~~~~~~~^^^^^^^^^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/core/variable.py", line 1011, in load
    self._data = to_duck_array(self._data, **kwargs)
                 ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/namedarray/pycompat.py", line 139, in to_duck_array
    loaded_data, *_ = chunkmanager.compute(data, **kwargs)  # type: ignore[var-annotated]
                      ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/xarray/namedarray/daskmanager.py", line 85, in compute
    return compute(*data, **kwargs)  # type: ignore[no-untyped-call, no-any-return]
  File "/home/gholl/miniforge3/envs/py313/lib/python3.13/site-packages/dask/base.py", line 681, in compute
    results = schedule(expr, keys, **kwargs)
  File "/home/gholl/checkouts/satpy/satpy/tests/utils.py", line 293, in __call__
    raise RuntimeError("Too many dask computations were scheduled: "
                       "{}".format(self.total_computes))
RuntimeError: Too many dask computations were scheduled: 1
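
Reading the traceback, the compute appears to come from xarray's coordinate merge (`unique_variable` in `xarray/structure/merge.py`) rather than from satpy itself. The following is a minimal sketch that tries to show the same effect with xarray and dask only. It assumes that the lazy comparison succeeds only when both coordinates are the same dask graph; `CountingScheduler`, `with_lon` and `count_computes` are hypothetical helpers written for this illustration, loosely modelled on satpy's `CustomScheduler`.

```python
import dask
import dask.array as da
import numpy as np
import xarray as xr


class CountingScheduler:
    """Hypothetical helper: count how often dask is asked to compute."""

    def __init__(self):
        self.count = 0

    def __call__(self, dsk, keys, **kwargs):
        self.count += 1
        # Delegate to dask's synchronous scheduler.
        return dask.get(dsk, keys, **kwargs)


def with_lon(data, lon):
    """Wrap data in a DataArray with a 2-D dask "longitude" coordinate."""
    return xr.DataArray(data, dims=("y", "x"),
                        coords={"longitude": (("y", "x"), lon)})


def count_computes(left, right):
    """Return the number of computes scheduled while calling fillna."""
    scheduler = CountingScheduler()
    with dask.config.set(scheduler=scheduler):
        left.fillna(right)
    return scheduler.count


values = np.arange(16.0).reshape(4, 4)
lon_a = da.from_array(values, chunks=(4, 4))
# Same values, different chunking, hence a different dask graph.
lon_b = da.from_array(values, chunks=(2, 2))

primary = da.ones((4, 4), chunks=(4, 4))
secondary = da.zeros((4, 4), chunks=(4, 4))

# Both operands share the very same dask coordinate array.
same_graph = count_computes(with_lon(primary, lon_a),
                            with_lon(secondary, lon_a))
# The coordinates are equal in value but come from different graphs.
different_graph = count_computes(with_lon(primary, lon_a),
                                 with_lon(secondary, lon_b))
print(same_graph, different_graph)
```

If the assumption holds, the first call schedules no compute, while the second one schedules at least one, because xarray has to load the coordinate values to decide whether they are equal.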

Environment Info:

  • Satpy Version: v0.58.0-151-g6d843f67a (main branch)

Additional context

One way to get DataArrays with daskified coordinates is when writing NetCDF files using the satpy cf writer with include_lonlats=True (the default), then reading them with the satpy cf reader — the DataArray will have lat/lon coordinates as dask arrays.
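
For such DataArrays, one possible user-side workaround at the plain xarray level is to drop the duplicated non-index coordinate from the second operand before filling, so that xarray has nothing to compare. This is only a sketch, not a change proposed for satpy: it assumes the caller already knows that both arrays are on the same grid, and `CountingScheduler` is again a hypothetical helper written for this illustration.

```python
import dask
import dask.array as da
import numpy as np
import xarray as xr


class CountingScheduler:
    """Hypothetical helper: count how often dask is asked to compute."""

    def __init__(self):
        self.count = 0

    def __call__(self, dsk, keys, **kwargs):
        self.count += 1
        # Delegate to dask's synchronous scheduler.
        return dask.get(dsk, keys, **kwargs)


values = np.arange(16.0).reshape(4, 4)
# Keep even values, mask odd values with NaN so there is something to fill.
primary_np = np.where(values % 2 == 0, values, np.nan)

primary = xr.DataArray(
    da.from_array(primary_np, chunks=(4, 4)),
    dims=("y", "x"),
    coords={"longitude": (("y", "x"), da.from_array(values, chunks=(4, 4)))},
)
secondary = xr.DataArray(
    da.zeros((4, 4), chunks=(4, 4)),
    dims=("y", "x"),
    # Same coordinate values, but chunked differently.
    coords={"longitude": (("y", "x"), da.from_array(values, chunks=(2, 2)))},
)

scheduler = CountingScheduler()
with dask.config.set(scheduler=scheduler):
    # Drop the coordinate from the second operand only; the result
    # keeps the coordinate of the first operand.
    filled = primary.fillna(secondary.drop_vars("longitude"))
computes_during_fill = scheduler.count

# Compute the result outside the counting context to check the values.
result = filled.values
print(computes_during_fill)
```

The trade-off is that xarray no longer checks that the coordinates of the two inputs agree, so this only makes sense where that has been established by other means.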

gerritholl, Oct 20 '25 16:10

Do I remember correctly that this was, at least in the past, what xarray did when the coordinates were dask arrays rather than NumPy arrays? 🤔

pnuu, Oct 20 '25 16:10

Yes, that's what I remember.

djhoese, Oct 20 '25 16:10

Maybe not much we can do about it in satpy...

gerritholl, Oct 21 '25 12:10