datacube-core
dc.load with a dask.distributed.Client fails with a mysterious pickle error
I'm doing a fairly simple load, like this:

```python
data = dc.load(
    product="s2_l2a",
    measurements=["red", "green", "blue"],
    output_crs="EPSG:32750",
    resolution=10,
    time=("2025-03-01", "2025-03-14"),
    x=(float(bbox.left), float(bbox.right)),
    y=(float(bbox.bottom), float(bbox.top)),
    dask_chunks={},
    group_by="solar_day",
)
data
```
and it's failing with the error below. If I remove dask and load directly, it works fine. It also works fine using dask without the dask.distributed.Client.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py:60, in dumps(x, buffer_callback, protocol)
59 try:
---> 60 result = pickle.dumps(x, **dump_kwargs)
61 except Exception:
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py:65, in dumps(x, buffer_callback, protocol)
64 buffers.clear()
---> 65 pickler.dump(x)
66 result = f.getvalue()
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py:366, in serialize(x, serializers, on_error, context, iterate_collection)
365 try:
--> 366 header, frames = dumps(x, context=context) if wants_context else dumps(x)
367 header["serializer"] = name
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py:78, in pickle_dumps(x, context)
76 writeable.append(not f.readonly)
---> 78 frames[0] = pickle.dumps(
79 x,
80 buffer_callback=buffer_callback,
81 protocol=context.get("pickle-protocol", None) if context else None,
82 )
83 header = {
84 "serializer": "pickle",
85 "writeable": tuple(writeable),
86 }
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py:77, in dumps(x, buffer_callback, protocol)
76 buffers.clear()
---> 77 result = cloudpickle.dumps(x, **dump_kwargs)
78 except Exception:
File /opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1537, in dumps(obj, protocol, buffer_callback)
1536 cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1537 cp.dump(obj)
1538 return file.getvalue()
File /opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1303, in Pickler.dump(self, obj)
1302 try:
-> 1303 return super().dump(obj)
1304 except RuntimeError as e:
TypeError: cannot pickle 'weakref.ReferenceType' object
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 data = data.compute()
File /opt/homebrew/lib/python3.11/site-packages/xarray/core/dataset.py:714, in Dataset.compute(self, **kwargs)
690 """Manually trigger loading and/or computation of this dataset's data
691 from disk or a remote source into memory and return a new dataset.
692 Unlike load, the original dataset is left unaltered.
(...)
711 dask.compute
712 """
713 new = self.copy(deep=False)
--> 714 return new.load(**kwargs)
File /opt/homebrew/lib/python3.11/site-packages/xarray/core/dataset.py:541, in Dataset.load(self, **kwargs)
538 chunkmanager = get_chunked_array_type(*lazy_data.values())
540 # evaluate all the chunked arrays simultaneously
--> 541 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
542 *lazy_data.values(), **kwargs
543 )
545 for k, data in zip(lazy_data, evaluated_data, strict=False):
546 self.variables[k].data = data
File /opt/homebrew/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py:85, in DaskManager.compute(self, *data, **kwargs)
80 def compute(
81 self, *data: Any, **kwargs: Any
82 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
83 from dask.array import compute
---> 85 return compute(*data, **kwargs)
File /opt/homebrew/lib/python3.11/site-packages/dask/base.py:681, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
678 expr = expr.optimize()
679 keys = list(flatten(expr.__dask_keys__()))
--> 681 results = schedule(expr, keys, **kwargs)
683 return repack(results)
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py:284, in serialize(x, serializers, on_error, context, iterate_collection)
282 return x.header, x.frames
283 if isinstance(x, Serialize):
--> 284 return serialize(
285 x.data,
286 serializers=serializers,
287 on_error=on_error,
288 context=context,
289 iterate_collection=True,
290 )
292 # Note: don't use isinstance(), as it would match subclasses
293 # (e.g. namedtuple, defaultdict) which however would revert to the base class on a
294 # round-trip through msgpack
295 if iterate_collection is None and type(x) in (list, set, tuple, dict):
File /opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py:391, in serialize(x, serializers, on_error, context, iterate_collection)
389 str_x = str(x)[:10000]
390 except Exception:
--> 391 raise TypeError(msg) from exc
392 raise TypeError(msg, str_x) from exc
393 else: # pragma: nocover
TypeError: Could not serialize object of type _HLGExprSequence
Software versions are:
Datacube: 1.9.2
Dask: 2025.5.0
Dask distributed: 2025.5.0
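For what it's worth, the root `Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'` is stock CPython behaviour: the pickler serializes functions by importable qualified name, and a lambda defined inside a method has none. A datacube-free sketch of the same failure (the `Mapper` name here just mirrors the one in the traceback; it is not datacube's class):

```python
import pickle

class Mapper:
    def __init__(self):
        # A lambda created inside __init__ is a "local object": pickle can
        # only serialize functions it can re-import by qualified name, and
        # 'Mapper.__init__.<locals>.<lambda>' has no importable name.
        self._fn = lambda x: x

try:
    pickle.dumps(Mapper())
except AttributeError as exc:
    # e.g. "Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'"
    # (Python 3.12 words it "Can't get local object ...")
    print(exc)
```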
I'm pretty sure this is one of two known Dask issues in 1.9.[0-2]: one was fixed in 1.9.3, and the other is fixed in develop and will be included in the 1.9.4 release, which I will hopefully get out soonish.
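This also squares with the report that plain dask works but the distributed Client doesn't: in-process threads share memory, so the task graph is never serialized; only the Client has to pickle it to ship to workers. A stdlib-only sketch of that asymmetry (names are illustrative, not datacube's internals):

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

class Mapper:
    """Stand-in for an object that holds an unpicklable local lambda."""
    def __init__(self):
        self._fn = lambda x: x * 2  # local lambda: unpicklable

    def apply(self, x):
        return self._fn(x)

m = Mapper()

# In-process threads (analogous to dask's default threaded scheduler)
# never serialize the task, so the lambda is no obstacle:
with ThreadPoolExecutor(max_workers=2) as pool:
    print(pool.submit(m.apply, 21).result())  # 42

# A distributed worker would need a pickled copy, which is where it breaks:
try:
    pickle.dumps(m)
except AttributeError as exc:
    print("serialization fails:", exc)
```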
Almost certainly fixed by this: https://github.com/opendatacube/datacube-core/pull/1776 (which is not released yet)
Fixed in 1.9.3
This is still causing issues with a fairly simple load using Dask. Versions are as follows:
Python : 3.11.11
datacube : 1.9.6
dask : 2025.7.0
distributed : 2025.7.0
A minimal reproducible example is as follows (it should work with any product):

```python
from datacube import Datacube
from dask.distributed import Client

dc = Datacube(app="test")
client = Client(threads_per_worker=8, n_workers=2)

datasets = dc.find_datasets(product="ls9_c2l2_sr", limit=1)

data = dc.load(
    datasets=datasets,
    measurements=["red"],
    output_crs="EPSG:4326",
    resolution=0.0001,
    dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
)
data.compute()
```
Stack trace:
/opt/homebrew/lib/python3.11/site-packages/distributed/node.py:187: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 57294 instead
warnings.warn(
Querying product Product(name='ls9_c2l2_sr', id_=1)
--- Logging error ---
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 63, in dumps
result = pickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 68, in dumps
pickler.dump(x)
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 80, in dumps
result = cloudpickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
File "/opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'weakref.ReferenceType' object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/logging/__init__.py", line 1110, in emit
msg = self.format(record)
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/logging/__init__.py", line 953, in format
return fmt.format(record)
^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/logging/__init__.py", line 687, in format
record.message = record.getMessage()
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/logging/__init__.py", line 377, in getMessage
msg = msg % self.args
~~~~^~~~~~~~~~~
File "/opt/homebrew/lib/python3.11/site-packages/dask/_expr.py", line 90, in __str__
s = ", ".join(self._operands_for_repr())
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/dask/_expr.py", line 1097, in _operands_for_repr
f"name={self.operand('name')!r}",
^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/dask/_expr.py", line 224, in operand
return self.operands[type(self)._parameters.index(key)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: 'name' is not in list
Call stack:
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel_launcher.py", line 18, in <module>
app.launch_new_instance()
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/traitlets/config/application.py", line 1075, in launch_instance
app.start()
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/kernelapp.py", line 739, in start
self.io_loop.start()
File "/opt/homebrew/lib/python3.11/site-packages/tornado/platform/asyncio.py", line 205, in start
self.asyncio_loop.run_forever()
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/opt/homebrew/Cellar/[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/kernelbase.py", line 545, in dispatch_queue
await self.process_one()
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/kernelbase.py", line 534, in process_one
await dispatch(*args)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
await result
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/ipkernel.py", line 362, in execute_request
await super().execute_request(stream, ident, parent)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/kernelbase.py", line 778, in execute_request
reply_content = await reply_content
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/ipkernel.py", line 449, in do_execute
res = shell.run_cell(
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
return super().run_cell(*args, **kwargs)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/interactiveshell.py", line 3047, in run_cell
result = self._run_cell(
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/interactiveshell.py", line 3102, in _run_cell
result = runner(coro)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/interactiveshell.py", line 3306, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/interactiveshell.py", line 3489, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/Users/alex/Library/Python/3.11/lib/python/site-packages/IPython/core/interactiveshell.py", line 3549, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/var/folders/0h/8_60bs_52tx9q2mdgszb37gw0000gn/T/ipykernel_79213/605807411.py", line 21, in <module>
data.compute()
File "/opt/homebrew/lib/python3.11/site-packages/xarray/core/dataset.py", line 714, in compute
return new.load(**kwargs)
File "/opt/homebrew/lib/python3.11/site-packages/xarray/core/dataset.py", line 541, in load
evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
File "/opt/homebrew/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py", line 85, in compute
return compute(*data, **kwargs) # type: ignore[no-untyped-call, no-any-return]
File "/opt/homebrew/lib/python3.11/site-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
File "/opt/homebrew/lib/python3.11/site-packages/distributed/client.py", line 3475, in get
futures = self._graph_to_futures(
File "/opt/homebrew/lib/python3.11/site-packages/distributed/client.py", line 3365, in _graph_to_futures
expr_ser = Serialized(*serialize(to_serialize(expr), on_error="raise"))
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py", line 284, in serialize
return serialize(
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py", line 366, in serialize
header, frames = dumps(x, context=context) if wants_context else dumps(x)
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/serialize.py", line 78, in pickle_dumps
frames[0] = pickle.dumps(
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 82, in dumps
logger.exception("Failed to serialize %s.", x)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
Tried using older dask and distributed (2024.6.0) and got this error:
Querying product Product(name='ls9_c2l2_sr', id_=1)
2025-07-30 12:35:45,746 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 1 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x140f37090>
0. 5381305280
>.
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 63, in dumps
result = pickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 68, in dumps
pickler.dump(x)
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 81, in dumps
result = cloudpickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
File "/opt/homebrew/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'weakref.ReferenceType' object
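Note the two-stage failure in the trace above: stdlib pickle chokes on the local lambda, then distributed falls back to cloudpickle, which can handle lambdas but still cannot serialize a weakref somewhere in the graph. The weakref half is also plain CPython behaviour, as this stdlib-only sketch shows:

```python
import pickle
import weakref

class Node:
    pass

node = Node()
ref = weakref.ref(node)

# Weak references are explicitly unpicklable: a weakref would be
# meaningless in another process, so the pickler refuses them outright.
try:
    pickle.dumps(ref)
except TypeError as exc:
    print(exc)  # e.g. cannot pickle 'weakref.ReferenceType' object
```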
Reproduced in another environment with the following versions:
Python : 3.12.10
datacube : 1.9.6
dask : 2025.5.1
distributed : 2025.5.1
Traceback:
/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/node.py:187: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43519 instead
warnings.warn(
Querying product Product(name='ls9_c2l2_sr', id_=1)
--- Logging error ---
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py", line 60, in dumps
result = pickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py", line 65, in dumps
pickler.dump(x)
AttributeError: Can't get local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py", line 77, in dumps
result = cloudpickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'weakref.ReferenceType' object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.12/logging/__init__.py", line 1160, in emit
msg = self.format(record)
^^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/logging/__init__.py", line 999, in format
return fmt.format(record)
^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/logging/__init__.py", line 703, in format
record.message = record.getMessage()
^^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/logging/__init__.py", line 392, in getMessage
msg = msg % self.args
~~~~^~~~~~~~~~~
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/dask/_expr.py", line 90, in __str__
s = ", ".join(self._operands_for_repr())
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/dask/_expr.py", line 1094, in _operands_for_repr
f"name={self.operand('name')!r}",
^^^^^^^^^^^^^^^^^^^^
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/dask/_expr.py", line 221, in operand
return self.operands[type(self)._parameters.index(key)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: 'name' is not in list
Call stack:
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel_launcher.py", line 18, in <module>
app.launch_new_instance()
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/traitlets/config/application.py", line 1075, in launch_instance
app.start()
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/kernelapp.py", line 739, in start
self.io_loop.start()
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/tornado/platform/asyncio.py", line 211, in start
self.asyncio_loop.run_forever()
File "/srv/conda/envs/notebook/lib/python3.12/asyncio/base_events.py", line 645, in run_forever
self._run_once()
File "/srv/conda/envs/notebook/lib/python3.12/asyncio/base_events.py", line 1999, in _run_once
handle._run()
File "/srv/conda/envs/notebook/lib/python3.12/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/kernelbase.py", line 545, in dispatch_queue
await self.process_one()
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/kernelbase.py", line 534, in process_one
await dispatch(*args)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
await result
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/ipkernel.py", line 362, in execute_request
await super().execute_request(stream, ident, parent)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/kernelbase.py", line 778, in execute_request
reply_content = await reply_content
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/ipkernel.py", line 449, in do_execute
res = shell.run_cell(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
return super().run_cell(*args, **kwargs)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3046, in run_cell
result = self._run_cell(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3101, in _run_cell
result = runner(coro)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
coro.send(None)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3306, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3488, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3548, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_110/901322381.py", line 21, in <module>
data.compute()
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/core/dataset.py", line 714, in compute
return new.load(**kwargs)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/core/dataset.py", line 541, in load
evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py", line 85, in compute
return compute(*data, **kwargs) # type: ignore[no-untyped-call, no-any-return]
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/client.py", line 3467, in get
futures = self._graph_to_futures(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/client.py", line 3357, in _graph_to_futures
expr_ser = Serialized(*serialize(to_serialize(expr), on_error="raise"))
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py", line 284, in serialize
return serialize(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py", line 366, in serialize
header, frames = dumps(x, context=context) if wants_context else dumps(x)
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py", line 78, in pickle_dumps
frames[0] = pickle.dumps(
File "/srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py", line 79, in dumps
logger.exception("Failed to serialize %s.", x)
logger.exception("Failed to serialize %s.", x)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py:60, in dumps(x, buffer_callback, protocol)
59 try:
---> 60 result = pickle.dumps(x, **dump_kwargs)
61 except Exception:
AttributeError: Can't get local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py:65, in dumps(x, buffer_callback, protocol)
64 buffers.clear()
---> 65 pickler.dump(x)
66 result = f.getvalue()
AttributeError: Can't get local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py:366, in serialize(x, serializers, on_error, context, iterate_collection)
365 try:
--> 366 header, frames = dumps(x, context=context) if wants_context else dumps(x)
367 header["serializer"] = name
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py:78, in pickle_dumps(x, context)
76 writeable.append(not f.readonly)
---> 78 frames[0] = pickle.dumps(
79 x,
80 buffer_callback=buffer_callback,
81 protocol=context.get("pickle-protocol", None) if context else None,
82 )
83 header = {
84 "serializer": "pickle",
85 "writeable": tuple(writeable),
86 }
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/pickle.py:77, in dumps(x, buffer_callback, protocol)
76 buffers.clear()
---> 77 result = cloudpickle.dumps(x, **dump_kwargs)
78 except Exception:
File /srv/conda/envs/notebook/lib/python3.12/site-packages/cloudpickle/cloudpickle.py:1537, in dumps(obj, protocol, buffer_callback)
1536 cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1537 cp.dump(obj)
1538 return file.getvalue()
File /srv/conda/envs/notebook/lib/python3.12/site-packages/cloudpickle/cloudpickle.py:1303, in Pickler.dump(self, obj)
1302 try:
-> 1303 return super().dump(obj)
1304 except RuntimeError as e:
TypeError: cannot pickle 'weakref.ReferenceType' object
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Cell In[1], line 21
8 datasets = dc.find_datasets(
9 product="ls9_c2l2_sr",
10 limit=1
11 )
13 data = dc.load(
14 datasets=datasets,
15 measurements=["red"],
(...)
18 dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
19 )
---> 21 data.compute()
File /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/core/dataset.py:714, in Dataset.compute(self, **kwargs)
690 """Manually trigger loading and/or computation of this dataset's data
691 from disk or a remote source into memory and return a new dataset.
692 Unlike load, the original dataset is left unaltered.
(...)
711 dask.compute
712 """
713 new = self.copy(deep=False)
--> 714 return new.load(**kwargs)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/core/dataset.py:541, in Dataset.load(self, **kwargs)
538 chunkmanager = get_chunked_array_type(*lazy_data.values())
540 # evaluate all the chunked arrays simultaneously
--> 541 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
542 *lazy_data.values(), **kwargs
543 )
545 for k, data in zip(lazy_data, evaluated_data, strict=False):
546 self.variables[k].data = data
File /srv/conda/envs/notebook/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py:85, in DaskManager.compute(self, *data, **kwargs)
80 def compute(
81 self, *data: Any, **kwargs: Any
82 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
83 from dask.array import compute
---> 85 return compute(*data, **kwargs)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/dask/base.py:681, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
678 expr = expr.optimize()
679 keys = list(flatten(expr.__dask_keys__()))
--> 681 results = schedule(expr, keys, **kwargs)
683 return repack(results)
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py:284, in serialize(x, serializers, on_error, context, iterate_collection)
282 return x.header, x.frames
283 if isinstance(x, Serialize):
--> 284 return serialize(
285 x.data,
286 serializers=serializers,
287 on_error=on_error,
288 context=context,
289 iterate_collection=True,
290 )
292 # Note: don't use isinstance(), as it would match subclasses
293 # (e.g. namedtuple, defaultdict) which however would revert to the base class on a
294 # round-trip through msgpack
295 if iterate_collection is None and type(x) in (list, set, tuple, dict):
File /srv/conda/envs/notebook/lib/python3.12/site-packages/distributed/protocol/serialize.py:391, in serialize(x, serializers, on_error, context, iterate_collection)
389 str_x = str(x)[:10000]
390 except Exception:
--> 391 raise TypeError(msg) from exc
392 raise TypeError(msg, str_x) from exc
393 else: # pragma: nocover
TypeError: Could not serialize object of type _HLGExprSequence
I can't reproduce on DEA Sandbox with:
Python : 3.10.15
datacube : 1.9.6
dask : 2025.7.0
distributed : 2025.7.0
Ok, trying with a new (old, 3.10) environment:
Python : 3.10.12
datacube : 1.9.6
dask : 2025.7.0
distributed : 2025.7.0
xarray : 2025.6.1
rasterio : 1.4.3
--- Logging error ---
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 63, in dumps
result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 68, in dumps
pickler.dump(x)
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 80, in dumps
result = cloudpickle.dumps(x, **dump_kwargs)
File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
TypeError: cannot pickle 'weakref.ReferenceType' object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1100, in emit
msg = self.format(record)
File "/usr/lib/python3.10/logging/__init__.py", line 943, in format
return fmt.format(record)
File "/usr/lib/python3.10/logging/__init__.py", line 678, in format
record.message = record.getMessage()
File "/usr/lib/python3.10/logging/__init__.py", line 368, in getMessage
msg = msg % self.args
File "/usr/local/lib/python3.10/dist-packages/dask/_expr.py", line 90, in __str__
s = ", ".join(self._operands_for_repr())
File "/usr/local/lib/python3.10/dist-packages/dask/_expr.py", line 1097, in _operands_for_repr
f"name={self.operand('name')!r}",
File "/usr/local/lib/python3.10/dist-packages/dask/_expr.py", line 224, in operand
return self.operands[type(self)._parameters.index(key)]
ValueError: 'name' is not in list
Call stack:
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/ipykernel_launcher.py", line 18, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.10/dist-packages/traitlets/config/application.py", line 1075, in launch_instance
app.start()
File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelapp.py", line 739, in start
self.io_loop.start()
File "/usr/local/lib/python3.10/dist-packages/tornado/platform/asyncio.py", line 211, in start
self.asyncio_loop.run_forever()
File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/usr/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelbase.py", line 516, in dispatch_queue
await self.process_one()
File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelbase.py", line 505, in process_one
await dispatch(*args)
File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelbase.py", line 397, in dispatch_shell
await result
File "/usr/local/lib/python3.10/dist-packages/ipykernel/ipkernel.py", line 368, in execute_request
await super().execute_request(stream, ident, parent)
File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelbase.py", line 752, in execute_request
reply_content = await reply_content
File "/usr/local/lib/python3.10/dist-packages/ipykernel/ipkernel.py", line 455, in do_execute
res = shell.run_cell(
File "/usr/local/lib/python3.10/dist-packages/ipykernel/zmqshell.py", line 577, in run_cell
return super().run_cell(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3077, in run_cell
result = self._run_cell(
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3132, in _run_cell
result = runner(coro)
File "/usr/local/lib/python3.10/dist-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3336, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3519, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3579, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_26/3616026081.py", line 21, in <module>
data.compute()
File "/usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py", line 715, in compute
return new.load(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py", line 542, in load
evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
File "/usr/local/lib/python3.10/dist-packages/xarray/namedarray/daskmanager.py", line 85, in compute
return compute(*data, **kwargs) # type: ignore[no-untyped-call, no-any-return]
File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 681, in compute
results = schedule(expr, keys, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/distributed/client.py", line 3475, in get
futures = self._graph_to_futures(
File "/usr/local/lib/python3.10/dist-packages/distributed/client.py", line 3365, in _graph_to_futures
expr_ser = Serialized(*serialize(to_serialize(expr), on_error="raise"))
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 284, in serialize
return serialize(
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 366, in serialize
header, frames = dumps(x, context=context) if wants_context else dumps(x)
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 78, in pickle_dumps
frames[0] = pickle.dumps(
File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 82, in dumps
logger.exception("Failed to serialize %s.", x)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py:63, in dumps(x, buffer_callback, protocol)
62 try:
---> 63 result = pickle.dumps(x, **dump_kwargs)
64 except Exception:
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py:68, in dumps(x, buffer_callback, protocol)
67 buffers.clear()
---> 68 pickler.dump(x)
69 result = f.getvalue()
AttributeError: Can't pickle local object 'Mapper.__init__.<locals>.<lambda>'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py:366, in serialize(x, serializers, on_error, context, iterate_collection)
365 try:
--> 366 header, frames = dumps(x, context=context) if wants_context else dumps(x)
367 header["serializer"] = name
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py:78, in pickle_dumps(x, context)
76 writeable.append(not f.readonly)
---> 78 frames[0] = pickle.dumps(
79 x,
80 buffer_callback=buffer_callback,
81 protocol=context.get("pickle-protocol", None) if context else None,
82 )
83 header = {
84 "serializer": "pickle",
85 "writeable": tuple(writeable),
86 }
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py:80, in dumps(x, buffer_callback, protocol)
79 buffers.clear()
---> 80 result = cloudpickle.dumps(x, **dump_kwargs)
81 except Exception:
File /usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py:1537, in dumps(obj, protocol, buffer_callback)
1536 cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1537 cp.dump(obj)
1538 return file.getvalue()
File /usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py:1303, in Pickler.dump(self, obj)
1302 try:
-> 1303 return super().dump(obj)
1304 except RuntimeError as e:
TypeError: cannot pickle 'weakref.ReferenceType' object
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Cell In[9], line 21
8 datasets = dc.find_datasets(
9 product="s2_l2a",
10 limit=1
11 )
13 data = dc.load(
14 datasets=datasets,
15 measurements=["red"],
(...)
18 dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
19 )
---> 21 data.compute()
File /usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py:715, in Dataset.compute(self, **kwargs)
691 """Manually trigger loading and/or computation of this dataset's data
692 from disk or a remote source into memory and return a new dataset.
693 Unlike load, the original dataset is left unaltered.
(...)
712 dask.compute
713 """
714 new = self.copy(deep=False)
--> 715 return new.load(**kwargs)
File /usr/local/lib/python3.10/dist-packages/xarray/core/dataset.py:542, in Dataset.load(self, **kwargs)
539 chunkmanager = get_chunked_array_type(*lazy_data.values())
541 # evaluate all the chunked arrays simultaneously
--> 542 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
543 *lazy_data.values(), **kwargs
544 )
546 for k, data in zip(lazy_data, evaluated_data, strict=False):
547 self.variables[k].data = data
File /usr/local/lib/python3.10/dist-packages/xarray/namedarray/daskmanager.py:85, in DaskManager.compute(self, *data, **kwargs)
80 def compute(
81 self, *data: Any, **kwargs: Any
82 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
83 from dask.array import compute
---> 85 return compute(*data, **kwargs)
File /usr/local/lib/python3.10/dist-packages/dask/base.py:681, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
678 expr = expr.optimize()
679 keys = list(flatten(expr.__dask_keys__()))
--> 681 results = schedule(expr, keys, **kwargs)
683 return repack(results)
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py:284, in serialize(x, serializers, on_error, context, iterate_collection)
282 return x.header, x.frames
283 if isinstance(x, Serialize):
--> 284 return serialize(
285 x.data,
286 serializers=serializers,
287 on_error=on_error,
288 context=context,
289 iterate_collection=True,
290 )
292 # Note: don't use isinstance(), as it would match subclasses
293 # (e.g. namedtuple, defaultdict) which however would revert to the base class on a
294 # round-trip through msgpack
295 if iterate_collection is None and type(x) in (list, set, tuple, dict):
File /usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py:391, in serialize(x, serializers, on_error, context, iterate_collection)
389 str_x = str(x)[:10000]
390 except Exception:
--> 391 raise TypeError(msg) from exc
392 raise TypeError(msg, str_x) from exc
393 else: # pragma: nocover
TypeError: Could not serialize object of type _HLGExprSequence
Loading with odc-stac works fine
from pystac.client import Client as StacClient
from odc.stac import load
from dask.distributed import Client
from planetary_computer import sign_url
catalog = StacClient.open("https://planetarycomputer.microsoft.com/api/stac/v1")
items = catalog.search(
collections=["landsat-c2-l2"],
max_items=1
).item_collection()
client = Client(processes=False)
data = load(
items=items,
measurements=["red"],
output_crs="EPSG:4326",
resolution=0.0001,
dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
patch_url=sign_url,
)
data.compute()
Loading using the driver="rio" argument to dc.load() works. So this is a viable workaround for now, but it would be good to resolve the underlying issue.
from datacube import Datacube
from dask.distributed import Client
import os
from odc.stac import configure_s3_access
os.environ["AWS_DEFAULT_REGION"] = "us-west-2"
if "AWS_NO_SIGN_REQUEST" in os.environ:
del os.environ["AWS_NO_SIGN_REQUEST"]
configure_s3_access(requester_pays=True)
dc = Datacube(app="test")
client = Client(processes=False)
datasets = dc.find_datasets(
product="ls9_c2l2_sr",
limit=1
)
data = dc.load(
datasets=datasets,
measurements=["red"],
output_crs="EPSG:4326",
resolution=0.0001,
dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
driver="rio"
)
data.compute()
Ok, so a final test. Same environment as before, this time using the postgres index driver, rather than postgis, and this works (note, no driver="rio"):
from datacube import Datacube
from dask.distributed import Client
import os
from datacube.utils.aws import configure_s3_access
os.environ["AWS_DEFAULT_REGION"] = "us-west-2"
if "AWS_NO_SIGN_REQUEST" in os.environ:
del os.environ["AWS_NO_SIGN_REQUEST"]
configure_s3_access(requester_pays=True)
dc = Datacube(app="test")
client = Client(processes=False)
datasets = dc.find_datasets(
product="ls9_c2l2_sr",
limit=1
)
data = dc.load(
datasets=datasets,
measurements=["red"],
output_crs="EPSG:4326",
resolution=0.0001,
dask_chunks={"time": 1, "longitude": 1000, "latitude": 1000},
)
data.compute()
Some more context, postgres vs postgis dataset metadata_doc content
I'm not sure how to read the backtraces, but one of the early ones mentions Mapper and a lambda. I changed things around with the mapper in April, and I'm not surprised that lambdas can't be pickled.
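For anyone following along, a minimal illustration of why that matters (this is just stdlib behaviour, not datacube code): pickle refuses any function defined inside another function's local scope, which is exactly the shape of `Mapper.__init__.<locals>.<lambda>`:

```python
import pickle

def make():
    # a lambda defined in a local scope, analogous to
    # Mapper.__init__.<locals>.<lambda> in the traceback
    return lambda x: x + 1

f = make()
try:
    pickle.dumps(f)
except Exception as e:
    # e.g. AttributeError: Can't pickle local object 'make.<locals>.<lambda>'
    print(type(e).__name__, e)
```

cloudpickle can usually serialize such lambdas by value, but the `weakref.ReferenceType` error later in the traceback shows it also hits something it can't handle.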
Has this worked with the postgis driver on some older 1.9 release for you, so it is a regression?
I changed things around with the mapper in April
What did you change?
Has this worked with the postgis driver on some older 1.9 release for you
I don't know, to be honest. But I'm now 99% sure it's something to do with the PostGIS Index Driver, which @SpacemanPaul says doesn't make any sense at all!
You seem to be able to reproduce it reliably across multiple environments, so there must be something going on either with your setups or ODC. And keep in mind that I don't know anything about this issue/area, I only saw a keyword I recognized and commented.
The change I was thinking about is #1831, but I also did #1801 to get better type checking. It's possible that your problem was present before my change though, that was why I asked if you had been able to do this before.
As a general comment, given code like this (a sketch; it's midnight and I don't know Python pools):
failing = factory(x)
pool.submit(f, failing)
where failing can't be pickled, it might be possible to move the factory(x) call inside the body of f instead, so the unpicklable object is never sent across the pickle boundary.
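Fleshing that sketch out (the names `factory` and `f` come from the snippet above; everything else is hypothetical, just to show the shape of the workaround):

```python
from concurrent.futures import ProcessPoolExecutor

def factory(x):
    # stands in for whatever builds the unpicklable object
    return lambda: x * 2

def f(x):
    # build the problematic object on the worker, so only the
    # plain (picklable) input x crosses the pickle boundary
    failing = factory(x)
    return failing()

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # pool.submit(f, factory(21)) would fail: the lambda can't be pickled.
        # Submitting the raw input instead works:
        print(pool.submit(f, 21).result())  # 42
```

Whether dask's graph construction can be restructured this way in datacube is a separate question, but it's the usual pattern for dodging unpicklable intermediates.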
I still don't know what it is that can't be pickled!
In summary, here's what I've tested:
- Many combinations of dask/xarray, nothing changes
- Using the `rio` vs `default` driver as an argument to `dc.load()`; the `rio` loader works
- Using the `postgres` vs `postgis` index; `postgres` works for both the `rio` and `default` drivers
So, in short, the non-rio driver fails to pickle something when the index backend is postgis.
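Since we still don't know which object is unpicklable, a brute-force way to hunt for it (a pure debugging sketch; `find_unpicklable` is a made-up helper, not part of datacube or dask) is to recursively try pickle on the pieces of a container or object graph and report the paths that fail:

```python
import pickle

def find_unpicklable(obj, path="obj", seen=None):
    """Return dotted/indexed paths of sub-objects that fail to pickle."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return []
    seen.add(id(obj))
    try:
        pickle.dumps(obj)
        return []  # this whole subtree pickles fine
    except Exception:
        bad = [path]
    # descend into common containers and instance attributes
    if isinstance(obj, dict):
        children = {f"[{k!r}]": v for k, v in obj.items()}
    elif isinstance(obj, (list, tuple, set)):
        children = {f"[{i}]": v for i, v in enumerate(obj)}
    elif hasattr(obj, "__dict__"):
        children = {f".{k}": v for k, v in vars(obj).items()}
    else:
        children = {}
    for name, child in children.items():
        bad += find_unpicklable(child, path + name, seen)
    return bad

# e.g. run it over a Dataset pulled from the postgis index:
# find_unpicklable(datasets[0], "datasets[0]")
```

The deepest paths in the result should point at the leaf objects that pickle refuses, which would confirm (or rule out) the SQLAlchemy Mapper lambda theory.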
This PR includes a bunch of Mapped data structures: https://github.com/opendatacube/datacube-core/pull/1801/files
I don't know if that has any relation to the 'Mapper.__init__.<locals>.<lambda>' in the above stack traces?
I know that the Mapper comes from SQLAlchemy (https://docs.sqlalchemy.org/en/20/orm/mapper_config.html), I don't know exactly where it resides in a running application.
We're talking about databases and distributing datacube-core though, and I know very little about both those subjects, so someone who knows more will have to chime in on this one.