deepcopy forgets manual changes to catalog dataframe

Open ollie-bell opened this issue 3 years ago • 3 comments

Description

Using aggregate=False in esm_datastore.to_dataset_dict() triggers a deepcopy of the object. For whatever reason, the deepcopy forgets any manual changes made to the dataframe by updating cat.esmcat._df (e.g. as recommended in the documentation here). I would expect manual changes made to the dataframe to be cascaded through the rest of the object.

What I Did

Replicate the tutorial here: https://intake-esm.readthedocs.io/en/latest/how-to/manipulate-catalog.html

The only change made was to add aggregate=False in the call to cat_subset.to_dataset_dict().

Now all 40 original assets are loaded, rather than only the 8 intended assets that remain after cat_subset.esmcat._df was modified.
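To illustrate the kind of behaviour described above without needing the tutorial data, here is a minimal, self-contained sketch. It is not intake-esm code: `ToyCatalog`, `_source_rows`, and the custom `__deepcopy__` are hypothetical stand-ins, and the assumption that the copy is rebuilt from the originally loaded rows is only a guess at one way this symptom could arise, not a confirmed diagnosis.

```python
import copy


class ToyCatalog:
    """Hypothetical stand-in for a catalog object; NOT the intake-esm API."""

    def __init__(self, source_rows):
        self._source_rows = list(source_rows)  # rows as originally loaded
        self._df = list(source_rows)           # working table, editable in place

    def __deepcopy__(self, memo):
        # Assumed failure mode: the copy is rebuilt from the original rows,
        # so edits made directly to `_df` are not carried over.
        return ToyCatalog(copy.deepcopy(self._source_rows, memo))


cat = ToyCatalog(range(40))
cat._df = cat._df[:8]  # manual edit, analogous to assigning cat.esmcat._df

clone = copy.deepcopy(cat)
print(len(cat._df), len(clone._df))  # 8 40
```

If something like this is happening, any code path that deep-copies the datastore would see the full 40 rows again, which matches what I observe with aggregate=False.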

Version information: output of intake_esm.show_versions()

Paste the output of intake_esm.show_versions() here:

import intake_esm

intake_esm.show_versions()

INSTALLED VERSIONS
------------------

cftime: 1.6.1
dask: 2022.7.1
fastprogress: 1.0.3
fsspec: 2022.7.1
gcsfs: 2022.7.1
intake: 0.6.5
intake_esm: 2021.8.17.post86
netCDF4: 1.6.0
pandas: 1.4.3
requests: 2.28.1
s3fs: 2022.7.1
xarray: 2022.6.0
zarr: 2.12.0

ollie-bell avatar Aug 08 '22 11:08 ollie-bell

I'm also encountering this issue, any chance this has been fixed already?

Timh37 avatar Feb 13 '23 16:02 Timh37

@Timh37, which version of intake-esm are you using?

import intake_esm
intake_esm.show_versions()

andersy005 avatar Feb 13 '23 20:02 andersy005

@andersy005 :

I'm using the following:

image

which I think is the latest?

Timh37 avatar Feb 14 '23 07:02 Timh37