intake-esm

deepcopy forgets manual changes to catalog dataframe

Open ollie-bell opened this issue 1 year ago • 3 comments

Description

Passing aggregate=False to esm_datastore.to_dataset_dict() triggers a deepcopy of the object. For whatever reason, the deep copy drops any manual changes made to the dataframe by assigning to cat.esmcat._df (e.g. as recommended in the documentation). I would expect manual changes to the dataframe to carry through to the rest of the object.

What I Did

Replicate the tutorial here: https://intake-esm.readthedocs.io/en/latest/how-to/manipulate-catalog.html

The only change made was to add aggregate=False in the call to cat_subset.to_dataset_dict().

With this change, all 40 original assets are loaded instead of only the 8 that remain after cat_subset.esmcat._df is modified.
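I have not confirmed the cause. One way this symptom can arise is if the object's copy/pickle state holds only its original constructor arguments, so deepcopy rebuilds it from scratch. The sketch below is a toy illustration of that pattern using only the standard library. ToyDatastore and its attributes are hypothetical stand-ins, not intake-esm code, and whether intake-esm actually behaves this way is an assumption.

```python
import copy


class ToyDatastore:
    """Hypothetical stand-in for a catalog object; not intake-esm code."""

    def __init__(self, rows):
        # Remember the constructor argument so the object can be rebuilt later.
        self._init_rows = list(rows)
        # Plays the role of cat.esmcat._df in this toy example.
        self._df = list(rows)

    def __getstate__(self):
        # Expose only the constructor argument, not the current _df.
        return {"rows": self._init_rows}

    def __setstate__(self, state):
        # Rebuild by re-running __init__, which discards later edits to _df.
        self.__init__(state["rows"])


cat = ToyDatastore(range(40))
cat._df = cat._df[:8]           # manual subsetting, as in the tutorial
clone = copy.deepcopy(cat)      # deepcopy goes through __getstate__/__setstate__

original_len = len(cat._df)
clone_len = len(clone._df)
print(original_len, clone_len)  # 8 40
```

If something like this is what happens, the copy would need to carry over the current dataframe rather than the initial arguments for manual edits to survive.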

Version information: output of intake_esm.show_versions()

Paste the output of intake_esm.show_versions() here:

import intake_esm

intake_esm.show_versions()

INSTALLED VERSIONS
------------------

cftime: 1.6.1
dask: 2022.7.1
fastprogress: 1.0.3
fsspec: 2022.7.1
gcsfs: 2022.7.1
intake: 0.6.5
intake_esm: 2021.8.17.post86
netCDF4: 1.6.0
pandas: 1.4.3
requests: 2.28.1
s3fs: 2022.7.1
xarray: 2022.6.0
zarr: 2.12.0

ollie-bell, Aug 08 '22 11:08