intake-esm
deepcopy forgets manual changes to catalog dataframe
Description
Using aggregate=False in esm_datastore.to_dataset_dict() triggers a deepcopy of the object. For whatever reason, the deepcopy forgets any manual changes made to the dataframe by updating cat.esmcat._df (e.g. as recommended in the documentation here). I would expect manual changes made to the dataframe to be cascaded through the rest of the object.
What I Did
Replicate the tutorial here: https://intake-esm.readthedocs.io/en/latest/how-to/manipulate-catalog.html
The only change made was to add aggregate=False in the call to cat_subset.to_dataset_dict().
Even though cat_subset.esmcat._df was modified to keep only the 8 intended assets, all 40 original assets are now loaded.
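The symptom can be mimicked without intake-esm. The sketch below is a toy illustration only: `ToyCatalog`, `_source_rows`, and its `__deepcopy__` are hypothetical names, and the assumption that the copy is rebuilt from the original source is a guess at one possible mechanism, not a confirmed description of what intake-esm does.

```python
import copy


class ToyCatalog:
    """Hypothetical stand-in for a catalog object; NOT intake-esm's code."""

    def __init__(self, source_rows):
        # Rows the catalog was originally opened from.
        self._source_rows = list(source_rows)
        # Working table that a user may overwrite by hand.
        self._df = list(source_rows)

    def __deepcopy__(self, memo):
        # Assumed failure mode: the copy is rebuilt from the original
        # source, so manual edits to self._df are silently dropped.
        return ToyCatalog(copy.deepcopy(self._source_rows, memo))


# 40 original assets, manually trimmed to the 8 intended ones.
rows = [{"asset": f"file_{i}.nc"} for i in range(40)]
cat = ToyCatalog(rows)
cat._df = cat._df[:8]

clone = copy.deepcopy(cat)
print(len(cat._df), len(clone._df))  # prints "8 40"
```

In this toy, the original object keeps the 8 rows while the deep copy reverts to all 40, which matches the counts reported above.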
Version information: output of intake_esm.show_versions()
import intake_esm
intake_esm.show_versions()
INSTALLED VERSIONS
------------------
cftime: 1.6.1
dask: 2022.7.1
fastprogress: 1.0.3
fsspec: 2022.7.1
gcsfs: 2022.7.1
intake: 0.6.5
intake_esm: 2021.8.17.post86
netCDF4: 1.6.0
pandas: 1.4.3
requests: 2.28.1
s3fs: 2022.7.1
xarray: 2022.6.0
zarr: 2.12.0
I'm also encountering this issue. Any chance it has been fixed already?
@Timh37, which version of intake-esm are you using?
import intake_esm
intake_esm.show_versions()
@andersy005 :
I'm using the following:

which I think is the latest?