intake-esm
re.compile not interpreted correctly when passed to the _search.py search method
Here's a quick checklist in what to include:
- [x] Include a detailed description of the bug or suggestion
- [x] Output of intake_esm.show_versions()
- [x] Minimal, self-contained copy-pastable example that generates the issue if possible. Please be concise with code posted. See guidelines below on how to provide a good bug report:
Description
I am trying to pass a Python re.compile object as the value for one of the columns in an intake catalog search, following the example in the code comments. However, the search method expects each value in the query dict to be iterable, and it raises an error when it tries to iterate over the re.compile object.
What I Did
import re

for case_name, case_d in case_dict.items():
    # Search for the case_name group in the path entries
    path_regex = re.compile(r'({})'.format(case_name))
    freq = case_d.varlist.T.frequency
    for v in case_d.varlist.iter_vars():
        cat_subset = cat.search(activity_id=case_d.convention,
                                standard_name=v.standard_name,
                                frequency=freq,
                                realm=v.realm,
                                path=path_regex
                                )
The path_regex object passed to the catalog's _search.search method:
re.compile('(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')
path_regex has the following attributes:
- flags (int)
- groupindex (dict)
- groups (int)
- pattern (str)
Thus, values.pattern seems to be what the search method should use in the for value in values loop when values is an re.compile object.
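To make the attribute list concrete, here is a minimal standard-library sketch that inspects the compiled pattern from this report. It does not touch intake-esm; sample_path is a hypothetical path made up for illustration, and the standard-library spelling of the group mapping attribute is groupindex.

```python
import re

# The compiled pattern from the report above.
path_regex = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

# Attributes exposed by a compiled pattern (re.Pattern).
attrs = {
    "flags": path_regex.flags,                  # int bitmask of regex flags
    "groups": path_regex.groups,                # number of capturing groups
    "groupindex": dict(path_regex.groupindex),  # named groups -> group number
    "pattern": path_regex.pattern,              # the original pattern string
}

# Hypothetical catalog path, invented for this example only.
sample_path = "/data/CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231/day/tas.nc"
match = path_regex.search(sample_path)
```

The .pattern attribute recovers the plain string, but the compiled object can also be used directly for matching, which is why wrapping it rather than unpacking it may be the simpler route.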
Stack trace
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 55, in wrapper_function
    return vd.call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 150, in call
    return self.execute(m)
           ^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 222, in execute
    return self.raw_function(**d, **var_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/core.py", line 393, in search
    esmcat_results = self.esmcat.search(require_all_on=require_all_on, query=query)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/cat.py", line 385, in search
    results = search(
              ^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/_search.py", line 46, in search
    for value in values:
TypeError: 're.Pattern' object is not iterable
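The TypeError can be reproduced without intake-esm at all. The sketch below only checks whether an object supports iteration, which is what a for value in values loop requires; is_iterable is a hypothetical helper written for this illustration, not part of any library.

```python
import re

path_regex = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')


def is_iterable(obj):
    """Hypothetical helper: True if obj can be used in a for loop."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


# A bare compiled pattern cannot be looped over ...
pattern_is_iterable = is_iterable(path_regex)
# ... but a list holding that pattern can.
list_is_iterable = is_iterable([path_regex])
```

Here pattern_is_iterable is False and list_is_iterable is True, which matches the traceback: the loop fails on the bare re.Pattern but would accept a list containing it.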
Version information: output of intake_esm.show_versions()
Paste the output of intake_esm.show_versions() here:
INSTALLED VERSIONS
------------------
cftime: 1.6.2
dask: 2023.9.1
fastprogress: 1.0.3
fsspec: 2024.2.0
gcsfs: None
intake: 0.7.0
intake_esm: 2024.2.6
netCDF4: 1.6.4
pandas: 2.1.0
requests: 2.31.0
s3fs: None
xarray: 2023.8.0
zarr: 2.16.1
@wrongkindofdoctor - can you try passing it as a list? Sorry for the delayed response here.
For example:
import re

for case_name, case_d in case_dict.items():
    # Search for the case_name group in the path entries
    path_regex = re.compile(r'({})'.format(case_name))
    freq = case_d.varlist.T.frequency
    for v in case_d.varlist.iter_vars():
        cat_subset = cat.search(activity_id=case_d.convention,
                                standard_name=v.standard_name,
                                frequency=freq,
                                realm=v.realm,
                                path=[path_regex]
                                )
@mgrover1 sorry for the late response. I just got around to testing passing the re.compile object as a list to cat.search, and this resolves the issue. Thanks for your help!
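For anyone who hits the same error, one way to avoid remembering the list wrapper is to normalize the query on the caller's side before passing it to cat.search. This is only a sketch under stated assumptions: normalize_query is a hypothetical helper, not part of intake-esm, and the "day", "atmos", and "ocean" values are made-up examples.

```python
import re
from collections.abc import Iterable


def normalize_query(query):
    """Hypothetical helper: make every query value a list.

    Strings, bytes, and compiled patterns are treated as single values
    and wrapped in a list; other iterables are copied into a list.
    """
    normalized = {}
    for column, values in query.items():
        is_scalar = isinstance(values, (str, bytes, re.Pattern))
        if is_scalar or not isinstance(values, Iterable):
            values = [values]
        normalized[column] = list(values)
    return normalized


path_regex = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

# Example values below are invented for illustration.
query = normalize_query(
    {"frequency": "day", "realm": ["atmos", "ocean"], "path": path_regex}
)
```

The resulting dict could then be unpacked as keyword arguments, for example cat.search(**query), so that a compiled pattern always arrives wrapped in a list as suggested above.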