intake-esm

re.compile not interpreted correctly when passed to the _search.py search method

Open wrongkindofdoctor opened this issue 1 year ago • 1 comments

Here's a quick checklist in what to include:

  • [x] Include a detailed description of the bug or suggestion

  • [x] Output of intake_esm.show_versions()

  • [x] Minimal, self-contained copy-pastable example that generates the issue if possible. Please be concise with code posted. See guidelines below on how to provide a good bug report:

Description

I am trying to pass a compiled regular expression (the re.Pattern object returned by re.compile) as the value for one column in an intake catalog search, following the example in the code comments. However, the search function in _search.py iterates over each value in the query dict, so it raises a TypeError when it receives a bare re.Pattern object.

What I Did

    import re

    for case_name, case_d in case_dict.items():
        # Search for the case_name group in the path entries
        path_regex = re.compile(r'({})'.format(case_name))
        freq = case_d.varlist.T.frequency
        for v in case_d.varlist.iter_vars():
            cat_subset = cat.search(
                activity_id=case_d.convention,
                standard_name=v.standard_name,
                frequency=freq,
                realm=v.realm,
                path=path_regex,  # bare re.Pattern: raises the TypeError below
            )

The path_regex object passed to the catalog's _search.search function:

re.compile('(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

path_regex has the following attributes:

  • flags (int)
  • groupindex (mapping)
  • groups (int)
  • pattern (str)
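These attributes can be checked directly with the standard library. The snippet below is only an illustration using the pattern from this report; it does not touch intake-esm. It also shows that a compiled pattern does not define __iter__, which is why a loop over it fails.

```python
import re

# The same pattern as in the report above.
path_regex = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

# The original pattern string is available as .pattern.
pattern_text = path_regex.pattern
print(pattern_text)

# One capturing group, and no named groups.
group_count = path_regex.groups
named_groups = dict(path_regex.groupindex)
print(group_count, named_groups)

# A compiled pattern is a single object, not a container of values.
is_iterable = hasattr(path_regex, '__iter__')
print(is_iterable)
```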

Thus, when values is a compiled pattern, values.pattern seems to be what the search method should use in its "for value in values" loop.

Stack trace

  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 55, in wrapper_function
    return vd.call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 150, in call
    return self.execute(m)
           ^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/pydantic/deprecated/decorator.py", line 222, in execute
    return self.raw_function(**d, **var_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/core.py", line 393, in search
    esmcat_results = self.esmcat.search(require_all_on=require_all_on, query=query)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/cat.py", line 385, in search
    results = search(
              ^^^^^^^
  File "/Users/j/micromamba/envs/_MDTF_base/lib/python3.11/site-packages/intake_esm/_search.py", line 46, in search
    for value in values:
TypeError: 're.Pattern' object is not iterable
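
The same TypeError can be reproduced without a catalog. The sketch below is not intake-esm's code: iterate_query is a hypothetical helper that only mimics the "for value in values" loop shown in the traceback, assuming every query value is expected to be iterable.

```python
import re


def iterate_query(query):
    """Hypothetical stand-in for a loop that assumes every query value is iterable."""
    seen = []
    for column, values in query.items():
        for value in values:  # fails when `values` is a bare re.Pattern
            seen.append((column, value))
    return seen


pattern = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

error_message = None
try:
    iterate_query({'path': pattern})
except TypeError as exc:
    # Iterating over a compiled pattern is what raises here.
    error_message = str(exc)
    print(error_message)
```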

Version information: output of intake_esm.show_versions()

Paste the output of intake_esm.show_versions() here:

INSTALLED VERSIONS
------------------

cftime: 1.6.2
dask: 2023.9.1
fastprogress: 1.0.3
fsspec: 2024.2.0
gcsfs: None
intake: 0.7.0
intake_esm: 2024.2.6
netCDF4: 1.6.4
pandas: 2.1.0
requests: 2.31.0
s3fs: None
xarray: 2023.8.0
zarr: 2.16.1

wrongkindofdoctor, Feb 07 '24 17:02

@wrongkindofdoctor - can you try passing it as a list? Sorry for the delayed response here.

For example:

    for case_name, case_d in case_dict.items():
        # Search for the case_name group in the path entries
        path_regex = re.compile(r'({})'.format(case_name))
        freq = case_d.varlist.T.frequency
        for v in case_d.varlist.iter_vars():
            cat_subset = cat.search(
                activity_id=case_d.convention,
                standard_name=v.standard_name,
                frequency=freq,
                realm=v.realm,
                path=[path_regex],  # wrap the compiled pattern in a list
            )
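
Why wrapping helps can be sketched with the standard library alone. This is not how intake-esm implements search; as_list and filter_paths are hypothetical helpers, and the two paths are made-up examples. The sketch assumes a search that iterates over each query value and applies a regex search for compiled patterns and an equality test for plain strings.

```python
import re


def as_list(value):
    """Wrap a bare string or compiled pattern in a list; copy other iterables into a list."""
    if isinstance(value, (str, re.Pattern)):
        return [value]
    return list(value)


def filter_paths(paths, values):
    """Keep each path that matches any value (regex search for patterns, equality for strings)."""
    matches = []
    for path in paths:
        for value in as_list(values):
            if isinstance(value, re.Pattern):
                hit = value.search(path) is not None
            else:
                hit = path == value
            if hit:
                matches.append(path)
                break
    return matches


# Made-up paths, for illustration only.
paths = [
    '/data/CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231/day/tas.nc',
    '/data/other_case/day/tas.nc',
]
path_regex = re.compile(r'(CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231)')

# The list form and the bare form give the same result once values are normalized.
print(filter_paths(paths, [path_regex]))
print(filter_paths(paths, path_regex))
```

A caller-side helper like as_list lets both forms be accepted, but passing a list explicitly, as suggested above, avoids relying on any such normalization.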

mgrover1, Feb 28 '24 20:02

@mgrover1 sorry for the late response. I just got around to testing passing the re.compile object as a list to cat.search, and this resolves the issue. Thanks for your help!

wrongkindofdoctor, Apr 22 '24 21:04