
`esm_datastore_v2`: A rewrite of `esm_datastore` with new features + improvements

Open andersy005 opened this issue 2 years ago • 2 comments

TO DO

  • Introduce pydantic models that allow us to validate the catalog against a pre-defined schema. This has the potential of reducing code complexity in the existing esm_datastore catalog object.
    • [x] #347
    • [x] https://github.com/intake/intake-esm/pull/367
  • Public methods to expose (https://github.com/intake/intake-esm/pull/368)
    • [x] __init__
    • [x] __len__
    • [x] __getitem__
    • [x] __contains__
    • [x] to_dataset_dict()
    • [x] nunique()
    • [x] unique()
    • [x] search()
    • [x] https://github.com/intake/intake-esm/pull/370
    • [x] https://github.com/intake/intake-esm/pull/373
    • [x] https://github.com/intake/intake-esm/pull/374
    • [x] https://github.com/intake/intake-esm/pull/375
  • Merge intake_esm.source:ESMGroupDataSource and intake_esm.source:ESMDataSource into a single data source intake_esm.source:ESMDataSource: https://github.com/intake/intake-esm/pull/372
  • #352 proposes creating a dict-like object and returning this object as the output of to_dataset_dict(). This object will look and smell like a dictionary but will have additional functionality to facilitate applying operators on all returned datasets and/or a subset of these datasets.
  • Add Derived Variable functionality
    • https://github.com/intake/intake-esm/pull/379
    • https://github.com/intake/intake-esm/issues/388
    • https://github.com/intake/intake-esm/issues/387
  • #163 proposes providing users with functionality to control how dataset aggregations are done. Currently, the aggregation process is rigid. A rewrite of esm_datastore will allow us to explore exposing aggregation logic to users via a registry of operators/preprocessors.
  • [ ] Update documentation to reflect new changes
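
To make the dict-like object proposed in #352 more concrete, here is a minimal sketch of what such a container might look like. The names `DatasetDict`, `map`, and `keys` are hypothetical and are not taken from intake-esm; plain lists stand in for xarray datasets so the example has no dependencies.

```python
from collections import UserDict
from typing import Any, Callable, Iterable, Optional


class DatasetDict(UserDict):
    """Hypothetical dict-like container (an assumption, not the intake-esm API)."""

    def map(
        self,
        func: Callable[[Any], Any],
        keys: Optional[Iterable[str]] = None,
    ) -> "DatasetDict":
        """Apply func to every value, or only to the values whose key is in keys."""
        selected = set(self.data) if keys is None else set(keys)
        missing = selected - set(self.data)
        if missing:
            raise KeyError(f"unknown keys: {sorted(missing)}")
        # Values outside the selection are passed through unchanged.
        return DatasetDict(
            {k: func(v) if k in selected else v for k, v in self.data.items()}
        )


# Plain lists stand in for datasets to keep the sketch dependency-free.
dsets = DatasetDict({"historical": [1, 2, 3], "ssp585": [4, 5, 6]})
doubled = dsets.map(lambda values: [2 * v for v in values])
subset = dsets.map(sum, keys=["ssp585"])
```

Because the sketch subclasses `UserDict`, the result still supports indexing, iteration, and `len()` like an ordinary dictionary, while `map` covers both the "all datasets" and "subset of datasets" cases mentioned in the proposal.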

andersy005, Aug 07 '21 00:08

@andersy005 what still needs to get done here? Are we still missing the docs?

mgrover1, Aug 23 '22 20:08

For this particular issue, there is one pending item that needs to be addressed:

  • https://github.com/intake/intake-esm/issues/511

Regarding the docs, everything looks good. I plan to organize what we currently have in a separate PR.

For the release, here's a list of outstanding issues that I think are worth addressing before the next release: https://github.com/intake/intake-esm/milestone/11

andersy005, Aug 23 '22 22:08

@andersy005 Can we close this now that #511 is good to go?

mgrover1, Sep 14 '22 15:09