[DataCatalog]: Improve the way to access namespaced datasets with `_FrozenDataset` API

Open • ElenaKhaustova opened this issue 8 months ago • 1 comment

Description

Users struggle with the _FrozenDataset API when accessing namespaced datasets: attribute names use double underscores in place of dots, which users find unintuitive and cumbersome. Some prefer to refer to a dataset by its original name, so they fall back on the private _get_dataset() method instead.
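To make the complaint concrete, here is a minimal, self-contained sketch of the attribute-style access described above. The class `FrozenDatasetsSketch` and the regex are illustrative assumptions, not Kedro's actual implementation; only the double-underscore substitution and the `getattr` workaround are taken from this issue.

```python
import re


class FrozenDatasetsSketch:
    """Toy stand-in for attribute-style dataset access (hypothetical, not Kedro code)."""

    def __init__(self, datasets):
        for name, dataset in datasets.items():
            # Keep the original name so getattr(obj, "namespace.dataset") works.
            self.__dict__[name] = dataset
            # Assumption: every run of non-word characters (".", "@") becomes "__",
            # so the name is usable as a Python attribute.
            self.__dict__[re.sub(r"\W+", "__", name)] = dataset


datasets = FrozenDatasetsSketch({"namespace.dataset": "<dataset object>"})

# Attribute access needs the rewritten name...
via_attribute = datasets.namespace__dataset
# ...while the exact name is only reachable through getattr.
via_getattr = getattr(datasets, "namespace.dataset")
```

Both lookups return the same object, but neither reads like the name the user wrote in the catalog.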

We propose to:

  1. Explore the feasibility of modifying the _FrozenDataset's API to use dots instead of double underscores for namespaces, aligning with users' expectations.
  2. Let users retrieve datasets by their exact names, for example through a get-dataset-by-name function.
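One possible shape for the second point is dictionary-style lookup by exact name. The sketch below is only an illustration under stated assumptions: `CatalogSketch`, its `get()` method, and the `__getitem__` behaviour are hypothetical names chosen for this example and do not describe an existing or agreed Kedro API.

```python
class CatalogSketch:
    """Hypothetical catalog that resolves datasets by their exact names."""

    def __init__(self, datasets):
        # Store datasets under the names used in the catalog, unchanged.
        self._datasets = dict(datasets)

    def __getitem__(self, name):
        # Dictionary-style access: catalog["namespace.dataset"].
        return self._datasets[name]

    def __contains__(self, name):
        return name in self._datasets

    def get(self, name, default=None):
        # Public "get dataset by name" helper; returns default if the name is unknown.
        return self._datasets.get(name, default)


catalog = CatalogSketch({"namespace.dataset": "<dataset object>"})

by_item = catalog["namespace.dataset"]
by_get = catalog.get("namespace.dataset")
missing = catalog.get("other.dataset")
```

With this shape, dots and other special characters in dataset names need no rewriting, and callers no longer have a reason to reach for a private method.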

Relates to https://github.com/kedro-org/kedro/issues/3926

Context

User feedback:

  • C1 team finds the replacement of characters like “.” or “@” with “__” in dataset names to be unclean and prefers calling datasets by their exact names.
  • "There's a problem with names space data sets, but that's not the end of the world because you can do getattr(catalog.datasets, "namespace.dataset"). So it's still possible but it's just like horribly ugly. Also, it just doesn't feel natural, to me, the catalog feels like it should operate as a dictionary. And therefore access should be by get_item rather than get_attribute. "
  • "The use of double underscores instead of dots for namespaces in the catalog might be unintuitive for users, making the functionality feel awkward and overly complex. This could lead to difficulties in navigating and using namespaces effectively."

ElenaKhaustova, Jun 05 '24 12:06