Improve error message when `_` convention is not used in `catalog.yml`

Open noklam opened this issue 1 year ago • 2 comments

Description

https://linen-slack.kedro.org/t/16334283/hi-all-i-m-looking-into-kedro-0-19-2-and-i-created-a-project#84a17399-4031-4976-93d4-11c71d5a465a

The error looks like this: `AttributeError: 'str' object has no attribute 'items'`

William Caicedo 1 day ago Hi all, I’m looking into Kedro 0.19.2 and I created a project with Spark support. I have just one spark.SparkDataset in my catalog and I’m getting this error:

File "/opt/conda/envs/personas/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 83, in _resolve_credentials
    return {k: _map_value(k, v) for k, v in config.items()}
AttributeError: 'str' object has no attribute 'items'

I normally don’t use credentials but in the past this wasn’t an issue. Any ideas what I’m doing wrong?

Context

When using `OmegaConfigLoader`, users need to follow the underscore convention (`${_xxxx}`) for template variables in the catalog. Forgetting the underscore produces an obscure error, which is particularly confusing for users upgrading from an older version of Kedro.

Steps to Reproduce

  1. Run `kedro new` to create an empty project.
  2. Set `catalog.yml` as follows:
dataset1:
  type: spark.SparkDataset
  metadata: ${exhibitor}
  filepath: ""

dataset2:
  type: spark.SparkDataset
  metadata: ${exhibitor}
  filepath: ""

exhibitor: abc
  3. Run `ipython`, then `%load_ext kedro.ipython`. It should raise an error when the catalog is initialised.
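
To illustrate why the missing underscore matters, here is a minimal Python sketch of the catalog above after loading. `resolve_catalog` is a hypothetical, simplified stand-in for a templating config loader (it is not Kedro's or OmegaConf's actual code). It assumes that top-level keys starting with `_` are treated as template variables and dropped after substitution, as the underscore convention implies.

```python
import re

# Hypothetical, simplified stand-in for a templating config loader.
# This is NOT Kedro's or OmegaConf's actual implementation.
_REFERENCE = re.compile(r"\$\{(\w+)\}")


def resolve_catalog(raw: dict) -> dict:
    """Substitute ${name} with top-level values, then drop `_`-prefixed keys."""

    def substitute(value):
        if isinstance(value, str):
            return _REFERENCE.sub(lambda m: str(raw[m.group(1)]), value)
        if isinstance(value, dict):
            return {k: substitute(v) for k, v in value.items()}
        return value

    resolved = {k: substitute(v) for k, v in raw.items()}
    # Assumption: only keys starting with "_" count as template variables.
    return {k: v for k, v in resolved.items() if not k.startswith("_")}


without_prefix = {
    "dataset1": {"type": "spark.SparkDataset", "metadata": "${exhibitor}", "filepath": ""},
    "exhibitor": "abc",
}
with_prefix = {
    "dataset1": {"type": "spark.SparkDataset", "metadata": "${_exhibitor}", "filepath": ""},
    "_exhibitor": "abc",
}

leftover = resolve_catalog(without_prefix)  # "exhibitor" survives as a plain string
clean = resolve_catalog(with_prefix)        # only "dataset1" remains
```

In the first case the string `"abc"` is left behind as if it were a dataset entry, which is what later code trips over when it expects a dict.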

noklam, Jan 25 '24 15:01

For context, there are two situations:

  • Entries in the catalog that are not dicts, e.g.:
my_var: "whatever"

It fails with this unhelpful error message because we assume the catalog entries are dicts.

We could check for this at the config loader stage and/or display a more helpful error message.

  • Catalog entries that are dicts but are missing `type`, e.g.:
my_var:
  name: "whatever"

Here the error message is somewhat more informative, but could still be improved:

kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration
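
The first situation can be reproduced outside Kedro with a small sketch. `resolve_entry` is a hypothetical stand-in that only mimics the `config.items()` call seen in the traceback, and `non_dict_entries` is one possible early check, not an existing Kedro function.

```python
def resolve_entry(config):
    # Mimics the failing line from the traceback: it assumes `config`
    # is a dict and calls .items() on it without checking.
    return {key: value for key, value in config.items()}


def non_dict_entries(catalog: dict) -> list:
    """Hypothetical pre-check: names of catalog entries that are not dicts."""
    return [name for name, config in catalog.items() if not isinstance(config, dict)]


catalog = {"my_var": "whatever"}

message = None
try:
    for name, config in catalog.items():
        resolve_entry(config)
except AttributeError as exc:
    # The error names neither the entry nor the likely cause.
    message = str(exc)

suspects = non_dict_entries(catalog)
```

Running a check like `non_dict_entries` before the entries are resolved would make it possible to name the offending key in the error.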

ankatiyar, Jan 29 '24 12:01

Currently mentioned in the catalog docs: https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog

Ideally, the traceback in both cases from https://github.com/kedro-org/kedro/issues/3555#issuecomment-1914631744 should be:

kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration.
Did you mean to define a template variable? If so, prefix it with `_` as explained in https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog

However, it was raised that template values are a more advanced configuration, and also that performing this validation might be tricky.
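
As a rough sketch of what such a check could look like, the snippet below raises the same message for both cases. `DatasetConfigError`, `validate_entry` and `error_for` are hypothetical names used for illustration only. They are not part of Kedro, and the real validation would need to handle more cases (dataset factories, for example).

```python
from typing import Optional


class DatasetConfigError(Exception):
    """Hypothetical stand-in for kedro.io.core.DatasetError."""


HINT = "Did you mean to define a template variable? If so, prefix it with `_`."


def validate_entry(name: str, config) -> None:
    """Raise one consistent error for non-dict entries and dicts without 'type'."""
    if not isinstance(config, dict) or "type" not in config:
        raise DatasetConfigError(
            f"An exception occurred when parsing config for dataset '{name}':\n"
            f"'type' is missing from dataset catalog configuration.\n"
            f"{HINT}"
        )


def error_for(name: str, config) -> Optional[str]:
    """Return the error text for an entry, or None if the entry passes."""
    try:
        validate_entry(name, config)
    except DatasetConfigError as exc:
        return str(exc)
    return None


string_case = error_for("my_var", "whatever")          # entry is not a dict
dict_case = error_for("my_var", {"name": "whatever"})  # dict without 'type'
valid_case = error_for("dataset1", {"type": "spark.SparkDataset", "filepath": ""})
```

Both failing cases produce identical text, so a user who forgot the `_` prefix sees the hint regardless of the shape of the value.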

astrojuanlu, Jan 29 '24 14:01

Close this in favour of https://github.com/kedro-org/kedro/issues/3910?

ankatiyar, Jun 10 '24 14:06