Improve error message when `_` convention is not used in `catalog.yml`
Description
https://linen-slack.kedro.org/t/16334283/hi-all-i-m-looking-into-kedro-0-19-2-and-i-created-a-project#84a17399-4031-4976-93d4-11c71d5a465a
The error looks like this: `AttributeError: 'str' object has no attribute 'items'`
William Caicedo 1 day ago Hi all, I’m looking into Kedro 0.19.2 and I created a project with Spark support. I have just one spark.SparkDataset in my catalog and I’m getting this error:
    File "/opt/conda/envs/personas/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 83, in _resolve_credentials
        return {k: _map_value(k, v) for k, v in config.items()}
    AttributeError: 'str' object has no attribute 'items'
I normally don’t use credentials but in the past this wasn’t an issue. Any ideas what I’m doing wrong?
Context
When using `OmegaConfigLoader`, users need to follow the underscore convention for template variables in the catalog: the variable name must start with `_` (for example `_xxxx`, referenced as `${_xxxx}`). Forgetting to do so produces an obscure error, which is confusing, particularly for users upgrading from an older version of Kedro.
Steps to Reproduce
- `kedro new` with an empty project
- `catalog.yml` as follows:
    dataset1:
      type: spark.SparkDataset
      metadata: ${exhibitor}
      filepath: ""
    dataset2:
      type: spark.SparkDataset
      metadata: ${exhibitor}
      filepath: ""
    exhibitor: abc
- `ipython`, then `%load_ext kedro.ipython`. It should raise an error when the catalog is initialised.
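The mechanism behind the error can be illustrated without Kedro. The following is a minimal sketch, not Kedro's actual code: `resolve_entry_sketch` is a hypothetical stand-in for the dict comprehension shown in the traceback, and the `catalog` dict is an assumed shape of the loaded config (with `${exhibitor}` already resolved to `abc`). The top-level `exhibitor` entry is a plain string, so calling `.items()` on it fails.

```python
def resolve_entry_sketch(config):
    # Hypothetical stand-in for the comprehension in the traceback:
    # it assumes every catalog entry is a dict.
    return {key: value for key, value in config.items()}


# Assumed shape of the loaded catalog. `exhibitor` has no `_` prefix,
# so it sits next to the dataset definitions as a plain string entry.
catalog = {
    "dataset1": {"type": "spark.SparkDataset", "metadata": "abc", "filepath": ""},
    "dataset2": {"type": "spark.SparkDataset", "metadata": "abc", "filepath": ""},
    "exhibitor": "abc",
}


def try_resolve(catalog):
    """Collect the entries whose resolution fails, keyed by entry name."""
    errors = {}
    for name, entry in catalog.items():
        try:
            resolve_entry_sketch(entry)
        except AttributeError as exc:
            errors[name] = str(exc)
    return errors


errors = try_resolve(catalog)
print(errors)
```

Only the `exhibitor` entry fails here; the two dataset entries resolve without error, which is why the message gives no hint about which key is at fault.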
Expected Result
Actual Result
Your Environment
- Kedro version used (`pip show kedro` or `kedro -V`):
- Python version used (`python -V`):
- Operating system and version:
For context, there are two situations:
- Entries in the catalog that are not `dict`, e.g.:
    my_var: "whatever"
It fails with this unhelpful error message because we assume the catalog entries are dicts.
We could check for this at the config loader stage and/or display a more helpful error message.
- Catalog entries that are `dict`:
    my_var:
      name: "whatever"
In this case the error message is a bit more informative but could still be improved:
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration
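For the first situation (a catalog entry that is not a `dict`), a check at the config loader stage could look roughly like the sketch below. The function names `find_non_dict_entries` and `validate_catalog_sketch` are hypothetical and not part of Kedro, and the use of `ValueError` is an assumption made only to keep the example self-contained.

```python
def find_non_dict_entries(catalog):
    """Return names of top-level entries that are neither dicts nor `_`-prefixed."""
    return [
        name
        for name, entry in catalog.items()
        if not isinstance(entry, dict) and not name.startswith("_")
    ]


def validate_catalog_sketch(catalog):
    """Raise a readable error before any code assumes the entries are dicts."""
    bad = find_non_dict_entries(catalog)
    if bad:
        raise ValueError(
            f"Catalog entries {bad} are not dataset definitions. "
            "Did you mean to define a template variable? "
            "If so, prefix it with `_`."
        )


example = {
    "my_var": "whatever",            # plain string, missing the `_` prefix
    "_my_template": "whatever",      # follows the `_` convention, so it is skipped
    "my_dataset": {"type": "spark.SparkDataset", "filepath": ""},
}
print(find_non_dict_entries(example))
```

Naming the offending keys in the message is the main point of the sketch: the original `AttributeError` does not say which entry caused it.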
This is currently mentioned in the catalog docs: https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog
Ideally, the error in both cases described in https://github.com/kedro-org/kedro/issues/3555#issuecomment-1914631744 should be:
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'my_var':
'type' is missing from dataset catalog configuration.
Did you mean to define a template variable? If so, prefix it with `_` as explained in https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#catalog
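A sketch of how a single check could produce this message for both situations is shown below. It is an illustration under stated assumptions only: `DatasetConfigError` is a hypothetical stand-in for `kedro.io.core.DatasetError`, and `check_dataset_config` is not an existing Kedro function.

```python
DOCS_URL = (
    "https://docs.kedro.org/en/latest/configuration/"
    "advanced_configuration.html#catalog"
)


class DatasetConfigError(Exception):
    """Hypothetical stand-in for kedro.io.core.DatasetError."""


def check_dataset_config(name, config):
    """Return the config unchanged, or raise the proposed error with a hint."""
    # Covers both situations: a non-dict entry and a dict without `type`.
    if not isinstance(config, dict) or "type" not in config:
        raise DatasetConfigError(
            f"An exception occurred when parsing config for dataset '{name}':\n"
            "'type' is missing from dataset catalog configuration.\n"
            "Did you mean to define a template variable? "
            f"If so, prefix it with `_` as explained in {DOCS_URL}"
        )
    return config


def error_message_for(name, config):
    """Helper for the example: return the error text, or None if config is valid."""
    try:
        check_dataset_config(name, config)
    except DatasetConfigError as exc:
        return str(exc)
    return None


print(error_message_for("my_var", "whatever"))
print(error_message_for("my_var", {"name": "whatever"}))
```

Treating the non-dict case the same way as the missing-`type` case keeps the message uniform, at the cost of reporting "'type' is missing" for an entry that is not a mapping at all.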
However, concerns were raised that template values are a more advanced configuration feature and that performing this validation might be tricky.
Close this in favour of https://github.com/kedro-org/kedro/issues/3910?