Artifact.from_dict recursively interprets fields that it shouldn't

Open jezekra1 opened this issue 1 year ago • 2 comments

Artifact.from_dict tries to convert every nested dictionary into an artifact, even when the dictionary is just a mapping of dataset column names.

As a result, a dataset cannot have a column named type:

import tempfile

import pandas as pd
from unitxt.artifact import Artifact

with tempfile.TemporaryDirectory() as tmpdir:
    filename = f"{tmpdir}/bug.csv"
    df = pd.DataFrame([{"type": f"test_{i}", "row": i} for i in range(10)])
    df.to_csv(filename, index=False)
    loader = Artifact.from_dict(
        {
            "type": "sequential_recipe",
            "steps": [
                {"type": "load_csv", "files": {"test": filename}},
                # The "type" key inside field_to_field is a column name, not an artifact type.
                {"type": "rename_fields", "field_to_field": {"row": "myrow", "type": "bug_found"}},
            ],
        }
    )
    print(list(loader()["test"]))

This will result in the following exception:

  File "/Users/radek/Library/Caches/pypoetry/virtualenvs/fmaas-eval-_3vZ4Wue-py3.11/lib/python3.11/site-packages/unitxt/artifact.py", line 214, in _recursive_load
    cls.verify_artifact_dict(obj)
  File "/Users/radek/Library/Caches/pypoetry/virtualenvs/fmaas-eval-_3vZ4Wue-py3.11/lib/python3.11/site-packages/unitxt/artifact.py", line 149, in verify_artifact_dict
    raise UnrecognizedArtifactTypeError(d["type"])
unitxt.artifact.UnrecognizedArtifactTypeError: 'bug_found' is not a recognized artifact 'type'. Make sure a the class defined this type (Probably called 'BugFound' or similar) is defined and/or imported anywhere in the code executed.

The code works when you remove "type": "bug_found" from field_to_field.

jezekra1 avatar Jun 17 '24 09:06 jezekra1

Yes. type is currently a reserved name. Maybe we need to change it to __type__ (@elronbandel)?

yoavkatz avatar Jun 17 '24 10:06 yoavkatz

I thought we had done that already. Of course we should. We can also make the recursion skip any dictionary that has no __type__ key.

elronbandel avatar Jun 17 '24 19:06 elronbandel
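
The guard suggested in the last comment can be sketched with a toy loader. This is not unitxt's actual implementation: REGISTRY, TYPE_KEY and load() are hypothetical names used only for illustration, and the sketch assumes the reserved key has been renamed to __type__.

```python
# Toy model of the recursion, NOT unitxt's real code.
# REGISTRY, TYPE_KEY and load() are hypothetical names for illustration.

# Stand-in for the artifact class registry: maps a type name to a constructor.
REGISTRY = {
    "rename_fields": lambda **kwargs: ("rename_fields", kwargs),
}

# Assumed reserved key, as proposed in the comments above.
TYPE_KEY = "__type__"


def load(obj):
    """Recursively build objects; only dicts carrying TYPE_KEY are artifacts."""
    if isinstance(obj, list):
        return [load(item) for item in obj]
    if isinstance(obj, dict):
        # Load nested values first, leaving the reserved key out of the arguments.
        children = {key: load(value) for key, value in obj.items() if key != TYPE_KEY}
        if TYPE_KEY in obj:
            type_name = obj[TYPE_KEY]
            if type_name not in REGISTRY:
                raise ValueError(f"{type_name!r} is not a recognized artifact type")
            return REGISTRY[type_name](**children)
        # No reserved key: a plain mapping such as field_to_field, kept as data.
        return children
    return obj


step = load(
    {
        "__type__": "rename_fields",
        "field_to_field": {"row": "myrow", "type": "bug_found"},
    }
)
print(step)
```

With this guard, the field_to_field mapping passes through unchanged even though it contains a "type" key. A column literally named __type__ would still collide, so the rename makes the clash far less likely without ruling it out.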