Add type checking for task definition.

Open yoavkatz opened this issue 1 year ago • 0 comments

Today, task fields have no types:

FormTask(
    inputs=["text", "text_type", "class"],
    outputs=["class", "label"],
    metrics=[
        "metrics.f1_micro_multi_label",
        "metrics.f1_macro_multi_label",
        "metrics.accuracy",
    ],
)

This makes it hard for users to know how to transform their dataset into the right format. For example, they can mistakenly pass a list where a string is expected and get odd errors, or even just unexpected behavior.

We want to add typing definitions:

FormTask(
    inputs={"text": "str", "text_type": "str", "class": "str"},
    outputs={"class": "str", "label": "List[str]"},
    prediction_type="str",
    metrics=[
        "metrics.f1_micro_multi_label",
        "metrics.f1_macro_multi_label",
        "metrics.accuracy",
    ],
)

The FormTask operator will then check that all the fields exist and are of the correct type. It will also check that the prediction_type of the task is compatible with the prediction_type of all the metrics (#667).
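
To make the intended field check concrete, here is a minimal sketch of how type strings such as "str" and "List[str]" could be validated against an instance. It is not the unitxt implementation: the helper names is_of_type and verify_fields are hypothetical, and only the handful of type strings shown here are supported.

```python
from typing import Any, Dict

# Hypothetical mapping from the type strings used in the proposal to Python types.
SIMPLE_TYPES = {"str": str, "int": int, "float": float}


def is_of_type(value: Any, type_string: str) -> bool:
    """Return True if value matches a type string like "str" or "List[str]"."""
    type_string = type_string.strip()
    if type_string in SIMPLE_TYPES:
        return isinstance(value, SIMPLE_TYPES[type_string])
    if type_string.startswith("List[") and type_string.endswith("]"):
        inner = type_string[len("List["):-1]
        # Every element of the list must match the inner type.
        return isinstance(value, list) and all(is_of_type(v, inner) for v in value)
    raise ValueError(f"Unsupported type string: {type_string!r}")


def verify_fields(instance: Dict[str, Any], schema: Dict[str, str]) -> None:
    """Raise ValueError if a field in schema is missing or has the wrong type."""
    for name, type_string in schema.items():
        if name not in instance:
            raise ValueError(f"Missing field {name!r}")
        if not is_of_type(instance[name], type_string):
            raise ValueError(
                f"Field {name!r} should be of type {type_string}, "
                f"got {type(instance[name]).__name__}"
            )


# The outputs schema from the proposal above.
outputs_schema = {"class": "str", "label": "List[str]"}
verify_fields({"class": "news", "label": ["sports", "politics"]}, outputs_schema)
```

With a check like this, passing a string for "label" fails immediately with a message that names the field and the expected type, instead of surfacing later as an unrelated error.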

yoavkatz, Mar 20 '24 12:03