unitxt
Add type checking for task definition.
Today, task fields have no types:
FormTask(
    inputs=["text", "text_type", "class"],
    outputs=["class", "label"],
    metrics=[
        "metrics.f1_micro_multi_label",
        "metrics.f1_macro_multi_label",
        "metrics.accuracy",
    ],
)
This makes it hard for users to know how to transform their dataset into the right format. For example, they can mistakenly pass a list where a string is expected and get odd errors, or just unexpected behavior with no error at all.
We want to add typing definitions:
FormTask(
    inputs={"text": "str", "text_type": "str", "class": "str"},
    outputs={"class": "str", "label": "List[str]"},
    prediction_type="str",
    metrics=[
        "metrics.f1_micro_multi_label",
        "metrics.f1_macro_multi_label",
        "metrics.accuracy",
    ],
)
With these definitions, the FormTask operator will check that all the declared fields exist and are of the correct type. It will also check that the prediction_type of the task is compatible with the prediction_type of each of its metrics (#667).
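To make the proposed field check concrete, here is a minimal standalone sketch. It is not the unitxt implementation: the helper names is_of_type and check_fields are hypothetical, and it assumes only the type strings "str", "int", "float" and "List[...]" need to be supported.

```python
from typing import Any, Dict

# Hypothetical helpers for illustration only; not part of the unitxt API.
_BASIC_TYPES = {"str": str, "int": int, "float": float}


def is_of_type(value: Any, type_string: str) -> bool:
    """Return True if value matches one of the supported type strings."""
    if type_string in _BASIC_TYPES:
        return isinstance(value, _BASIC_TYPES[type_string])
    if type_string.startswith("List[") and type_string.endswith("]"):
        inner = type_string[len("List["):-1]
        # Every element of the list must match the inner type.
        return isinstance(value, list) and all(
            is_of_type(item, inner) for item in value
        )
    raise ValueError(f"Unsupported type string: {type_string}")


def check_fields(instance: Dict[str, Any], schema: Dict[str, str]) -> None:
    """Raise ValueError if a declared field is missing or has the wrong type."""
    for name, type_string in schema.items():
        if name not in instance:
            raise ValueError(f"Missing field '{name}'")
        if not is_of_type(instance[name], type_string):
            raise ValueError(
                f"Field '{name}' should be of type {type_string}, "
                f"got {type(instance[name]).__name__}"
            )


outputs_schema = {"class": "str", "label": "List[str]"}

# Passes silently: both fields exist and have the declared types.
check_fields({"class": "news", "label": ["sports", "politics"]}, outputs_schema)

# A string passed where a list is expected is reported explicitly.
try:
    check_fields({"class": "news", "label": "sports"}, outputs_schema)
except ValueError as error:
    print(error)
```

With a check like this, the list-versus-string mistake described above is reported with the field name and the expected type instead of producing odd behavior further down the pipeline.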