
Validate the labeling config being sent to the Labeling Agent

Open rajasbansal opened this issue 1 year ago • 2 comments

Currently, the labeling config sent to the Labeling Agent can contain typos in its keys. A misspelled key is silently ignored and the default value is used instead, without informing the user.

We should validate every key passed in the config and check that it is a recognized key. Any unrecognized or unused keys should be flagged by the Labeling Agent.

For example, if there is a typo in explanation_column and we write "expanatn_colums" instead, we should get an error from the Labeling Agent saying the key was not found.
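A minimal sketch of the requested check, using only the standard library. The key list and the function names `find_unknown_keys` and `validate_keys` are hypothetical and only illustrate the idea; apart from explanation_column, the keys are placeholders and not the actual autolabel config schema.

```python
from difflib import get_close_matches

# Hypothetical set of accepted keys, for illustration only.
# The real list would come from the autolabel config classes.
KNOWN_KEYS = {"task_name", "task_type", "explanation_column", "label_column"}


def find_unknown_keys(config, known_keys=KNOWN_KEYS):
    """Map each unrecognized key to its closest known key (or None)."""
    unknown = {}
    for key in config:
        if key not in known_keys:
            # Suggest the most similar known key, if any is close enough.
            matches = get_close_matches(key, sorted(known_keys), n=1)
            unknown[key] = matches[0] if matches else None
    return unknown


def validate_keys(config, known_keys=KNOWN_KEYS):
    """Raise KeyError listing every unrecognized key instead of ignoring it."""
    unknown = find_unknown_keys(config, known_keys)
    if unknown:
        details = ", ".join(
            f"{key!r} (did you mean {hint!r}?)" if hint else repr(key)
            for key, hint in unknown.items()
        )
        raise KeyError(f"Unknown config keys: {details}")
```

Calling `validate_keys({"expanatn_colums": "explanation"})` would raise instead of silently falling back to the default, and the error message can point at the likely intended key.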

rajasbansal avatar Jun 02 '23 22:06 rajasbansal

@rajasbansal Do you have a schema validation tool in mind, or would Pydantic work?

Sardhendu avatar Jun 21 '23 23:06 Sardhendu

@nihit - any thoughts on a schema validation library to use? @Sardhendu is interested in taking on this issue.

rishabh-bhargava avatar Jun 22 '23 17:06 rishabh-bhargava

Either a jsonschema validator or another tool works. I like Pydantic because it's Pythonic and because of its data model syntax. Additionally, pre- and post-init checks are good to have.

Others I found are Cerberus and Voluptuous, but I have never used them.

Sardhendu avatar Jun 22 '23 18:06 Sardhendu

Hey @Sardhendu, appreciate you taking this on! This is super important for improving the usability and reliability of the library.

Currently the base config class has a dummy _validate function here: https://github.com/refuel-ai/autolabel/blob/main/src/autolabel/configs/base.py#L30, which we'll need to override in the downstream class (https://github.com/refuel-ai/autolabel/blob/main/src/autolabel/configs/config.py).

We can use jsonschema validation for this - define the expected schema for the config, and use something like https://python-jsonschema.readthedocs.io/en/latest/validate/ for validation at runtime when the config object is passed in?
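As a rough sketch of what that could look like with the jsonschema package: the schema below is hypothetical (the real config has more fields), and `validate_config` is an illustrative helper, not an existing autolabel function. Setting `"additionalProperties": False` is what makes misspelled keys fail validation.

```python
from typing import List

from jsonschema import Draft7Validator

# Hypothetical schema for illustration; not the actual autolabel config schema.
CONFIG_SCHEMA = {
    "type": "object",
    "properties": {
        "task_name": {"type": "string"},
        "explanation_column": {"type": "string"},
    },
    "required": ["task_name"],
    # Reject any key that is not declared under "properties".
    "additionalProperties": False,
}


def validate_config(config: dict) -> List[str]:
    """Return all schema violations as messages (empty list if valid)."""
    validator = Draft7Validator(CONFIG_SCHEMA)
    return [error.message for error in validator.iter_errors(config)]
```

An overridden _validate could call a helper like this and raise if the returned list is non-empty. Collecting every error with `iter_errors` rather than stopping at the first one lets the user fix all typos in one pass.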

nihit avatar Jun 22 '23 22:06 nihit

@nihit Sure, let me take a look.

Sardhendu avatar Jun 23 '23 00:06 Sardhendu