datumaro
Support attribute specifications for annotations
Currently, all annotations can have arbitrary attributes, but there are no specifications for these attributes. It would be great if Datumaro provided:
- A list of possible attribute values (at least, existing in a dataset)
- A type for each attribute (categorical or continuous/numerical)
- A data type for each attribute
Most dataset formats do not include such information, however. As a first step, we could:
- consider all attributes categorical by default, and try to recognize when an attribute is numerical
- infer value ranges from values in input data
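
To make the proposed first step concrete, here is a minimal sketch of how such inference might look. It is not Datumaro code: the function name `infer_attribute_specs`, the spec keys (`kind`, `dtype`, `range`, `values`), and the use of plain dicts as a stand-in for annotation attributes are all assumptions made for illustration. It also assumes attribute values are hashable.

```python
from collections import defaultdict


def infer_attribute_specs(annotations):
    """Infer a simple spec per attribute name from observed values.

    `annotations` is an iterable of dicts mapping attribute names to values
    (a hypothetical stand-in for the attributes of real annotations).
    """
    # Collect every observed value for each attribute name.
    observed = defaultdict(list)
    for attributes in annotations:
        for name, value in attributes.items():
            observed[name].append(value)

    specs = {}
    for name, values in observed.items():
        # Treat an attribute as numerical only if every value is int or float.
        # bool is a subclass of int in Python, so it is excluded explicitly.
        is_numerical = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numerical:
            has_float = any(isinstance(v, float) for v in values)
            specs[name] = {
                "kind": "numerical",
                "dtype": "float" if has_float else "int",
                "range": (min(values), max(values)),
            }
        else:
            # Fall back to categorical and list the distinct values seen.
            type_names = {type(v).__name__ for v in values}
            specs[name] = {
                "kind": "categorical",
                "dtype": type_names.pop() if len(type_names) == 1 else "mixed",
                # Sort by repr so mixed-type values still have a stable order.
                "values": sorted(set(values), key=repr),
            }
    return specs
```

With this approach, an attribute such as `score` with values `0.9` and `0.4` would be reported as numerical with range `(0.4, 0.9)`, while `occluded` with boolean values would stay categorical. An integer-coded category (for example, a class index stored as an attribute) would be misclassified as numerical, so a real implementation would probably need a way for users to override the inferred spec.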