datumaro
Support metainfo in datasets
Datasets can have metainfo that relates to the whole dataset or to its parts, such as specific annotations or categories. It is not annotation data in the usual sense. Currently, some metainfo can be stored in the `attributes` of `DatasetItem`s and `Annotation`s, but this doesn't cover other cases.
Examples:
- COCO (the "info" section)
- VOC, LabelMe (some fields in an item's XML)
- CVAT, Supervisely project (task and project info)
Additional questions:
- If multiple extractors in the dataset provide metainfo, how should it be merged?
- Should attributes and metainfo be split into different fields?
Proposed implementation:
- Add a `meta` field in `IExtractor` / `IDataset`
- `meta` is a dictionary with string keys and number / string / dict values (i.e. it can be recursive)
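As a rough sketch of what such a `meta` structure could look like, assuming plain Python dictionaries: the names `MetaValue`, `Meta`, `validate_meta` and `merge_meta` below are hypothetical and not part of Datumaro's API. The merge function shows only one possible answer to the merging question above (recursive merge, with an error on conflicting values); other policies are equally plausible.

```python
from typing import Dict, Union

# Hypothetical type aliases: string keys, values are numbers, strings
# or nested dictionaries of the same shape.
MetaValue = Union[int, float, str, Dict[str, "MetaValue"]]
Meta = Dict[str, MetaValue]


def validate_meta(meta, path="meta"):
    """Check that `meta` has string keys and number / string / dict values."""
    if not isinstance(meta, dict):
        raise TypeError(f"{path}: expected a dict, got {type(meta).__name__}")
    for key, value in meta.items():
        if not isinstance(key, str):
            raise TypeError(f"{path}: key {key!r} is not a string")
        if isinstance(value, dict):
            # Nested dictionaries are validated recursively
            validate_meta(value, f"{path}.{key}")
        elif not isinstance(value, (int, float, str)):
            raise TypeError(
                f"{path}.{key}: unsupported type {type(value).__name__}")
    return meta


def merge_meta(a, b):
    """Merge two meta dicts recursively (an assumed policy, not a decision).

    Keys present in only one input are kept as-is, nested dicts are merged,
    and differing non-dict values for the same key raise an error.
    """
    merged = dict(a)
    for key, value in b.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_meta(merged[key], value)
        elif merged[key] != value:
            raise ValueError(f"conflicting values for key {key!r}")
    return merged


# Example: metainfo from two hypothetical extractors
coco_meta = validate_meta({"info": {"year": 2017, "version": "1.0"}})
cvat_meta = validate_meta({"task": {"name": "demo"}, "info": {"year": 2017}})
dataset_meta = merge_meta(coco_meta, cvat_meta)
```

Raising on conflicts keeps the merge predictable; a "last extractor wins" or namespaced-per-extractor policy would be an alternative worth discussing.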