[Feature] Let ConcatDataset use dataset names rather than dataset_idx as the prefix of metrics for multiple validation datasets
Describe the feature
Currently, ConcatDataset uses dataset_idx as the prefix for each metric name, so the metric names will be like 0_accuracy, 1_accuracy, etc.
https://github.com/open-mmlab/mmclassification/blob/90254a845540650bd03bc2b6108251825935c687/mmcls/datasets/dataset_wrappers.py#L97
It would be great if we could assign dataset names, so that the metrics are reported as dataset_A_accuracy, dataset_B_accuracy, etc.
dataset_A_val = dict()
dataset_B_val = dict()

# Current interface
data = dict(
    val=dict(
        type="ConcatDataset",
        datasets=[dataset_A_val, dataset_B_val],
    ),
)

# Suggested interface 1: dataset names as keys of dict
data = dict(
    val=dict(
        type="ConcatDataset",
        datasets=dict(dataset_A=dataset_A_val, dataset_B=dataset_B_val),
    ),
)

# Suggested interface 2: dataset names in the first element of tuple in list
data = dict(
    val=dict(
        type="ConcatDataset",
        datasets=[("dataset_A", dataset_A_val), ("dataset_B", dataset_B_val)],
    ),
)

# Suggested interface 3: additional list for dataset names
data = dict(
    val=dict(
        type="ConcatDataset",
        datasets=[dataset_A_val, dataset_B_val],
        dataset_names=["dataset_A", "dataset_B"],
    ),
)
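To make the request more concrete, here is a rough, standalone sketch of how the three suggested interfaces could be normalized and how the names could then be used as metric prefixes. It is not the mmcls implementation: the helpers split_names_and_configs and prefix_metrics are hypothetical names made up for this illustration, and the sketch assumes each sub-dataset's evaluation returns a plain dict of metric name to value.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def split_names_and_configs(
    datasets: Any,
    dataset_names: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[Any]]:
    """Hypothetical helper: accept any of the suggested interfaces and
    return (names, configs). Falls back to index strings ("0", "1", ...)
    when no names are given, which matches the current prefixes."""
    if isinstance(datasets, dict):
        # Interface 1: dataset names as keys of a dict.
        names = list(datasets.keys())
        configs = list(datasets.values())
    elif datasets and all(
        isinstance(d, tuple) and len(d) == 2 and isinstance(d[0], str)
        for d in datasets
    ):
        # Interface 2: list of (name, config) tuples.
        names = [name for name, _ in datasets]
        configs = [cfg for _, cfg in datasets]
    else:
        # Interface 3 (separate dataset_names list) or the current interface.
        configs = list(datasets)
        if dataset_names is None:
            names = [str(i) for i in range(len(configs))]
        else:
            names = list(dataset_names)

    if len(names) != len(configs):
        raise ValueError("dataset_names must have one entry per dataset")
    if len(set(names)) != len(names):
        raise ValueError("dataset names must be unique")
    return names, configs


def prefix_metrics(
    names: Sequence[str],
    per_dataset_results: Sequence[Dict[str, float]],
) -> Dict[str, float]:
    """Hypothetical helper: merge per-dataset metric dicts into one dict,
    prefixing every metric with its dataset name."""
    merged: Dict[str, float] = {}
    for name, results in zip(names, per_dataset_results):
        for metric, value in results.items():
            merged[f"{name}_{metric}"] = value
    return merged


# Example with placeholder metric values (not real results).
names, _ = split_names_and_configs(
    [("dataset_A", dict()), ("dataset_B", dict())]
)
metrics = prefix_metrics(names, [{"accuracy": 0.9}, {"accuracy": 0.8}])
```

With something like this, all three interfaces would end up with the same metric keys, and omitting the names would keep today's 0_accuracy, 1_accuracy behaviour, so existing configs would not break.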
Motivation
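Index-based metric names such as 0_accuracy are not intuitive when we check the metrics later, especially when we show the performance to stakeholders, because nothing in the name says which dataset it refers to.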
Additional context
ConcatDataset is not mentioned in the MMClassification documentation, but it is explained in the MMDetection documentation at: https://mmdetection.readthedocs.io/en/latest/tutorials/customize_dataset.html#concatenate-dataset
Thank you for your suggestion; it is a good idea. We have put this feature on our schedule.
We have added this feature to our roadmap and will support it in 2023. This issue will be closed as it is inactive. Feel free to re-open it if necessary.