Inconsistency issue with single_cls functionality and dataset class count

Open • Le0v1n opened this issue 9 months ago • 1 comment

Search before asking

  • [X] I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

I found a small issue related to single_cls whose purpose I don't fully understand.

In train.py, there is the following statement:

names = {0: "item"} if single_cls and len(data_dict["names"]) != 1 else data_dict["names"]  # class names

This statement can be broken down into:

if single_cls and len(data_dict["names"]) != 1:  # The user has enabled --single_cls, but the dataset configuration file has more than one class
    names = {0: "item"}
else:  # The user has not enabled --single_cls or len(data_dict["names"]) == 1
    names = data_dict["names"]

Here, single_cls indicates that the task has only one class, data_dict["names"] holds the class names defined in the dataset configuration file, and len() applied to a dictionary returns its number of keys.
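To make the three possible outcomes concrete, the conditional can be reproduced outside train.py. This is only a minimal sketch: the helper resolve_names and the sample dictionaries are made up for illustration and are not part of YOLOv5.

```python
def resolve_names(single_cls, data_dict):
    """Hypothetical helper mirroring the one-line conditional from train.py."""
    if single_cls and len(data_dict["names"]) != 1:
        # --single_cls is on, but the dataset config does not list exactly one class
        return {0: "item"}
    # --single_cls is off, or the dataset config already lists exactly one class
    return data_dict["names"]


# Illustrative dataset configs (assumed, not taken from a real YAML file)
multi = {"names": {0: "person", 1: "car"}}
single = {"names": {0: "person"}}

print(resolve_names(True, multi))    # {0: 'item'}
print(resolve_names(True, single))   # {0: 'person'}
print(resolve_names(False, multi))   # {0: 'person', 1: 'car'}
```

Only the first call replaces the class names. With a single-class dataset, the original name is kept even when --single_cls is enabled.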

I don't understand why len(data_dict["names"]) != 1 is checked. In the current code, names = {0: "item"} is assigned in only one case: when --single_cls is enabled and the dataset configuration file defines multiple classes. Is this case too rare? Suppose the dataset is MS COCO, which has 80 classes; after enabling --single_cls, only one class remains. Will the model still train and run inference normally in this case?

Also, I suggest adding a warning to avoid misuse by users:

if single_cls and len(data_dict["names"]) != 1:
    LOGGER.warning("WARNING ⚠️ Please check the dataset to ensure that when --single_cls is enabled, the number of classes in the dataset is 1.")
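The suggested check can also be tried in isolation. The sketch below uses the standard logging module as a stand-in for the project's LOGGER. The function name warn_if_class_count_mismatch and the message wording are assumptions for illustration, not existing YOLOv5 code.

```python
import logging

# Stand-in for the project's LOGGER (assumption: any standard logging.Logger works here)
LOGGER = logging.getLogger("single_cls_check")


def warn_if_class_count_mismatch(single_cls, names):
    """Log a warning and return True when --single_cls is set but names has != 1 entries."""
    mismatch = bool(single_cls) and len(names) != 1
    if mismatch:
        LOGGER.warning(
            "--single_cls is enabled but the dataset config lists %d classes; expected 1.",
            len(names),
        )
    return mismatch


# Example: a two-class config with --single_cls enabled triggers the warning
warn_if_class_count_mismatch(True, {0: "person", 1: "car"})
```

Returning the boolean keeps the check easy to test, and the caller can still decide whether to continue training or stop.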

Additional

No response

Le0v1n, May 20 '24 08:05