rf-detr
train_from_config initializes one fewer class_embed neuron than needed when training on a custom 1-indexed dataset.
Search before asking
- [x] I have searched the RF-DETR issues and found no similar bug report.
Bug
Similar to https://github.com/roboflow/rf-detr/issues/51
When trying to train on a 1-indexed COCO dataset, training fails.
train_from_config() has a bug.
The self.model.reinitialize_detection_head(num_classes) call should have been self.model.reinitialize_detection_head(num_classes + 1) instead.
I think this stems from the fact that the name num_classes is quite misleading.
https://github.com/roboflow/rf-detr/blob/24ce179cb5d71d9049724c9f3bc25b506d9f42a4/rfdetr/models/lwdetr.py#L597
My references: the original DETR's intent behind num_classes + 1 is already explained very well here:
https://github.com/facebookresearch/detr/issues/108#issuecomment-674854977
https://github.com/facebookresearch/detr/issues/108#issuecomment-650269223
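For context, here is a minimal sketch (my own illustration with toy values, not library code) of why DETR-style models size the classification head as num_classes + 1: the extra logit leaves room for the "no object" class assigned to unmatched queries, so the head must cover every real class id plus that slot.

```python
import torch
import torch.nn as nn

# Toy values for illustration; hidden_dim and the query count are arbitrary here.
hidden_dim = 256
num_classes = 4                                       # highest class id the dataset can produce
class_embed = nn.Linear(hidden_dim, num_classes + 1)  # +1 leaves a slot for "no object"

queries = torch.randn(2, 300, hidden_dim)             # (batch, num_queries, hidden_dim)
logits = class_embed(queries)
print(logits.shape)                                   # torch.Size([2, 300, 5])
```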
Proposed Solution: The following solution works for both 0-indexed and 1-indexed COCO datasets. It even works with the original COCO dataset, which has "holes" in its label ids.
```python
def train_from_config(self, config: TrainConfig, **kwargs):
    with open(
        os.path.join(config.dataset_dir, "train", "_annotations.coco.json"), "r"
    ) as f:
        anns = json.load(f)
    # num_classes = len(anns["categories"])
    num_classes = max([category["id"] for category in anns["categories"]])
    class_names = [c["name"] for c in anns["categories"] if c["supercategory"] != "none"]
    self.model.class_names = class_names
    if self.model_config.num_classes != num_classes:
        logger.warning(
            f"num_classes mismatch: model has {self.model_config.num_classes} classes, but your dataset has {num_classes} classes\n"
            f"reinitializing your detection head with {num_classes} classes."
        )
        self.model.reinitialize_detection_head(num_classes + 1)
```
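As a quick illustration of why max(id) is used instead of len(...) (toy annotation data, not from a real dataset): when there are "holes" in the label ids, counting the categories undercounts the id range the head must cover.

```python
# Toy categories with non-contiguous ids, similar to the original COCO label space.
anns = {
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 7, "name": "dog"},
    ]
}

print(len(anns["categories"]))                                  # 3 -> detection head would be too small
print(max(category["id"] for category in anns["categories"]))   # 7 -> covers every possible class_id
```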
Note that when we initially define self.class_embed in LWDETR we actually pass:
https://github.com/roboflow/rf-detr/blob/24ce179cb5d71d9049724c9f3bc25b506d9f42a4/rfdetr/models/lwdetr.py#L605
But when we call train_from_config, we don't pass the same num_classes + 1.
When we call self.model.reinitialize_detection_head(num_classes + 1) on a 0-indexed dataset, the output of the 0th neuron represents class_id 0 of the dataset, and the last neuron (index num_classes) represents the highest actual class_id we have.
For example, the class labels can be [0,1,2,3]. Then num_classes = max_id = 3 and we initialize self.class_embed = nn.Linear(hidden_dim, 4).
Similarly, when we have a 1-indexed dataset, the 0th neuron is now a dummy, and the output of the n-th neuron corresponds to class_id = n from the dataset, just as we expect.
For example, the class labels can be [1,2,3,4]. Then num_classes = max_id = 4 and we initialize self.class_embed = nn.Linear(hidden_dim, 5).
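A worked check of the two examples above (toy values only; hidden_dim is arbitrary):

```python
import torch.nn as nn

hidden_dim = 256  # arbitrary value for illustration

# 0-indexed dataset: labels [0, 1, 2, 3] -> num_classes = max_id = 3
num_classes = max([0, 1, 2, 3])
head = nn.Linear(hidden_dim, num_classes + 1)
print(head.out_features)  # 4 -> neuron i corresponds directly to class_id i

# 1-indexed dataset: labels [1, 2, 3, 4] -> num_classes = max_id = 4
num_classes = max([1, 2, 3, 4])
head = nn.Linear(hidden_dim, num_classes + 1)
print(head.out_features)  # 5 -> neuron 0 is a dummy, neuron n corresponds to class_id n
```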
Environment
- RF-DETR 1.2.1
- PyTorch 2.8.0
- Python 3.9
Minimal Reproducible Example
Download any 1-indexed COCO dataset and try training using:
```python
from rfdetr import RFDETRMedium

model = RFDETRMedium()
model.train(
    dataset_dir="<path-to-coco>",
    epochs=1,
    batch_size=2,
    grad_accum_steps=1,
    lr=1e-4,
    num_workers=4,
    device="cuda",
)
```
Additional
No response
Are you willing to submit a PR?
- [x] Yes, I'd like to help by submitting a PR!
I ran into the same bug, found the same fix. Was going to report it but found this thread.
@probicheaux tagging in case you want to think about head reinitialization a bit more. This argument makes sense to me.
@isaacrob-roboflow is the proposed change above acceptable? I ran into the same problem again when trying to train the segmentation model on a COCO dataset whose class indices start at 1. These two lines got me moving again:
```python
num_classes = max([category['id'] for category in anns["categories"]])
self.model.reinitialize_detection_head(num_classes + 1)
```
I was able to train an RFDETRSegPreview and call predict. I'm testing export now; so far, everything seems to work.
@probicheaux thoughts?
@isaacrob-roboflow I am closing this issue. I think the current implementation of initializing #num_classes heads is correct.
The proposed solution in this issue creates more problems downstream.
With the proposed solution in this issue, the trained models output 1 for category A of our dataset, while our other models output 0 for the same category. This creates problems in production and confusion among devs. We tried remapping the dataset to be 0-indexed to work with RF-DETR, but that again creates more issues with dataset management.
The best solution I found was from Detectron2. We added it to RF-DETR and it works great: it remaps the original dataset category ids, which are in the range [1, #num_classes], to [0, #num_classes - 1] during build_dataset.
During evaluation, it then reverts the category ids back to [1, #num_classes], since cocoapi still expects 1-indexed category ids. This allowed us to avoid treating RF-DETR specially or changing our datasets. Please consider adding this to RF-DETR; it will help others.
https://github.com/facebookresearch/detectron2/blob/a9c0821a12ad353fb2a96f019515990d5460c5ac/detectron2/data/datasets/coco.py#L102
https://github.com/facebookresearch/detectron2/blob/a9c0821a12ad353fb2a96f019515990d5460c5ac/detectron2/evaluation/coco_evaluation.py#L237
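For reference, a minimal sketch of the remapping idea described above (the helper name and variables are illustrative, not from RF-DETR or Detectron2; the linked Detectron2 code is the authoritative version):

```python
# Hypothetical sketch of a Detectron2-style contiguous-id remapping.

def build_contiguous_id_maps(categories):
    """Map original (possibly 1-indexed, possibly sparse) category ids to [0, num_classes - 1]."""
    sorted_ids = sorted(c["id"] for c in categories)
    dataset_id_to_contiguous = {orig: i for i, orig in enumerate(sorted_ids)}
    contiguous_to_dataset_id = {i: orig for orig, i in dataset_id_to_contiguous.items()}
    return dataset_id_to_contiguous, contiguous_to_dataset_id

# During dataset building: train the model on contiguous ids.
categories = [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}, {"id": 4, "name": "bird"}]
to_contig, to_dataset = build_contiguous_id_maps(categories)
print(to_contig)   # {1: 0, 2: 1, 4: 2}

# During evaluation: map predictions back before handing them to cocoapi.
predicted_contiguous_label = 2
print(to_dataset[predicted_contiguous_label])  # 4 -> original dataset category id
```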