train_from_config initializes one fewer class_embed neuron when training on a custom 1-indexed dataset.

Open Abdul-Mukit opened this issue 3 months ago • 4 comments

Search before asking

  • [x] I have searched the RF-DETR issues and found no similar bug report.

Bug

Similar to https://github.com/roboflow/rf-detr/issues/51
When trying to train on a 1-indexed COCO dataset, training fails.

train_from_config() has a bug.

The self.model.reinitialize_detection_head(num_classes) call should have been self.model.reinitialize_detection_head(num_classes + 1) instead.
I think the bug arises because the name num_classes is quite misleading. https://github.com/roboflow/rf-detr/blob/24ce179cb5d71d9049724c9f3bc25b506d9f42a4/rfdetr/models/lwdetr.py#L597

My references: The original DETR's intent of doing num_classes + 1 is explained very well already. https://github.com/facebookresearch/detr/issues/108#issuecomment-674854977 https://github.com/facebookresearch/detr/issues/108#issuecomment-650269223

Proposed Solution: The following solution works for both 0-indexed and 1-indexed COCO datasets. It even works with the original COCO dataset, which has "holes" in its label ids.

    def train_from_config(self, config: TrainConfig, **kwargs):
        with open(
            os.path.join(config.dataset_dir, "train", "_annotations.coco.json"), "r"
        ) as f:
            anns = json.load(f)
            # num_classes = len(anns["categories"])
            num_classes = max([category['id'] for category in anns["categories"]])
            class_names = [c["name"] for c in anns["categories"] if c["supercategory"] != "none"]
            self.model.class_names = class_names

        if self.model_config.num_classes != num_classes:
            logger.warning(
                f"num_classes mismatch: model has {self.model_config.num_classes} classes, but your dataset has {num_classes} classes\n"
                f"reinitializing your detection head with {num_classes} classes."
            )
            self.model.reinitialize_detection_head(num_classes + 1)

Note that when we initially define self.class_embed in LWDETR, we actually pass num_classes + 1: https://github.com/roboflow/rf-detr/blob/24ce179cb5d71d9049724c9f3bc25b506d9f42a4/rfdetr/models/lwdetr.py#L605

But when we call train_from_config, we don't apply the same num_classes + 1.

When we call self.model.reinitialize_detection_head(num_classes + 1) on a 0-indexed dataset, the output of the 0th neuron represents class_id 0 of the dataset, and the last neuron (the (num_classes + 1)th) represents the last actual class_id. For example, with class labels [0,1,2,3], num_classes = max_id = 3 and we initialize self.class_embed = nn.Linear(hidden_dim, 4).

Similarly, on a 1-indexed dataset the 0th neuron is a dummy, and the output of the n-th neuron corresponds to class_id = n from the dataset, just as we expect.
For example, with class labels [1,2,3,4], num_classes = max_id = 4 and we initialize self.class_embed = nn.Linear(hidden_dim, 5).
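
To make the indexing argument concrete, here is a minimal sketch in plain Python that does not touch RF-DETR at all. `head_size` and `unused_neurons` are hypothetical helpers written for this illustration; they only mirror the max(id) + 1 sizing described above.

```python
def head_size(categories):
    """Number of output neurons under the max(id) + 1 rule.

    `categories` follows the COCO "categories" layout: a list of dicts
    with an "id" key. Hypothetical helper, not part of rfdetr.
    """
    max_id = max(category["id"] for category in categories)
    return max_id + 1


def unused_neurons(categories):
    """Indices of output neurons that no category id maps to."""
    used = {category["id"] for category in categories}
    return [i for i in range(head_size(categories)) if i not in used]


zero_indexed = [{"id": i} for i in [0, 1, 2, 3]]
one_indexed = [{"id": i} for i in [1, 2, 3, 4]]

print(head_size(zero_indexed), unused_neurons(zero_indexed))  # 4 []
print(head_size(one_indexed), unused_neurons(one_indexed))    # 5 [0]
```

In the 1-indexed case the only unused index is 0, which is the dummy neuron mentioned above. A dataset with gaps in its ids would show additional unused indices.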

Environment

  • RF-DETR 1.2.1
  • pytorch 2.8.0
  • Python 3.9

Minimal Reproducible Example

Download any 1-indexed COCO dataset and try training using:

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="<path-to-coco>",
    epochs=1,
    batch_size=2,
    grad_accum_steps=1,
    lr=1e-4,
    num_workers=4,
    device='cuda',
)
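
Before running the snippet above, it can help to confirm how the category ids of the dataset are laid out. The following is a small sketch, assuming the same train/_annotations.coco.json layout that train_from_config reads; `describe_category_ids` and `describe_dataset` are hypothetical helpers, not part of rfdetr.

```python
import json
import os


def describe_category_ids(categories):
    """Summarize the id layout of a COCO "categories" list."""
    ids = sorted(category["id"] for category in categories)
    return {
        "min_id": ids[0],
        "max_id": ids[-1],
        "count": len(ids),
        # True when the ids form an unbroken run starting at min_id.
        "contiguous": ids == list(range(ids[0], ids[0] + len(ids))),
    }


def describe_dataset(dataset_dir):
    """Read train/_annotations.coco.json under dataset_dir and summarize its ids."""
    path = os.path.join(dataset_dir, "train", "_annotations.coco.json")
    with open(path, "r") as f:
        anns = json.load(f)
    return describe_category_ids(anns["categories"])


summary = describe_category_ids([{"id": 1}, {"id": 2}, {"id": 3}])
print(summary)
```

If `max_id` is not smaller than `count`, the largest label would presumably fall outside a head sized with len(anns["categories"]), which is the situation this issue describes.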

Additional

No response

Are you willing to submit a PR?

  • [x] Yes, I'd like to help by submitting a PR!

Abdul-Mukit avatar Aug 20 '25 21:08 Abdul-Mukit

I ran into the same bug, found the same fix. Was going to report it but found this thread.

rsinghBFC avatar Sep 08 '25 18:09 rsinghBFC

@probicheaux tagging in case you want to think about head reinitialization a bit more. This argument makes sense to me.

isaacrob-roboflow avatar Sep 09 '25 14:09 isaacrob-roboflow

@isaacrob-roboflow is the proposed change above acceptable? I again ran into the same problem when trying to train the segmentation model on a COCO dataset, starting with class index 1. These two lines got me moving again:

num_classes = max([category['id'] for category in anns["categories"]])
self.model.reinitialize_detection_head(num_classes + 1)

I was able to train a RFDETRSegPreview and call predict. Testing with export now. So far, everything seems to work.

Abdul-Mukit avatar Oct 08 '25 15:10 Abdul-Mukit

@probicheaux thoughts?

isaacrob-roboflow avatar Oct 08 '25 18:10 isaacrob-roboflow

@isaacrob-roboflow I am closing this issue. I think the current implementation of initializing #num_classes heads is correct.

The solution proposed in this issue creates more problems downstream.
With it, the trained models output 1 for categoryA of our dataset, while our other models output 0 for the same category. This causes problems in production and confusion among devs. We tried remapping the dataset to be 0-indexed to work with RF-DETR, but that in turn creates more issues with dataset management.

The best solution I found comes from Detectron2. We added it to RF-DETR and it works great. It remaps the original dataset categories from the range [1, #num_classes] to [0, #num_classes-1] during build_dataset.
During evaluation, it then reverts the category indexes to [1, #num_classes], as cocoapi still expects 1-indexed category ids. This allowed us to avoid treating RF-DETR specially or changing our datasets. Please consider adding this to RF-DETR. It will help others.

https://github.com/facebookresearch/detectron2/blob/a9c0821a12ad353fb2a96f019515990d5460c5ac/detectron2/data/datasets/coco.py#L102

https://github.com/facebookresearch/detectron2/blob/a9c0821a12ad353fb2a96f019515990d5460c5ac/detectron2/evaluation/coco_evaluation.py#L237
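
The remapping idea can be sketched in a few lines of plain Python. This is only an illustration of the approach, not the code we added to RF-DETR and not Detectron2's implementation; `build_id_maps` and `remap_annotations` are hypothetical names, and the sample categories and annotations are made up.

```python
def build_id_maps(categories):
    """Map sorted dataset category ids to contiguous ids [0, N-1], and back."""
    dataset_ids = sorted(category["id"] for category in categories)
    to_contiguous = {dataset_id: i for i, dataset_id in enumerate(dataset_ids)}
    to_dataset = {i: dataset_id for dataset_id, i in to_contiguous.items()}
    return to_contiguous, to_dataset


def remap_annotations(annotations, id_map):
    """Return copies of the annotations with category_id translated through id_map."""
    return [{**ann, "category_id": id_map[ann["category_id"]]} for ann in annotations]


# Made-up 1-indexed sample data.
categories = [{"id": 1, "name": "categoryA"}, {"id": 2, "name": "categoryB"}]
annotations = [{"image_id": 7, "category_id": 1}, {"image_id": 7, "category_id": 2}]

to_contiguous, to_dataset = build_id_maps(categories)

# Applied when building the dataset: the model sees ids 0 and 1.
train_anns = remap_annotations(annotations, to_contiguous)

# Applied before evaluation: ids go back to the dataset's 1 and 2.
restored = remap_annotations(train_anns, to_dataset)
```

With this mapping the head needs exactly as many neurons as there are categories, categoryA comes out as 0 as in our other models, and the dataset files stay untouched.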

Abdul-Mukit avatar Nov 06 '25 15:11 Abdul-Mukit