
num_classes issue

Open virgile-fsr opened this issue 7 months ago • 19 comments
https://github.com/roboflow/rf-detr/blob/6ca1b58e43668321c4eeea34135bb231cd30520a/rfdetr/main.py#L92

If a user creates a new model with model = RFDETRBase(num_classes=123), the code linked above overwrites the number of classes specified by the user with the one in the checkpoint (90 by default). I don't think this is expected behavior. More worrying, the object's config keeps num_classes=123, which will cause issues in the train function.

If you call: model.train(dataset_dir=...) with the model above, it will check the number of classes in the dataset and compare it with self.model_config.num_classes which is still 123 in our example (even though the actual model has a 90 sized output layer). So if the number of classes in the dataset is indeed 123, the check will pass even though it should not.
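The mismatch described above can be reproduced with a toy model. This is a minimal sketch, not rf-detr's actual code: ToyConfig, ToyModel, head_size, and dataset_check_passes are hypothetical names, and the resize step only mimics the behavior reported for the linked line.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical config object; 90 is the default mentioned in this issue.
    num_classes: int = 90


class ToyModel:
    """Hypothetical stand-in for the model wrapper, not rf-detr's real class."""

    def __init__(self, num_classes: int = 90, checkpoint_num_classes: int = 90):
        # The config records whatever the user asked for.
        self.config = ToyConfig(num_classes=num_classes)
        # The head is first sized from the user's value ...
        self.head_size = num_classes
        # ... and then resized to match the checkpoint, as the issue reports.
        if checkpoint_num_classes != num_classes:
            self.head_size = checkpoint_num_classes

    def dataset_check_passes(self, dataset_num_classes: int) -> bool:
        # The check consults the config, not the actual head size.
        return dataset_num_classes == self.config.num_classes


model = ToyModel(num_classes=123)
print(model.config.num_classes)         # 123
print(model.head_size)                  # 90
print(model.dataset_check_passes(123))  # True, although the head has 90 outputs
```

Comparing the dataset against the actual head size instead of the config value would make the check fail in this situation.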

A different issue is that if you try a dataset with a single class, the labels from the dataloader will have size zero.

virgile-fsr avatar Mar 25 '25 20:03 virgile-fsr

single class is being discussed here: https://github.com/roboflow/rf-detr/issues/48. the issue is due to our working mainly with datasets exported from roboflow, which have a filler 0 class; that causes the problem you're observing with non-roboflow format datasets.

good point re using number of classes as an excuse to reinitialize the head! do you have an idea of what a preferred interface would look like? maybe something like having num_classes default to None and if it is not None, we reinitialize the head with that number of classes?
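One way the interface suggested above could look, as a rough sketch only: resolve_head_size and DEFAULT_CHECKPOINT_CLASSES are hypothetical names, and the value 90 is taken from the default mentioned earlier in this thread.

```python
from typing import Optional, Tuple

# Assumption: the default pretrained checkpoint has 90 classes, as stated in the issue.
DEFAULT_CHECKPOINT_CLASSES = 90


def resolve_head_size(
    num_classes: Optional[int] = None,
    checkpoint_num_classes: int = DEFAULT_CHECKPOINT_CLASSES,
) -> Tuple[int, bool]:
    """Return (head_size, reinitialize) for the proposed None-by-default interface."""
    if num_classes is None:
        # Nothing requested: keep the checkpoint's head untouched.
        return checkpoint_num_classes, False
    # An explicit value wins; reinitialize only when it differs from the checkpoint.
    return num_classes, num_classes != checkpoint_num_classes


print(resolve_head_size())     # (90, False)
print(resolve_head_size(123))  # (123, True)
```

With this shape, an explicit num_classes is never silently replaced by the checkpoint's value.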

isaacrob-roboflow avatar Mar 26 '25 19:03 isaacrob-roboflow

I encountered a similar issue when loading from pretrained weights. I want the model to predict two classes, so I instantiate it with: model = RFDETRLarge(num_classes=2). It prints:

Loading pretrain weights

WARNING:rfdetr.main:num_classes mismatch: pretrain weights has 90 classes, but your model has 2 classes
reinitializing detection head with 90 classes

It was thus trained with 90 classes. Moreover, when predicting with the trained 90-class model, loaded with model = RFDETRBase(pretrain_weights="checkpoint_best_total.pth", num_classes=2), I get: num_classes mismatch: pretrain weights has 1 classes, but your model has 2 classes.

anyaflya avatar Apr 24 '25 13:04 anyaflya

I have a similar issue. I train with 10 classes and I get a warning during training that says: num_classes mismatch: pretrain weights has 90 classes, but your model has 10 classes reinitializing detection head with 2 classes.

Then in prediction I get: num_classes mismatch: pretrain weights has 9 classes, but your model has 90 classes reinitializing detection head with 9 classes. Why does it find one class fewer in prediction? I still get predictions from 0 to 9, so the results are correct.

I don't specify num_classes when defining the model. Which is the correct way? Do you think I will have a problem with this?

panagiotamoraiti avatar Apr 29 '25 22:04 panagiotamoraiti

On my side, I managed to make the training and predictions work with the following:

  1. training: instantiate the model without any arguments, and pass the dataset directory when training.
    model = RFDETRBase()
    model.train(
        dataset_dir=dataset_dir,
        output_dir=output_dir,
    )

    In my case, my dataset contains two classes.
  2. predictions: instantiate the model with the checkpoint path and the number of classes.
    model = RFDETRBase(
        pretrain_weights=str(checkpoint_path), num_classes=num_classes
    )
    model.predict(image, threshold=0.5)

    It prints the warning
    WARNING - num_classes mismatch: pretrain weights has 1 classes, but your model has 2 classes
    reinitializing detection head with 1 classes

    but the model indeed predicts 2 classes.

There is clearly a bug with the warnings, would be nice to check that out @isaacrob-roboflow.

anyaflya avatar Apr 30 '25 08:04 anyaflya

I've done the same as you suggested. I don't use num_classes argument. During prediction it gives a warning that it initializes the model with one class less than my actual number of classes, but predictions are done for all classes.

panagiotamoraiti avatar Apr 30 '25 11:04 panagiotamoraiti

yeah I think there's a bug in the warnings .. @SkalskiP ?

isaacrob-roboflow avatar May 05 '25 19:05 isaacrob-roboflow

I see two main issues here:

  • The warning message is misleading and suggests that users did something wrong. Since the default value of num_classes is 90, anyone trying to load a fine-tuned model gets the warning: num_classes mismatch: pretrain weights has 10 classes, but your model has 90 classes.

  • Since we're logging checkpoint_num_classes - 1, the reported number of classes in the checkpoint is likely lower than the actual value. This looks like a bug.

@isaacrob-roboflow In general, I believe num_classes should not be a user-configurable argument. It should be inferred either from the checkpoint or the dataset. I don’t see any case where manually setting it should be the user’s responsibility. The comments from @panagiotamoraiti and @anyaflya are the best evidence that num_classes is misleading for users.

  • When training a model, num_classes should match the number of classes in the dataset. Optionally, we can inform the user if the pretrained checkpoint has a different number of classes.

  • When loading a model, num_classes should be set based on the value in the checkpoint.
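The two rules above could be expressed roughly as follows. This is only a sketch of the proposal: resolve_num_classes and its parameters are hypothetical, and the note string is illustrative rather than an existing rf-detr log message.

```python
from typing import Optional, Tuple


def resolve_num_classes(
    dataset_num_classes: Optional[int] = None,
    checkpoint_num_classes: Optional[int] = None,
) -> Tuple[int, Optional[str]]:
    """Return (num_classes, note) following the proposed inference rules."""
    if dataset_num_classes is not None:
        # Training: the dataset decides; optionally inform about a differing checkpoint.
        note = None
        if (
            checkpoint_num_classes is not None
            and checkpoint_num_classes != dataset_num_classes
        ):
            note = (
                f"checkpoint has {checkpoint_num_classes} classes, "
                f"dataset has {dataset_num_classes}"
            )
        return dataset_num_classes, note
    if checkpoint_num_classes is not None:
        # Loading: the checkpoint decides, and there is nothing to warn about.
        return checkpoint_num_classes, None
    raise ValueError("need a dataset or a checkpoint to infer num_classes")


print(resolve_num_classes(dataset_num_classes=2, checkpoint_num_classes=90))
print(resolve_num_classes(checkpoint_num_classes=10))
```

Under these rules, loading a fine-tuned checkpoint produces no mismatch message at all, since no user-supplied value exists to disagree with it.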

SkalskiP avatar May 06 '25 09:05 SkalskiP

I don't have strong feelings about this! Although I would like there to be a mechanism to load a model with a set number of classes without having to have a dataset exist. This is useful for people who want to train with custom data pipelines.

@SkalskiP feel free to make a change you think is best as long as it is still possible to create the model without a dataset present :)

isaacrob-roboflow avatar May 06 '25 14:05 isaacrob-roboflow

@isaacrob-roboflow I don’t know anyone using RF-DETR to train with custom data pipelines. Could you give me a bit more context on how and why someone would want to do that?

SkalskiP avatar May 07 '25 10:05 SkalskiP

I mean I have very often built my own data pipeline. useful for implementing certain augmentations, plugging into large datasets that can't be entirely stored locally, plugging into existing data pipelines .. I think it is more likely to be an issue in research cases and more mature enterprise pipelines. also useful for benchmarking for a given use case without committing to having data present in the appropriate format. I definitely think it should be possible to instantiate a model with an arbitrary configuration without relying on having a dataset .. separation of computation and transaction, and all that

isaacrob-roboflow avatar May 07 '25 19:05 isaacrob-roboflow

I have the exact same issue: I fine-tuned with my own custom dataset with just 1 class, and it gives a warning that the model has 0 classes when loading the model. I thought I did something wrong and re-checked all the code, but predictions did work as expected. After reading about this here, it does look like a genuine bug (or at worst a misleading warning that causes unnecessary alarm).

malaccan avatar May 10 '25 13:05 malaccan

I was also confused by this 😅. I have a couple of questions:

  1. In lwdetr.py, the build_model() function sets num_classes=args.num_classes + 1 when constructing the model. Apparently this is carried over from the original DETR model, which adds a no-object class and sets its ID to max_id + 1. Does RF-DETR actually use a no-object class? The pre-trained models have a class_embed size of 91 (whereas COCO has 90 classes). Fine-tuning seems to work without the extra class, but is this how we expect the model to behave?
  2. I agree the warnings are misleading. Because reinitialize_detection_head() is called before load_state_dict(), it doesn't affect inference, but at first I thought my trained weights had been overridden. When loading a model, I expect to call something like RFDetrBase.from_checkpoint(pretrain_weights=...) and have the classes inferred. At the moment, the only way to suppress the warning is to explicitly pass num_classes=<actual num classes minus 1>. Of course, this is assuming the model is not meant to have the no-object class.
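The ordering described in point 2 can be illustrated with a toy example. This is a sketch under stated assumptions, not rf-detr's implementation: ToyDetector is hypothetical, its two method names merely echo the ones mentioned above, and the size check imitates how a strict state-dict load in PyTorch rejects tensors of a different shape.

```python
class ToyDetector:
    """Hypothetical stand-in for a detector; only the class head is modeled."""

    def __init__(self, num_outputs: int = 91):
        # 91 mirrors the class_embed size reported for the pretrained models.
        self.class_embed = [[0.0, 0.0] for _ in range(num_outputs)]

    def reinitialize_detection_head(self, num_outputs: int) -> None:
        # Fresh (zeroed) head with the requested number of rows.
        self.class_embed = [[0.0, 0.0] for _ in range(num_outputs)]

    def load_state_dict(self, state: dict) -> None:
        rows = state["class_embed"]
        # Refuse a head of a different size, like a strict load would.
        if len(rows) != len(self.class_embed):
            raise ValueError(
                f"size mismatch: checkpoint has {len(rows)} rows, "
                f"model has {len(self.class_embed)}"
            )
        self.class_embed = [list(row) for row in rows]


# Made-up weights standing in for a fine-tuned checkpoint with a 3-row head.
checkpoint = {"class_embed": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]}

model = ToyDetector()
# Step 1: resize the head so that shapes match the checkpoint.
model.reinitialize_detection_head(len(checkpoint["class_embed"]))
# Step 2: loading then overwrites the freshly initialized values.
model.load_state_dict(checkpoint)
print(model.class_embed == checkpoint["class_embed"])  # True
```

In this toy, the "reinitializing" step only changes the head's shape before loading, so the trained values survive even though the message suggests otherwise.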

cduong-a avatar Jun 22 '25 10:06 cduong-a

Quoting @isaacrob-roboflow: "yeah I think there's a bug in the warnings .. @SkalskiP ?"

Same problem for me too.

GabrieleGiudic avatar Jul 26 '25 09:07 GabrieleGiudic

I am having a slightly different issue. I am getting: WARNING - num_classes mismatch: pretrain weights has 365 classes, but your model has 90 classes reinitializing detection head with 365 classes. This makes zero sense, since I trained my model on Roboflow with only 3 classes, initialized for RFDETRMedium.

I previously used this exact setup with RFDETRBase, with training done on the same dataset in the same workspace for the same number of classes, and it worked fine, giving: num_classes mismatch: pretrain weights has 4 classes, but your model has 90 classes reinitializing detection head with 4 classes.

I realized that "your model has" only refers to num_classes, which simply defaults to 90 if you do not set it during initialization. But where the 365 classes came from, I have no clue. Please help me, this took 2 days of training.

ghost avatar Aug 10 '25 11:08 ghost

I suspect that 365 comes from the pretrained weights for RFDETR, which were trained on the Objects365 dataset (365 categories). I found out about this dataset from a response in another issue.

panagiotamoraiti avatar Aug 10 '25 11:08 panagiotamoraiti

But the weights I used are the checkpoint from my specific training run on Roboflow for RFDETRMedium, so it should only be 3 classes + 1 for background. @SkalskiP

ghost avatar Aug 10 '25 11:08 ghost

Yeah there's a bug, it's accidentally keeping the 365 classes from the o365 pretrain. @probicheaux is working on it iirc but he's been ill for a few days

isaacrob-roboflow avatar Aug 12 '25 15:08 isaacrob-roboflow

Feel free to raise a new issue btw so it's easier for us to keep track

isaacrob-roboflow avatar Aug 12 '25 15:08 isaacrob-roboflow

Good insight @panagiotamoraiti ! :)

isaacrob-roboflow avatar Aug 12 '25 15:08 isaacrob-roboflow