Co-DETR The user-unfriendly implementation of CoDINO head

This code makes the inference output a class instance rather than a list of tensors, which may cause errors when converting the model to ONNX, even when mmdeploy is used. Why use this output format?

Jul 15 '25 13:07 kasteric

Sorry for the inconvenience. Our work is based on the mmdetection framework and follows mmdet's implementation. I acknowledge that this approach may have some issues. As I'm currently focused on other projects and don't have much time to maintain this repository, we welcome community contributions through pull requests to help improve these features.

Jul 15 '25 17:07 TempleX98

There was a misconception on my side. Since I had previously managed to convert some other models to ONNX without errors, I assumed the problem was CoDINO's use of InstanceData() as the output format. But then I realized that other RCNN-based detection models also use InstanceData as the output format for normal inference. The difference is that mmdeploy has rewritten their forward functions so that the forward path does not contain any non-convertible operations.
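To make the difference concrete, here is a minimal pure-Python sketch of that kind of rewrite. `FakeInstanceData`, `predict`, and `export_forward` are hypothetical stand-ins (no mmdet, mmdeploy, or torch involved), and plain lists stand in for tensors. It only illustrates the idea of returning flat values instead of a container object, not how mmdeploy actually implements its rewriters.

```python
class FakeInstanceData:
    """Hypothetical stand-in for a result container such as InstanceData."""

    def __init__(self, bboxes, scores, labels):
        self.bboxes = bboxes
        self.scores = scores
        self.labels = labels


def predict(raw):
    """Normal inference path: wraps the results in a container object."""
    bboxes = [row[:4] for row in raw]   # x1, y1, x2, y2
    scores = [row[4] for row in raw]
    labels = [row[5] for row in raw]
    return FakeInstanceData(bboxes, scores, labels)


def export_forward(raw):
    """Export-friendly path: same values, returned as a flat tuple."""
    result = predict(raw)
    return result.bboxes, result.scores, result.labels


# Two made-up detections: [x1, y1, x2, y2, score, label]
raw = [[0, 0, 10, 10, 0.9, 1], [5, 5, 20, 20, 0.4, 2]]
bboxes, scores, labels = export_forward(raw)
```

The container is convenient for normal inference, while an exporter generally wants a traceable function whose outputs are plain tensors, which is the role `export_forward` plays in this sketch.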

Jul 16 '25 02:07 kasteric

By the way, in the implementation of the predict_feat function of the CoDINO head, enc_bbox_preds and enc_outputs are not used. Are they redundant parameters, or do they have some special purpose? If they are redundant input parameters, then this function is exactly the same as the one in the DETR head. I think these parameters are only used for training. However, your predict function is implemented differently: it calls self.forward, which always returns these two additional outputs regardless of whether the model is training or evaluating.
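A rough sketch of the pattern I mean, in plain Python. The functions below are hypothetical stand-ins, not the actual CoDINO head code; the return order and the arithmetic are assumptions made only so the example runs.

```python
def forward(feats):
    """Hypothetical head forward: always returns decoder and encoder outputs."""
    cls_scores = [f * 2 for f in feats]
    bbox_preds = [f + 1 for f in feats]
    # Encoder-side outputs, assumed here to matter only for training losses.
    enc_bbox_preds = [f - 1 for f in feats]
    enc_outputs = [f * 3 for f in feats]
    return cls_scores, bbox_preds, enc_bbox_preds, enc_outputs


def predict(feats):
    """Inference path: calls forward, then drops the encoder-side outputs."""
    cls_scores, bbox_preds, *_unused = forward(feats)
    return cls_scores, bbox_preds


outs = forward([1, 2])
preds = predict([1, 2])
```

In this sketch the two extra values are computed at inference time and then discarded, which is what made me wonder whether they are needed outside of training.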

Jul 16 '25 07:07 kasteric