RT-DETR
about "Uncertainty-minimal Query Selection"
Star RTDETR: please star RTDETR on its homepage first to support this project and help more people discover it.
Dear author, thank you for your brilliant work! I noticed that the paper mentions a method called 'Uncertainty-minimal Query Selection', but I couldn't find it in the code provided. Could you please clarify where this method is implemented, or whether it might be missing?
I had the same question after going through the codebase. I figured out that it is implemented as the vfl_loss in the criterion.
Could you please show me the related code in vfl_loss? Thank you!
I don't know how to link this, so I'm going to copy-paste the code.
```python
def loss_labels_vfl(self, outputs, targets, indices, num_boxes, log=True):
    assert 'pred_boxes' in outputs
    idx = self._get_src_permutation_idx(indices)

    # IoU between matched predicted boxes and their ground-truth boxes
    src_boxes = outputs['pred_boxes'][idx]
    target_boxes = torch.cat([t['boxes'][i] for t, (_, i) in zip(targets, indices)], dim=0)
    ious, _ = box_iou(box_cxcywh_to_xyxy(src_boxes), box_cxcywh_to_xyxy(target_boxes))
    ious = torch.diag(ious).detach()

    # One-hot class targets; unmatched queries get the background index
    src_logits = outputs['pred_logits']
    target_classes_o = torch.cat([t["labels"][J] for t, (_, J) in zip(targets, indices)])
    target_classes = torch.full(src_logits.shape[:2], self.num_classes,
                                dtype=torch.int64, device=src_logits.device)
    target_classes[idx] = target_classes_o
    target = F.one_hot(target_classes, num_classes=self.num_classes + 1)[..., :-1]

    # IoU-aware soft targets: positives are supervised toward their IoU
    target_score_o = torch.zeros_like(target_classes, dtype=src_logits.dtype)
    target_score_o[idx] = ious.to(target_score_o.dtype)
    target_score = target_score_o.unsqueeze(-1) * target

    # Varifocal weighting: focal term on negatives, IoU weight on positives
    pred_score = F.sigmoid(src_logits).detach()
    weight = self.alpha * pred_score.pow(self.gamma) * (1 - target) + target_score

    loss = F.binary_cross_entropy_with_logits(src_logits, target_score,
                                              weight=weight, reduction='none')
    loss = loss.mean(1).sum() * src_logits.shape[1] / num_boxes
    return {'loss_vfl': loss}
```
Search for 'rtdetr_criterion.py'
Sorry to bother you again. I don't see the code related to equation (2) in the paper. I notice that equation (2) is the L2 distance between localization and classification.
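For reference, the questioner describes equation (2) as an L2 distance between localization and classification. A toy sketch of what such an uncertainty score and a minimal-uncertainty top-K selection could look like; the function name, tensor shapes, and score values below are illustrative assumptions, not the repo's actual implementation:

```python
import torch

def query_uncertainty(loc_score: torch.Tensor, cls_score: torch.Tensor) -> torch.Tensor:
    """L2 distance between per-query localization and classification quality."""
    return torch.linalg.norm(loc_score - cls_score, dim=-1)

# Hypothetical per-query scores: loc could be an IoU estimate,
# cls the maximum class probability for that query.
loc = torch.tensor([[0.90], [0.20], [0.80]])
cls = torch.tensor([[0.85], [0.70], [0.40]])

u = query_uncertainty(loc, cls)       # uncertainty per query
k = 2
selected = torch.topk(-u, k).indices  # the K queries with minimal uncertainty
```

A query whose two scores disagree (e.g. confident class but poor box) gets high uncertainty and is filtered out of the top-K.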
I have the same question about "Uncertainty-minimal Query Selection". I think the code you quoted may come from Varifocal Loss, which is a focal loss mixed with IoU. The original paper I'm referring to is "VarifocalNet: An IoU-aware Dense Object Detector", where the definition of VFL can be found.
$$\mathrm{VFL}(p,q)=\begin{cases}-q\,\bigl[q\log p+(1-q)\log(1-p)\bigr] & q>0\\[2pt] -\alpha\,p^{\gamma}\log(1-p) & q=0\end{cases}$$

where $q$ is the IoU and $p$ is the predicted probability.
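If I read the quoted loss_labels_vfl correctly, its weighted BCE reproduces exactly this piecewise definition: a positive element has target $q$ and weight $q$, a negative element has target $0$ and weight $\alpha p^\gamma$. A small scalar check of that equivalence (the $\alpha$, $\gamma$, and logit values here are arbitrary):

```python
import torch
import torch.nn.functional as F

ALPHA, GAMMA = 0.75, 2.0

def vfl_piecewise(p, q):
    # VFL as defined in the VarifocalNet paper
    if q > 0:
        return -q * (q * torch.log(p) + (1 - q) * torch.log(1 - p))
    return -ALPHA * p**GAMMA * torch.log(1 - p)

def vfl_weighted_bce(logit, q):
    # The weighted-BCE formulation used in loss_labels_vfl
    p = torch.sigmoid(logit).detach()
    target = torch.tensor(1.0) if q > 0 else torch.tensor(0.0)
    target_score = q * target
    weight = ALPHA * p**GAMma if False else ALPHA * p**GAMMA * (1 - target) + target_score
    return F.binary_cross_entropy_with_logits(logit, target_score, weight=weight)

logit = torch.tensor(0.3)
p = torch.sigmoid(logit)

# positive element (q = IoU > 0) and negative element (q = 0)
pos_a, pos_b = vfl_piecewise(p, torch.tensor(0.8)), vfl_weighted_bce(logit, torch.tensor(0.8))
neg_a, neg_b = vfl_piecewise(p, torch.tensor(0.0)), vfl_weighted_bce(logit, torch.tensor(0.0))
```

Both pairs agree, so the quoted code is Varifocal Loss rather than the uncertainty of equation (2).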