DAB-DETR
The meaning of num_feature_levels.
Thanks for your great and detailed code!
In dab_deformable_detr.py, is this part about generating extra feature levels when the backbone provides fewer scales than needed (for Deformable DETR)?
https://github.com/IDEA-opensource/DAB-DETR/blob/309f6ad92af7a62d7732c1bdf1e0c7a69a7bdaef/models/dab_deformable_detr/dab_deformable_detr.py#L169-L181
It seems that the last feature map is repeatedly passed through backbone[1]. (Is it a feature map, or something else?)
Could you please help me understand that?
Yes, it is. By default, num_feature_levels=4, but only 3 scales are extracted from the backbone. Hence an extra stride-2 convolution is applied to the highest-level feature map (C5) to produce the 4th feature level at half its spatial resolution.
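The resulting spatial sizes can be sketched as follows. This is a minimal sketch under the assumption (standard in Deformable DETR) that the backbone returns strides 8/16/32 and each extra level applies a 3×3, stride-2, padding-1 convolution to the last map; the function name `feature_shapes` is hypothetical:

```python
import math

def feature_shapes(h, w, num_feature_levels=4, backbone_strides=(8, 16, 32)):
    """Spatial sizes of the multi-scale feature maps for an h x w input."""
    shapes = [(math.ceil(h / s), math.ceil(w / s)) for s in backbone_strides]
    # Each extra level halves the last map again (3x3 conv, stride 2, pad 1:
    # output size = ceil(n / 2)).
    for _ in range(num_feature_levels - len(backbone_strides)):
        ph, pw = shapes[-1]
        shapes.append((math.ceil(ph / 2), math.ceil(pw / 2)))
    return shapes

print(feature_shapes(512, 512))
# [(64, 64), (32, 32), (16, 16), (8, 8)]
```

So the 4th level is a downsampled (stride-64) map, not a larger one.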
Thanks!
Here are some more confusing lines, especially line 392:
https://github.com/IDEA-opensource/DAB-DETR/blob/309f6ad92af7a62d7732c1bdf1e0c7a69a7bdaef/models/dab_deformable_detr/deformable_transformer.py#L390-L392
Why is src_valid_ratios concatenated with itself and then multiplied with reference_points?
When I change the parameter num_feature_levels to another value, such as 3, an exception about mismatched dimensions in the matrix (or element-wise) multiplication occurs.
- torch.cat([src_valid_ratios, src_valid_ratios], -1)[:, None].shape: [bs, 1, num_feature_levels, 4]
- reference_points[:, :, None].shape: [num_queries, 4, 1]
By default, num_feature_levels is 4, so the multiplication happens to go through. Is it a bug?
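For context, in Deformable DETR the trailing 4 comes from the box coordinates, not from num_feature_levels: reference_points holds normalized (x, y, w, h) boxes, while src_valid_ratios holds one (w_ratio, h_ratio) pair per level, so duplicating the ratios yields a (w, h, w, h) scale matching all four box coordinates. A minimal NumPy sketch of the broadcasting (NumPy and PyTorch follow the same rules; the shapes below are assumptions based on the Deformable DETR convention, with a batch dimension present):

```python
import numpy as np

bs, num_queries, num_levels = 2, 300, 4
reference_points = np.random.rand(bs, num_queries, 4)   # normalized x, y, w, h
src_valid_ratios = np.random.rand(bs, num_levels, 2)    # (w_ratio, h_ratio) per level

# Duplicate (w, h) -> (w, h, w, h) so each of the 4 box coords gets a scale.
ratios = np.concatenate([src_valid_ratios, src_valid_ratios], -1)  # [bs, num_levels, 4]

# Broadcast: [bs, num_queries, 1, 4] * [bs, 1, num_levels, 4]
reference_points_input = reference_points[:, :, None] * ratios[:, None]
print(reference_points_input.shape)  # (2, 300, 4, 4)
```

Note the level dimension (here 4) just tracks num_feature_levels; with 3 levels the product has shape [bs, num_queries, 3, 4] and the broadcast itself still works.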
Besides, I noticed that there is no batch_size dimension in reference_points. Does that mean the different samples in a batch share the same reference_points?