Shilong Liu

90 comments by Shilong Liu

Yes, it is. By default, ```num_feature_levels=4```, but only ```3``` different scales are extracted from the backbone. Hence the highest feature map (C4) is 2x upsampled to serve as the 4th feature level.
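For illustration, here is a rough sketch of the idea (the shapes, names, and the `F.interpolate` call are illustrative assumptions, not the repo's actual implementation):

```python
import torch
import torch.nn.functional as F

# Illustrative backbone outputs: 3 feature maps at decreasing resolution.
feats = [
    torch.randn(1, 256, 100, 100),
    torch.randn(1, 256, 50, 50),
    torch.randn(1, 256, 25, 25),  # highest-level map in this sketch
]

num_feature_levels = 4
if num_feature_levels > len(feats):
    # Derive the missing level from the last backbone map by 2x resampling,
    # following the idea described in the comment above.
    extra = F.interpolate(feats[-1], scale_factor=2.0, mode="nearest")
    feats.append(extra)

print([tuple(f.shape) for f in feats])
```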

Thanks for your suggestions. We will clean up this part and update it soon.

@fernandorovai Thanks for your attention. Our new work DINO, which is based on DAB-DETR, is available now: https://github.com/IDEACVR/DINO. You can refer to that repo for Swin Transformer support.

For the first question, you are right; it seems to be a bug in our implementation. For the second, we only use PE(xy) as the positional queries, see [this line](https://github.com/IDEA-opensource/DAB-DETR/blob/main/models/DAB_DETR/transformer.py#L238), which will...
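For reference, here is a minimal sketch of what a sinusoidal PE over (x, y) anchor coordinates typically looks like in DETR-style models; the function name, shapes, and defaults are illustrative assumptions, not the repo's exact code:

```python
import math
import torch

def sine_embed_xy(xy, num_feats=128, temperature=10000):
    """Sinusoidal embedding of normalized (x, y) anchor centers.

    xy: (..., 2) with values in [0, 1]; returns (..., 2 * num_feats).
    """
    scale = 2 * math.pi
    dim_t = torch.arange(num_feats, dtype=torch.float32, device=xy.device)
    dim_t = temperature ** (2 * torch.div(dim_t, 2, rounding_mode="floor") / num_feats)
    pos_x = (xy[..., 0:1] * scale) / dim_t  # (..., num_feats)
    pos_y = (xy[..., 1:2] * scale) / dim_t
    # Interleave sin/cos over the feature dimension.
    pos_x = torch.stack((pos_x[..., 0::2].sin(), pos_x[..., 1::2].cos()), dim=-1).flatten(-2)
    pos_y = torch.stack((pos_y[..., 0::2].sin(), pos_y[..., 1::2].cos()), dim=-1).flatten(-2)
    return torch.cat((pos_y, pos_x), dim=-1)

# Example: 300 anchor centers for a batch of 2 images.
anchors_xy = torch.rand(300, 2, 2)     # (num_queries, batch, 2)
pos_query = sine_embed_xy(anchors_xy)  # (300, 2, 256)
```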

Can you provide more information about your device, environment (e.g., the CUDA version and PyTorch version), and the commands you used?

It seems like a problem with your environment. Can you share your environment details and the command you used?

Thanks for your question. This discussion may be helpful: https://github.com/facebookresearch/detr/issues/101

The Transformer architecture is GPU-memory intensive, so you may need GPUs with more memory. As for the RAM, it seems 32GB is not enough. One way to alleviate this is to...

See #23 for fine-tuning details. You can ignore the pretrained checkpoints if you want to train DINO on your custom dataset from scratch.

Thanks for pointing out the problem. We will correct it and provide a manual for custom training later.