Add DETR model
Related to https://github.com/pytorch/vision/issues/2707. Working on this with @oke-aditya and @triple-Mu.
- [ ] Backbone: ResNet50
- [x] Transformer: Encoder + Decoder
- [ ] Position embed
- [ ] Detection head & loss
- [ ] label assignment
- [ ] datapipes
Sorry for the early poke at the PR, but I would like to know why we are not using the nn.TransformerEncoder layer. I have yet to dive into the technical details, though.
The official code makes some modifications to nn.TransformerEncoder, and these may affect performance.
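As a rough illustration of the kind of modification meant here: as far as I recall, the official DETR implementation adds the positional encoding to the queries and keys inside every attention layer, whereas the stock encoder layer has no positional argument, so positions can only be added once at the input. The sketch below is a minimal post-norm version under that assumption. The class name `PosAwareEncoderLayer` and the default sizes are hypothetical, not code from this PR or from the official repository.

```python
import torch
from torch import nn


class PosAwareEncoderLayer(nn.Module):
    """Hypothetical encoder layer that takes a positional encoding per call."""

    def __init__(self, d_model=256, nhead=8, dim_feedforward=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
        self.linear1 = nn.Linear(d_model, dim_feedforward)
        self.linear2 = nn.Linear(dim_feedforward, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src, pos):
        # src and pos have shape (sequence_length, batch, d_model).
        # Positions are added to the queries and keys only; the values
        # stay position-free. This is the part the stock layer cannot express.
        q = k = src + pos
        attn_out, _ = self.self_attn(q, k, value=src)
        src = self.norm1(src + self.dropout(attn_out))
        ff = self.linear2(self.dropout(torch.relu(self.linear1(src))))
        return self.norm2(src + self.dropout(ff))
```

Stacking several of these layers and passing the same `pos` tensor to each one re-injects the positions at every depth, which is the behavior that would be lost by switching to the built-in layer without further changes.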
@xiaohu2015 I hope you are well. It's been a while since this PR has seen any action. I wonder if you plan to keep working on it slowly, or whether you think it's unlikely you'll get to it in H2. No pressure. Thanks! :)
@datumbox @xiaohu2015 Is there any progress on this? If not, I would love to help!
@deepwilson I believe this is up for grabs as @xiaohu2015 doesn't have the time to complete it now. It would be awesome if we can continue the work. :)