Add DETR model

Open xiaohu2015 opened this issue 3 years ago • 3 comments

Related to https://github.com/pytorch/vision/issues/2707. Working with @oke-aditya and @triple-Mu.

  • [ ] Backbone: ResNet50
  • [x] Transformer: Encoder + Decoder
  • [ ] Position embed
  • [ ] Detection head & loss
  • [ ] Label assignment
  • [ ] Datapipes

xiaohu2015 avatar Apr 29 '22 02:04 xiaohu2015

Sorry for the early poke at the PR, but I would like to know why we are not using the nn.TransformerEncoder layer. I have yet to dive into the technical details, though.

oke-aditya avatar May 09 '22 19:05 oke-aditya

> Sorry for the early poke at the PR, but I would like to know why we are not using the nn.TransformerEncoder layer. I have yet to dive into the technical details, though.

The official code makes some modifications to nn.TransformerEncoder, so using the stock layer may affect performance.
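For illustration, here is a minimal sketch of the kind of change involved. It assumes, as in the official DETR repository, that the positional embedding is added to the queries and keys inside every encoder layer rather than once at the input. The stock nn.TransformerEncoderLayer takes no positional argument in forward, so this needs a custom layer. The class name EncoderLayerWithPos and the tensor sizes are hypothetical and are not the code in this PR:

```python
import torch
from torch import nn


class EncoderLayerWithPos(nn.Module):
    """Post-norm encoder layer that adds a positional embedding to Q and K.

    Hypothetical sketch: names and defaults are illustrative only.
    """

    def __init__(self, d_model=256, nhead=8, dim_feedforward=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
        self.linear1 = nn.Linear(d_model, dim_feedforward)
        self.linear2 = nn.Linear(dim_feedforward, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src, pos=None):
        # src, pos: (sequence_length, batch, d_model)
        # Position is injected into queries and keys only; values stay as src.
        q = k = src if pos is None else src + pos
        attn_out, _ = self.self_attn(q, k, value=src)
        src = self.norm1(src + self.dropout(attn_out))
        ff = self.linear2(self.dropout(torch.relu(self.linear1(src))))
        return self.norm2(src + self.dropout(ff))


torch.manual_seed(0)
layer = EncoderLayerWithPos(d_model=32, nhead=4, dim_feedforward=64, dropout=0.0)
layer.eval()
src = torch.randn(10, 2, 32)  # e.g. 10 flattened feature-map positions, batch of 2
pos = torch.randn(10, 2, 32)  # stand-in for a real positional embedding
out = layer(src, pos)
```

Because pos is passed on every call, a stack of such layers sees the positional signal at each depth, which a plain nn.TransformerEncoder stack cannot do without subclassing.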

xiaohu2015 avatar May 10 '22 02:05 xiaohu2015

@xiaohu2015 I hope you are well. It's been a while since this PR has seen any action. I wonder whether you plan to keep working on it slowly or think you are unlikely to get to it in H2. No pressure. Thanks! :)

datumbox avatar Sep 14 '22 14:09 datumbox

@datumbox @xiaohu2015 Is there any progress on this? If not, I would love to help!

deepwilson avatar Oct 27 '22 15:10 deepwilson

@deepwilson I believe this is up for grabs as @xiaohu2015 doesn't have the time to complete it now. It would be awesome if we can continue the work. :)

datumbox avatar Oct 27 '22 15:10 datumbox