vision [RFC] New Ops in TorchVision

🚀 The feature

Consider adding the following operators in TorchVision:

Layers

There is a separate ticket for tracking common layers: #4333

Operators

[ ] SoftNMS
[ ] masks_to_boxes for N-dimensional masks

Losses

There is a separate ticket tracking Losses proposals: #2980

Schedulers & Optimizers (Core upstreaming)

Feb 13 '22 12:02 datumbox

Maybe I can implement some of these features, e.g. the DropBlock layer.

Feb 13 '22 12:02 xiaohu2015

@xiaohu2015 wow, I literally just slacked you to see if you are interested 😄

Great! If you want to send a PR for DropBlock it would be awesome. Let me know if you want me to create an issue for it so that other contributors know you are working on it already (else you can create one yourself or just rely on the PR; up to you!).

Feb 13 '22 12:02 datumbox

Hey @datumbox, can I take up the SoftNMS implementation?

Apr 08 '22 06:04 lezwon

@lezwon Thanks for offering help!

SoftNMS would have to be implemented in C++ and CUDA, because that is where the standard NMS is implemented. Some additional discussion would be required to decide exactly how it will be implemented and what its API would look like. As you can imagine, this is quite a lot of work, and it's not guaranteed that the feature will be merged. If you are up for it, we can discuss more. I just wanted to give you a heads up that this is a riskier feature to work on.
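For context on what the operator computes, here is a minimal pure-Python sketch of Soft-NMS using the Gaussian penalty. It is only an illustration of the idea, not the proposed C++/CUDA implementation or its API: the function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box layout, and the default values of `sigma` and `score_threshold` are all assumptions made for this example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Hypothetical helper: returns (kept indices, their decayed scores).

    Unlike standard NMS, overlapping boxes are not discarded outright.
    Their scores are decayed by exp(-iou^2 / sigma), and a box is only
    dropped once its score falls below score_threshold.
    """
    scores = list(scores)  # work on a copy so the caller's scores are untouched
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        # Pick the highest-scoring box among those not yet selected.
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_threshold:
            break
        keep.append(best)
        remaining.remove(best)
        # Decay the scores of the remaining boxes by their overlap with it.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return keep, [scores[i] for i in keep]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
keep, new_scores = soft_nms(boxes, scores)
```

In this example the second box overlaps heavily with the first, so it is kept but ranked last with a reduced score, whereas standard NMS with a typical IoU threshold would remove it entirely.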

If the above doesn't sound too appealing, there are features listed at #5410 you might find fun to work on. Have a look and let me know if anything interests you. :)

Apr 08 '22 08:04 datumbox

@datumbox Sure thing :) I'll pick up something from #5410

Apr 08 '22 09:04 lezwon

I want to try the DropConnect layer. Any other info or reference implementations I could look at would be great :smiley:

Jul 02 '22 19:07 oke-aditya

@oke-aditya There are a few reasons we haven't added DropConnect. According to the paper, Dropout is r = m * a(W v), while DropConnect is r = a((M * W) v).

a: activation, v: input, W: weights, m: Bernoulli mask over the activations, M: Bernoulli mask over the weights

As you can see, in the latter case M is applied to W, which makes for an awkward layer design. I believe you would have to implement different versions of it for Linear and Conv layers. Another issue is that it's quite old and not often used in SOTA research. These are some of the reasons we decided not to add it, at least in phases 1 and 2 of Batteries Included.
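To make the difference concrete, here is a small framework-free sketch of the two training-time forward passes for a single fully connected layer. It assumes a ReLU activation and a drop probability `p`, and it omits any rescaling or inference-time handling; the helper names (`relu`, `matvec`, `dropout_forward`, `dropconnect_forward`) are made up for this illustration and are not TorchVision APIs.

```python
import random


def relu(x):
    """Element-wise ReLU on a list of floats (the activation a)."""
    return [max(0.0, xi) for xi in x]


def matvec(W, v):
    """Matrix-vector product W v for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def dropout_forward(W, v, p, rng):
    """r = m * a(W v): one Bernoulli draw per output activation."""
    activations = relu(matvec(W, v))
    m = [1.0 if rng.random() >= p else 0.0 for _ in activations]
    return [mi * ai for mi, ai in zip(m, activations)]


def dropconnect_forward(W, v, p, rng):
    """r = a((M * W) v): one Bernoulli draw per weight."""
    M = [[1.0 if rng.random() >= p else 0.0 for _ in row] for row in W]
    masked_W = [[mij * wij for mij, wij in zip(m_row, w_row)]
                for m_row, w_row in zip(M, W)]
    return relu(matvec(masked_W, v))


rng = random.Random(0)
W = [[1.0, -2.0], [3.0, 4.0]]
v = [1.0, 1.0]
r_dropout = dropout_forward(W, v, p=0.5, rng=rng)
r_dropconnect = dropconnect_forward(W, v, p=0.5, rng=rng)
```

Note that `dropout_forward` only needs the layer's output, so it can wrap any layer, while `dropconnect_forward` has to reach into the layer's weights. That is why a separate variant would be needed for each layer type.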

Jul 04 '22 13:07 datumbox
