
inconsistency with the original paper

Open duducheng opened this issue 6 years ago • 9 comments

Hello, thanks for your nice code!

I found two inconsistencies with the original paper, and both should be easy to fix:

  1. The gamma: in the original paper, all block_mask regions are complete squares (or cubes), since the mask seeds are sampled only in the central part of the feature map.
  2. The paper says each channel uses a different mask, while your implementation uses the same mask for all channels.

I only just noticed these, and I do not know whether they are effective tricks; the paper does not discuss them in much detail :)
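To make the first point concrete, here is a minimal sketch of the gamma computation as I read Eq. 1 of the paper, next to a simplified variant that drops the valid-region correction. The function names `dropblock_gamma` and `simplified_gamma` are hypothetical and are not taken from this repository.

```python
def dropblock_gamma(drop_prob: float, block_size: int, feat_size: int) -> float:
    """Seed probability gamma, as I read Eq. 1 of the DropBlock paper.

    drop_prob is 1 - keep_prob. Seeds are only sampled where a full
    block_size x block_size block fits inside the feature map.
    """
    if block_size > feat_size:
        raise ValueError("block_size must not exceed feat_size")
    # Side length of the central region in which block centers may be sampled.
    valid = feat_size - block_size + 1
    return (drop_prob / block_size ** 2) * (feat_size ** 2 / valid ** 2)


def simplified_gamma(drop_prob: float, block_size: int) -> float:
    """Hypothetical simplified variant without the valid-region factor."""
    return drop_prob / block_size ** 2
```

The two values agree when block_size is 1 and drift apart as the block grows relative to the feature map, which is why the difference is small in many settings.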

duducheng avatar Jan 24 '19 08:01 duducheng

The gamma issue is a minor thing but I can have a look at it.

The channels share the same mask in the paper.

miguelvr avatar Jan 24 '19 10:01 miguelvr

“We experimented with a shared DropBlock mask across different feature channels or each feature channel has its DropBlock mask. Algorithm 1 corresponds to the latter, which tends to work better in our experiments.” (page 2 bottom line)
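A rough sketch of what a per-channel mask could look like in PyTorch, under my own assumptions: the function name `dropblock2d_per_channel` is hypothetical, block_size is restricted to odd values to keep the padding arithmetic simple, and the rescaling is done over the whole batch. It is not the code from this repository or from the paper.

```python
import torch
import torch.nn.functional as F


def dropblock2d_per_channel(x: torch.Tensor, drop_prob: float, block_size: int) -> torch.Tensor:
    """Apply DropBlock with an independent mask for every channel (sketch)."""
    if block_size % 2 == 0:
        raise ValueError("this sketch only handles odd block_size")
    n, c, h, w = x.shape
    if block_size > min(h, w):
        raise ValueError("block_size must fit inside the feature map")

    # Region in which a block center can sit so that the block stays complete.
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    gamma = (drop_prob / block_size ** 2) * (h * w) / (valid_h * valid_w)

    # One Bernoulli seed map per sample AND per channel, created on x's device.
    seeds = (torch.rand(n, c, valid_h, valid_w, device=x.device) < gamma).to(x.dtype)

    # Zero-pad back to (h, w), then grow every seed into a full square block.
    pad = block_size // 2
    seeds = F.pad(seeds, (pad, pad, pad, pad))
    dropped = F.max_pool2d(seeds, kernel_size=block_size, stride=1, padding=pad)

    keep = 1.0 - dropped
    # Rescale so the expected activation magnitude is roughly preserved.
    scale = keep.numel() / keep.sum().clamp(min=1.0)
    return x * keep * scale
```

A shared mask would instead sample seeds with shape (n, 1, valid_h, valid_w) and let broadcasting apply it to every channel.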

duducheng avatar Jan 24 '19 10:01 duducheng

Sure, that is easily fixable

Expect it soon

Edit: you can also do a PR if you want

miguelvr avatar Jan 24 '19 14:01 miguelvr

Hi, any updates on this? Best

huyvnphan avatar Jul 12 '19 13:07 huyvnphan

> Hi, Any updates on this? Best

I haven't had much free time to deal with this, but I will review and accept merge requests

miguelvr avatar Jul 15 '19 08:07 miguelvr

I also found some differences between the paper and the code.

JarvisKevin avatar Aug 10 '19 01:08 JarvisKevin

To solve this issue, you could have a look at this fork (only for DropBlock2D)

Eliza-and-black avatar Dec 04 '21 08:12 Eliza-and-black

> To solve this issue, you could have a look at this fork (only for DropBlock2D)

I would encourage you to do a pull request

miguelvr avatar Dec 04 '21 13:12 miguelvr

If you do look at the code linked above, note that mask_center is not created on the input's device, so the nn.ZeroPad2d call runs on the CPU by default. Since I was training on a GPU, this slowed down a single forward call of my model (which uses many DropBlock layers) from 0.15 seconds to 3 seconds.
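A minimal sketch of the kind of change I mean, assuming the seed tensor is sampled with torch.bernoulli and then padded with nn.ZeroPad2d. The function name `sample_padded_seeds` and its signature are hypothetical; only mask_center and nn.ZeroPad2d come from the linked code.

```python
import torch
from torch import nn


def sample_padded_seeds(x: torch.Tensor, gamma: float, block_size: int) -> torch.Tensor:
    """Sample block-center seeds on x's device and zero-pad them to x's spatial size.

    Assumes an odd block_size so that symmetric padding restores (h, w).
    """
    n, c, h, w = x.shape
    valid_h, valid_w = h - block_size + 1, w - block_size + 1

    # device=x.device allocates the seeds next to the input, so the padding
    # below runs on the GPU when x lives on the GPU.
    probs = torch.full((n, c, valid_h, valid_w), gamma, device=x.device, dtype=x.dtype)
    mask_center = torch.bernoulli(probs)

    pad = nn.ZeroPad2d(block_size // 2)
    return pad(mask_center)
```

Checking `mask.device == x.device` on the result is a quick way to confirm that nothing silently fell back to the CPU.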


JohnDLee avatar Jan 25 '22 04:01 JohnDLee