Results: 213 issues of Vadim Kantorov

Why would there be such a difference? Thanks!

Dropout probability here is quite high (0.5), and it is not discussed in the paper...

What is the effective loss scaling? Does it sum or mean over classes? Over the batch size? How does it interact with distributed training? Is there any scaling by the world size anywhere?
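The sum-vs-mean and world-size questions above can be illustrated with plain arithmetic. A pure-Python sketch (no real distributed setup; the per-rank losses are made up, and it assumes the DDP convention of averaging gradients across ranks):

```python
# Sketch: how per-rank `mean` reduction interacts with DDP-style averaging.
# Values are hypothetical, for illustration only.

# Per-sample losses on two hypothetical ranks (local batch size 2 each).
rank_losses = [[1.0, 3.0], [5.0, 7.0]]
world_size = len(rank_losses)

# reduction='mean' on each rank:
per_rank_mean = [sum(l) / len(l) for l in rank_losses]

# DDP all-reduces gradients with an average over ranks; the loss-value
# analogue is averaging the per-rank means:
ddp_effective = sum(per_rank_mean) / world_size

# Mean over the *global* batch (all samples pooled):
global_mean = sum(sum(l) for l in rank_losses) / sum(len(l) for l in rank_losses)

# With equal local batch sizes the two coincide, so no extra world-size
# scaling is needed; with reduction='sum' they would instead differ by a
# factor of the local batch size.
print(ddp_effective, global_mean)  # both 4.0 here
```

So under these assumptions, per-rank `mean` plus gradient averaging already reproduces the global-batch mean; extra world-size scaling would only enter with `sum` reduction or unequal local batch sizes.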

The WSR50 config has 160 epochs: https://github.com/shenyunhang/DRN-WSOD-pytorch/blob/DRN-WSOD/projects/WSL/configs/PascalVOC-Detection/wsddn_WSR_50_DC5_1x.yaml while the R50 config has 28 epochs: https://github.com/shenyunhang/DRN-WSOD-pytorch/blob/DRN-WSOD/projects/WSL/configs/PascalVOC-Detection/wsddn_R_50_DC5_1x.yaml (this holds for WSR and R configs in general). Why is this the case? Thanks!

I am trying to parse a Caffe [model](https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto). I am getting an error related to `_FieldSkipper`:

```
/sequoia/data2/kantorov/wigwam_sequoia_gpu101_105/.wigwam/prefix/bin/luajit: ...01_105/.wigwam/prefix/share/lua/5.1/protobuf/decoder.lua:333: attempt to call a nil value
stack traceback:
...01_105/.wigwam/prefix/share/lua/5.1/protobuf/decoder.lua:333: in function...
```

Hi, I'm trying to implement an L2-distances module. The code below prints:

```
function: 0x41cd5570
3.999683343136
```

which means the gradient check doesn't pass. What am I missing?

```lua
...
```
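The kind of gradient check this snippet attempts can be sketched in a language-agnostic way. Below is a pure-Python analogue (not the original Lua/Torch code; inputs are made up) for the L2 distance f(x) = ||x - y||, whose analytic gradient with respect to x is (x - y)/||x - y||, compared against central finite differences:

```python
import math

def l2_dist(x, y):
    # f(x) = ||x - y||_2
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def analytic_grad(x, y):
    # df/dx = (x - y) / ||x - y||  (defined away from x == y)
    d = l2_dist(x, y)
    return [(a - b) / d for a, b in zip(x, y)]

def numeric_grad(x, y, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    g = []
    for i in range(len(x)):
        xp = list(x); xp[i] += eps
        xm = list(x); xm[i] -= eps
        g.append((l2_dist(xp, y) - l2_dist(xm, y)) / (2 * eps))
    return g

x, y = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
ga, gn = analytic_grad(x, y), numeric_grad(x, y)
err = max(abs(a - b) for a, b in zip(ga, gn))
print(err)  # small (~1e-9), i.e. the check passes for this formula
```

A large discrepancy in such a check usually points at a sign error or a missing normalization term in the hand-written backward pass.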

@danielbear How could one ask for access to the Playroom and Primitives datasets? Pixelwise labels are very useful for object-centric research. Thanks!

### 🚀 The feature, motivation and pitch Such an in-place fusion may already be possible with dynamo, but if not, it would be quite good to have for saving memory....

Labels: triage review, feature, triaged, oncall: pt2, module: aotdispatch

### 📚 The doc issue https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html?highlight=scaled_dot_product_attention#torch.nn.functional.scaled_dot_product_attention: The code example is rather unusual and contains a `return` statement outside of any function definition. Also, is `not attn_mask` correct? Should it not...
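For context, the computation the documented function performs is softmax(QK^T / sqrt(d)) V. A minimal pure-Python sketch of that math (not PyTorch's implementation; toy matrices, additive-mask convention assumed):

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def sdpa(Q, K, V, attn_mask=None):
    """softmax(Q K^T / sqrt(d)) V over nested lists; attn_mask (if given)
    is added to the logits, so -inf entries mask positions out."""
    d = len(Q[0])
    out = []
    for qi, q in enumerate(Q):
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        if attn_mask is not None:
            logits = [l + m for l, m in zip(logits, attn_mask[qi])]
        w = softmax(logits)
        out.append([sum(wj * v[c] for wj, v in zip(w, V))
                    for c in range(len(V[0]))])
    return out

# One query attending over two keys/values.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = sdpa(Q, K, V)  # a convex combination of the rows of V
```

The sketch also shows why a boolean-vs-additive mask convention matters: with an additive mask, checking truthiness of the mask itself (as in `not attn_mask`) is a different question from whether its entries are -inf.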

Hi! In https://github.com/MarcoForte/closed-form-matting/blob/master/closed_form_matting/closed_form_matting.py#L82 `cv2.dilate` is applied to the mask, while in `matting.tar.gz` `imerode` is applied. If I understand correctly, `imerode` is the opposite of `imdilate` (https://docs.opencv.org/3.4/db/df6/tutorial_erosion_dilatation.html). Any hints on why...
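The erosion/dilation relationship the question hinges on is morphological duality: eroding a binary mask is the same as dilating its complement and complementing the result. A tiny pure-Python sketch (no OpenCV; 3×3 all-ones structuring element, toy mask):

```python
def dilate(mask):
    """Binary dilation with a 3x3 all-ones structuring element
    (out-of-bounds neighbors ignored)."""
    h, w = len(mask), len(mask[0])
    return [[max(mask[i + di][j + dj]
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if 0 <= i + di < h and 0 <= j + dj < w)
             for j in range(w)] for i in range(h)]

def erode(mask):
    """Binary erosion via duality: erode(m) = NOT dilate(NOT m)."""
    inv = [[1 - v for v in row] for row in mask]
    return [[1 - v for v in row] for row in dilate(inv)]

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
er = erode(mask)  # shrinks the 3x3 square to its center pixel
```

So dilating the foreground grows the known region while eroding shrinks it; if one codebase erodes where the other dilates, the resulting trimaps would disagree near boundaries, which seems to be exactly the discrepancy being asked about.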