Menshykov
So I've been trying to research grouped convolutions, but just found out today that these guys have already gone deep into this with a lot of hardware: https://arxiv.org/pdf/1605.06489v1.pdf Proves...
Okay, now that ResNeXt is out (https://arxiv.org/pdf/1611.05431.pdf), I'm hoping I'm not the only one who understands the importance of native grouped convolutions here? Since groups are exactly the only...
https://arxiv.org/pdf/1611.05431.pdf **Performance**. For simplicity we use Torch’s built-in grouped convolution implementation, without special optimization. We note that this implementation was brute-force and not parallelization-friendly. On 8 GPUs of NVIDIA M40,...
Actually, taking a closer look, Kaiming's paper doesn't add much novelty over https://arxiv.org/pdf/1605.06489v1.pdf, which I've already linked to; it's basically a follow-up on that study, more of a...
NVIDIA said they're planning to release an implementation of groups in their next cuDNN.
https://developer.nvidia.com/cudnn so grouped convolutions are now available in cuDNN v7.
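For anyone who wants to play with the semantics while native support lands everywhere, here's a minimal sketch using PyTorch's `groups` argument (just an illustration, not the Torch implementation quoted above): the input channels are split into `groups` independent slices, each convolved with its own set of filters.

```python
import torch
import torch.nn as nn

# Grouped convolution sketch: 64 input channels split into 32 groups,
# so each group of 2 channels is convolved independently into 4 output channels.
conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3,
                 padding=1, groups=32)

x = torch.randn(1, 64, 56, 56)   # (batch, channels, height, width)
y = conv(x)
print(y.shape)                   # torch.Size([1, 128, 56, 56])

# Weight count drops by a factor of `groups` compared to a dense conv:
# 128 * (64 / 32) * 3 * 3 = 2304 instead of 128 * 64 * 3 * 3 = 73728.
print(sum(p.numel() for p in conv.parameters() if p.dim() > 1))  # 2304
```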
It would be a good idea to test the influence of the batch size used at LSUV init time on large networks with highly variant data. It seems from the paper that you've only tested this...
Yes, it actually is not very shuffled, which means that I have to use a larger batch here to get something closer to what I would get with a smaller one...
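For context, a rough sketch of what varying the LSUV init batch could look like, assuming a PyTorch-style model (hypothetical illustration code, not the implementation discussed here; the original LSUV procedure also does an orthonormal pre-init that this skips):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def lsuv_init(model, init_batch, tol=0.1, max_iters=10):
    """Rescale each conv/linear layer so its output variance on
    `init_batch` is close to 1 (the LSUV criterion)."""
    model.eval()
    for module in model.modules():
        if not isinstance(module, (nn.Conv2d, nn.Linear)):
            continue
        captured = {}
        handle = module.register_forward_hook(
            lambda _m, _inp, out, store=captured: store.update(out=out))
        for _ in range(max_iters):
            model(init_batch)                     # forward pass to capture this layer's output
            var = captured["out"].var().item()
            if abs(var - 1.0) < tol:
                break
            module.weight.mul_(1.0 / var ** 0.5)  # rescale towards unit output variance
        handle.remove()

# A larger init batch gives a less noisy variance estimate,
# which matters when the data is not well shuffled.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(32 * 32 * 32, 10))
init_batch = torch.randn(256, 3, 32, 32)          # try different batch sizes here
lsuv_init(model, init_batch)
```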
It would be great if you noted the time it took the different setups to converge, both in epochs and in wall-clock time.
Yeah, I guess that would just take `import time`, `start = time.time()`, `end = time.time()`, logging the elapsed time to a separate file during saves and reading it back during loads. Not...
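Something along these lines, as a minimal sketch (the log file name and hook points are made up for illustration):

```python
import json
import time

TIMING_LOG = "training_time.jsonl"   # hypothetical log file name

def save_timing(epoch, epoch_start, total_elapsed):
    """Append per-epoch wall-clock time alongside the model checkpoint."""
    entry = {"epoch": epoch,
             "epoch_seconds": time.time() - epoch_start,
             "total_seconds": total_elapsed}
    with open(TIMING_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

def load_timing():
    """Read the timing log back when resuming from a checkpoint."""
    try:
        with open(TIMING_LOG) as f:
            return [json.loads(line) for line in f]
    except FileNotFoundError:
        return []

# Usage inside a training loop:
run_start = time.time()
for epoch in range(3):
    epoch_start = time.time()
    # ... train one epoch, save checkpoint ...
    save_timing(epoch, epoch_start, time.time() - run_start)
print(load_timing())
```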