Dmytro Mishkin

Results: 209 comments by Dmytro Mishkin

@wangxianliang Well, no, and I am not planning to do it in the near future. Reasons: 1) This benchmark was originally a test of "we propose cool thing and test it on CIFAR-100" papers....

I used tanh because LSUV worked the worst for it. My experience with ImageNet confirms that any batch size > 32 is OK for LSUV if the data is shuffled....

@ibmua for epochs it is very easy - 320K everywhere :) As for times - it is hard (I am lazy) because some of the trainings consist of lots of...

@ibmua
> import time
> start = time.time()
> end = time.time()

Do you really think I do it in Python? You are underestimating my laziness ;) Everything I do...

CReLU is on the way https://arxiv.org/pdf/1603.05201.pdf

@wangg12 It is already there: https://github.com/ducha-aiki/caffenet-benchmark/blob/master/Activations.md It is worse than ELU. But maybe I have made a mistake; you can check me in https://github.com/ducha-aiki/caffenet-benchmark/blob/master/prototxt/activations/caffenet128_lsuv_SELU.prototxt#L92
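
For anyone checking a SELU layer definition against ELU, a minimal scalar sketch of the two activations may help. It assumes the standard SELU constants from the self-normalizing networks paper; the function and constant names are illustrative and are not taken from the linked prototxt.

```python
import math

# Standard SELU constants (assumed here, not read from the prototxt).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def selu(x):
    """SELU: ELU with a fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)
```

In other words, SELU is ELU with a specific alpha plus an output scale, so a mismatch in either constant is the first thing to look for.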

@simbaforrest I don't use fixed initialization; I use LSUV init (https://arxiv.org/abs/1511.06422), which is stated on each page of this repo :) What is in the prototxt does not matter, because LSUV...
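
As described in the linked paper, LSUV starts from orthonormal weights and then rescales each layer until the variance of its output on a mini-batch is close to one. The sketch below illustrates that idea in NumPy under simplifying assumptions: fully connected layers without biases, tanh activations, and hypothetical function names. It is not the Caffe implementation used in this repo.

```python
import numpy as np


def orthonormal(shape, rng):
    """Return a matrix of the given 2-D shape with orthonormal rows or columns."""
    a = rng.standard_normal(shape)
    u, _, vt = np.linalg.svd(a, full_matrices=False)
    # Exactly one of u / vt has the requested shape (u is used for square matrices).
    return u if u.shape == shape else vt


def lsuv_init(layer_sizes, batch, tol=0.05, max_iter=10, seed=0):
    """Sketch of LSUV: orthonormal init, then scale each layer to unit output variance."""
    rng = np.random.default_rng(seed)
    weights = []
    x = batch
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = orthonormal((n_in, n_out), rng)
        for _ in range(max_iter):
            var = (x @ w).var()          # variance of the layer output on this batch
            if abs(var - 1.0) < tol:
                break
            w = w / np.sqrt(var)         # rescale weights toward unit variance
        weights.append(w)
        x = np.tanh(x @ w)               # feed activations to the next layer
    return weights
```

Because the scaling is estimated from a single mini-batch, the batch has to be reasonably representative of the data, which is consistent with the batch-size remark above.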

Well, it is still possible that SELU is very architecture-dependent.

Why is BVLC-GoogLeNet (Inception v1) not there? Is there some reason for it?