Andy Brock comments

Results 27 comments of



Training results(IS and FID) are not good as yours with same training process

Hi Qi, There can be a substantial amount of variance in the time to convergence for a model (I only had time to train one with this codebase as I...

Training results(IS and FID) are not good as yours with same training process

Hmm, looking at your logs against (![image](https://user-images.githubusercontent.com/7751273/57980292-837c9480-7a21-11e9-81ca-14cb406c34d1.png)), this does look like it's well outside the variance I would expect. My models are also trained on 8xV100, so I don't think...

Training results(IS and FID) are not good as yours with same training process

That plot looks to me like EMA isn't kicking in, or that the batchnorm stats being used alongside the EMA are stale. Can you pull the FID and IS stats...
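For readers unfamiliar with what "EMA kicking in" refers to, here is a minimal sketch of an exponential moving average over model weights with a delayed start. This is not the repository's code: the function name `ema_update`, the `start_step` parameter, and the dict-of-floats weight format are all assumptions made for illustration.

```python
def ema_update(ema_weights, weights, decay=0.9999, step=0, start_step=0):
    """Move ema_weights toward weights in place (hypothetical helper).

    Before start_step the averaged copy simply tracks the live weights;
    from start_step onward the moving average "kicks in".
    """
    # Assumption: a decay of 0 before start_step, i.e. a plain copy.
    d = decay if step >= start_step else 0.0
    for name, value in weights.items():
        ema_weights[name] = d * ema_weights[name] + (1.0 - d) * value
    return ema_weights


ema = {"w": 0.0}
ema_update(ema, {"w": 1.0}, decay=0.9, step=0, start_step=10)   # copy phase
ema_update(ema, {"w": 2.0}, decay=0.9, step=10, start_step=10)  # averaging phase
```

If the averaged copy is evaluated with normalization statistics gathered from the non-averaged weights, the two can disagree, which is one way to read the "stale batchnorm stats" guess above.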

Training results(IS and FID) are not good as yours with same training process

Hi feifei-Liu, Thanks, this is helpful! Can you also post a link to your training logs, and the script you launched with? I'm working on tracking down any possible differences...

Request: make cuDNN optional?

Hi Alec, You're right that cuDNN is just a speed boost. The reason I put it as a dependency is that these models can be huge (training the discriminative models from...

GUI crashing when trying to render

Hmm, I'm going to guess that's a Windows 10-related bug; any chance you can try running it on a Windows 7 or a Linux machine? A quick Google search shows...

theano and lasagne version

Most likely Theano 0.8 and Lasagne 0.2, IIRC.

Are the configurations for the classification correct?

IIRC (it's been a long time) those config patterns are for inference. Batch size of 50, and a double-up augmentation strategy, each epoch on a Maxwell Titan X took 6+...

Output format

The 5.8 there looks like error on the validation split (so training on 45,000 images, testing on 5,000 from the train set). If you want to use the CIFAR-10 test...
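As a rough illustration of the split described above, here is a minimal sketch that holds out 5,000 of the 50,000 CIFAR-10 training indices for validation. The function name `train_val_split` and the seeded shuffle are assumptions for illustration. The actual split used in the repository may be chosen differently.

```python
import random


def train_val_split(n_total=50000, n_val=5000, seed=0):
    """Return (train_indices, val_indices) over the training set (hypothetical helper)."""
    indices = list(range(n_total))
    # Assumption: a seeded shuffle so the held-out set is reproducible.
    random.Random(seed).shuffle(indices)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = train_val_split()
print(len(train_idx), len(val_idx))  # 45000 5000
```

An error measured on `val_idx` is a validation error, not a test error, since those images come from the training set.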

output = net(input,w,*arch)

You don't need to modify the input size, just change the pooling size on [line 997](https://github.com/ajbrock/SMASH/blob/master/SMASH.py#L997) and [line 1193](https://github.com/ajbrock/SMASH/blob/master/SMASH.py#L1193) to be the appropriate size for your network. The current version...