Results: 9 comments by zzz

Hi, is it possible that your data disk is not local to your machine, so the latency comes from the data pipeline? We have trained the tiny version a few times,...

I am using a 2x2 TPU for NesT-tiny (I think it is called a single TPU).

What do you mean by not working: does training not converge at all, or is the performance just poor? It would be useful if you could provide more information. Two suggestions: - It is better to...

Did you train these methods from scratch or fine-tune from their pre-trained checkpoints? That can make an important difference. Our scripts currently only train from scratch, but it can be easy...

Hi, in our experiments we did not find that our methods have convergence issues. It would be great if you could provide more detailed training info so I can help, e.g. what...

Hi @szagoruyko, could you please let me know exactly how you do the mean/std processing? @ibmua's code at https://github.com/ibmua/Breaking-Cifar computes a different mean and std than the fb-resnet-torch provided...

I see. I tested your code. The error rate decreased. Thanks @ibmua

@szagoruyko I see. I will give that a try. It is really tricky. Thanks @ibmua for spotting that. I have run so many different tests using the WRN code to...

Cool. I will check your code.