pytorch-image-models
Add SWA support
With the recent release of PyTorch 1.6, adding SWA should be easier. It also improves accuracy. Are you going to add it?
I know avg_checkpoints.py does something similar. However, having SWA built into the train script is simpler if you know a priori that you'll use it.
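For context, PyTorch 1.6 added `torch.optim.swa_utils`, which is what makes SWA easier to wire into a training loop. The following is only a minimal sketch of how that could look, not how this repo's train script does or would do it. The function name `train_with_swa`, the toy model, the random data, and the hyperparameters are all made up for illustration.

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn
from torch.utils.data import DataLoader, TensorDataset


def train_with_swa(model, loader, epochs=6, swa_start=3, lr=0.1, swa_lr=0.05):
    """Hypothetical helper: plain SGD training, then SWA averaging from `swa_start` on."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    swa_model = AveragedModel(model)          # holds the running average of the weights
    swa_scheduler = SWALR(optimizer, swa_lr=swa_lr)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        model.train()
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        if epoch >= swa_start:
            # Fold the current weights into the average once per epoch.
            swa_model.update_parameters(model)
            swa_scheduler.step()

    # BatchNorm statistics must be recomputed for the averaged weights.
    update_bn(loader, swa_model)
    return swa_model


# Toy data and model, purely illustrative.
torch.manual_seed(0)
data = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
loader = DataLoader(data, batch_size=16)
net = nn.Sequential(nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU(), nn.Linear(16, 2))
swa_net = train_with_swa(net, loader)
```

The returned `swa_net` wraps a copy of the model with averaged weights and can be called like the original model for evaluation.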
@hal-314 I'm hesitant to add it because I haven't seen anything indicating it'd be better than heavy EMA + avg_checkpoints and a decent LR schedule. I have yet to see any ImageNet results with it that would challenge the best training recipes I'm aware of. Most of the examples I saw in the blog post and paper were on CIFAR, and those results don't always translate to training on larger datasets. I've seen lots of CIFAR -> ImageNet flops.
I do think SWA would show an improvement over baseline SGD training; the question is how much versus the other options. I'm open to adding it, but I'd like to see some competitive results with it before throwing it on the pile with everything else. That'll take some time and effort given the training length, so I'm not planning to do this just yet. I'm open to contributions, but I'd like to see some good results on at least one of the smaller ResNets.
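To make the EMA vs. SWA comparison above concrete, here is a small sketch of the two update rules on plain Python lists standing in for weights. It assumes the usual textbook forms (EMA weights recent snapshots more heavily, SWA weights all snapshots equally). The function names, the decay value, and the snapshot numbers are invented for illustration and are not taken from this repo.

```python
def ema_update(ema, weights, decay=0.9):
    """Exponential moving average: older snapshots decay geometrically."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema, weights)]


def swa_update(avg, weights, n_averaged):
    """Equal-weight running mean over all snapshots seen so far (SWA-style)."""
    return [a + (w - a) / (n_averaged + 1) for a, w in zip(avg, weights)]


# Four fake one-parameter "snapshots" of a model.
snapshots = [[1.0], [2.0], [3.0], [4.0]]
ema = snapshots[0]
swa = snapshots[0]
for n, w in enumerate(snapshots[1:], start=1):
    ema = ema_update(ema, w, decay=0.5)
    swa = swa_update(swa, w, n)

print(ema, swa)  # [3.125] [2.5]
```

With the same snapshots, the EMA ends up closer to the latest value while the SWA-style mean sits at the plain average, which is the practical difference between the two.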
I wanted to ask a similar question, but I totally agree with your explanation. Now, my question is: how do I use avg_checkpoints? Assuming I am saving checkpoints whenever there are performance gains, how do I use avg_checkpoints on the fly, or use it at all? Thanks in advance.
Similar question. I found that the best 10 checkpoints are always very similar in performance when using EMA, so avg_checkpoints can't give a performance improvement in this case.
That's a separate script:
https://github.com/rwightman/pytorch-image-models/blob/master/avg_checkpoints.py
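For anyone wondering what checkpoint averaging amounts to, the core idea can be sketched in a few lines. This is a simplified illustration and not the linked script itself: the function name `average_state_dicts` is hypothetical, and the example uses plain floats in place of tensors so it runs without PyTorch. The same arithmetic works on dicts of `torch.Tensor` values, assuming each checkpoint has already been loaded into a state dict with matching keys.

```python
def average_state_dicts(state_dicts):
    """Return the element-wise mean of several state dicts with identical keys.

    Values only need to support `+` and `/`, so floats and tensors both work.
    Integer entries (e.g. step counters) come out as floats.
    """
    if not state_dicts:
        raise ValueError("need at least one state dict")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("state dicts have mismatched keys")

    n = len(state_dicts)
    averaged = {}
    for key in keys:
        total = state_dicts[0][key]
        for sd in state_dicts[1:]:
            total = total + sd[key]   # builds a new value, inputs are not mutated
        averaged[key] = total / n
    return averaged


# Two fake "checkpoints" with scalar parameters, for illustration only.
checkpoints = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 1.0}]
avg = average_state_dicts(checkpoints)
print(avg)  # {'w': 2.0, 'b': 0.5}
```

Averaging happens after training, on saved files, rather than on the fly, which is why it lives in a separate script. For a model with BatchNorm, the averaged weights should be validated again before use.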