
Allowed optimizers for Image Classification using PyTorch

Open mrmhodak opened this issue 1 year ago • 5 comments

Currently, the Image Classification/LARS/PyTorch combination lists no compliant optimizers.

However, the 2.2 submission by Habana used what looks like a FusedLars optimizer (their own implementation) for PyTorch on ResNet.

Does that mean that a built-in FusedLars is MLPerf-compliant for Image Classification?

mrmhodak avatar Mar 02 '23 16:03 mrmhodak

Actually, the optimizers that we want to use are FusedLAMB and FusedSGD, imported from apex.optimizers:

from apex.optimizers import FusedLAMB, FusedSGD

Would these be MLPerf-compliant?

mrmhodak avatar Mar 07 '23 03:03 mrmhodak

@sgpyc to keep me honest. LAMB is not an allowed optimizer for RN50. Only LARS and SGD are allowed. https://github.com/mlcommons/training/tree/master/image_classification#optimizer

The rules already allow apex.optimizers.FusedSGD here

nv-rborkar avatar Mar 10 '23 19:03 nv-rborkar

3/23 update: The team is working on our own Fused Lars implementation for PyTorch. I should have code to share for next week's meeting.

mrmhodak avatar Mar 23 '23 15:03 mrmhodak

Here is the Fused Lars code: https://github.com/ROCmSoftwarePlatform/apex/blob/master/apex/optimizers/fused_lars.py

Please review it and confirm that it is good to use.

mrmhodak avatar Mar 30 '23 15:03 mrmhodak

Thanks, the implementation looks good as long as Nesterov momentum is not used (the reference doesn't use Nesterov).

nv-rborkar avatar Apr 13 '23 16:04 nv-rborkar
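
For readers unsure what "LARS without Nesterov momentum" means in practice, here is a minimal pure-Python sketch of one LARS step for a single layer using classical momentum. It is one common formulation of LARS and is not confirmed to match either the linked fused_lars.py or the MLPerf reference. The function name `lars_step`, the parameters `eta` and `eps`, and the default values are assumptions made for illustration.

```python
import math


def lars_step(w, g, v, lr, momentum=0.9, weight_decay=5e-5, eta=0.001, eps=1e-9):
    """One LARS step for a single layer with classical (non-Nesterov) momentum.

    w, g, v are equal-length lists of floats (weights, gradients, velocity).
    Returns the updated (weights, velocity). All names and defaults here are
    illustrative assumptions, not values taken from any reference code.
    """
    w_norm = math.sqrt(sum(x * x for x in w))
    g_norm = math.sqrt(sum(x * x for x in g))

    # Layer-wise trust ratio; fall back to 1.0 if either norm is zero.
    if w_norm > 0 and g_norm > 0:
        trust = eta * w_norm / (g_norm + weight_decay * w_norm + eps)
    else:
        trust = 1.0

    # Classical momentum: accumulate the scaled update into the velocity.
    new_v = [
        momentum * vi + lr * trust * (gi + weight_decay * wi)
        for wi, gi, vi in zip(w, g, v)
    ]

    # The weights move by the velocity only. A Nesterov variant would add a
    # second, look-ahead gradient term at this point; this sketch omits it.
    new_w = [wi - vi for wi, vi in zip(w, new_v)]
    return new_w, new_v
```

Only the final weight update would differ under Nesterov momentum, so when reviewing a fused implementation that is the step to check (or the flag to confirm is disabled).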