efficientnet Half precision training very slow and returning nan loss

Open acmilannesta opened this issue 5 years ago • 3 comments

Based on the TF docs, I used the "mixed_float16" policy to train the EfficientNet model, but training became extremely slow, almost 10 times slower.

https://www.tensorflow.org/api_docs/python/tf/keras/mixed_precision/experimental/Policy?hl=en

Here is some of the code:

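from tensorflow.keras import backend as K
from tensorflow.keras.mixed_precision import experimental as mixed_precision
# Compute in float16 while keeping variables in float32
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_policy(policy)
# Raise the Keras epsilon from its 1e-7 default for float16 numerics
K.set_epsilon(1e-3)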

I used tfkeras.py to load the EfficientNet backbone. Do I need to put the code above in the "tfkeras.py" script?
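For context on the `K.set_epsilon(1e-3)` line in the snippet above, here is a minimal sketch, using only the Python standard library, of why a larger epsilon is often paired with float16. It assumes the Keras default epsilon of 1e-7; `to_float16` is a hypothetical helper written for this illustration, not part of TensorFlow or efficientnet. It only illustrates float16 rounding and does not by itself explain the slowdown or the NaN loss reported here.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Hypothetical helper for illustration: it round-trips the value through
    the standard library's half-precision struct format ('e').
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


KERAS_DEFAULT_EPSILON = 1e-7  # Keras default fuzz factor
RAISED_EPSILON = 1e-3         # value set by K.set_epsilon(1e-3) above


def relative_error(x: float) -> float:
    """Relative error introduced by storing x in float16."""
    return abs(to_float16(x) - x) / x


# The default epsilon falls in float16's subnormal range, so it is stored
# with a large relative error; the raised epsilon is stored accurately.
print("rel. error of 1e-7 in float16:", relative_error(KERAS_DEFAULT_EPSILON))
print("rel. error of 1e-3 in float16:", relative_error(RAISED_EPSILON))

# Adding the default epsilon to 1.0 is a no-op in float16, so it cannot
# guard an expression like 1 / (x + eps); the raised epsilon still registers.
print("1 + 1e-7 == 1 in float16:", to_float16(1.0 + KERAS_DEFAULT_EPSILON) == 1.0)
print("1 + 1e-3 == 1 in float16:", to_float16(1.0 + RAISED_EPSILON) == 1.0)

# Values well below the smallest float16 subnormal (about 6e-8) flush to zero.
print("1e-8 in float16:", to_float16(1e-8))
```

The same rounding applies to any small constant or gradient held in float16, which is one reason mixed-precision training is typically combined with loss scaling.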

Feb 26 '20 19:02 acmilannesta

I ran into the same problem. Have you found a solution?

Mar 08 '20 09:03 ChengTsang

I tried upgrading TF to 2.2-rc0.

Mar 13 '20 18:03 acmilannesta

@acmilannesta Did upgrading to TF 2.2 solve this problem? What dataset are you training on?

Apr 13 '20 11:04 AshishSardana
