Default learning rate in TF2 SSD MobileNet V2 config file is way too high. Is it a typo?
The learning rate set in the TF2 SSD MobileNet V2 config file is 10x higher than that of the other SSD MobileNet models, which causes the training loss to become extremely high. Is it a typo?
The default ssd_mobilenet_v2_320x320_coco17_tpu-8.config configuration has this for the learning rate:
optimizer {
  momentum_optimizer: {
    learning_rate: {
      cosine_decay_learning_rate {
        learning_rate_base: .8
        total_steps: 50000
        warmup_learning_rate: 0.13333
        warmup_steps: 2000
      }
    }
    momentum_optimizer_value: 0.9
  }
}
Meanwhile, the FPNLite version (ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.config) has this:
optimizer {
  momentum_optimizer: {
    learning_rate: {
      cosine_decay_learning_rate {
        learning_rate_base: .08
        total_steps: 50000
        warmup_learning_rate: .026666
        warmup_steps: 1000
      }
    }
    momentum_optimizer_value: 0.9
  }
}
When I train with the default values in the ssd_mobilenet_v2_320x320_coco17_tpu-8.config file, the huge learning rate throws training way off. When I change the values from .8 to .08 and .13333 to .013333, training works much better. I think whoever wrote the config file missed a decimal point.
Loss graph BEFORE changing learning rate values (the loss is way higher)
Loss graph AFTER changing learning rate values
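For anyone who wants to try the lower values without hand-editing, here is a minimal sketch that scales the two learning-rate fields in a pipeline config's text by 0.1 (the field names come from the snippet above; the helper function and regex approach are my own, not part of the Object Detection API):

```python
import re

def scale_learning_rates(config_text, factor=0.1):
    """Scale learning_rate_base and warmup_learning_rate in a
    pipeline config's text by `factor` (0.1 undoes the 10x)."""
    def repl(match):
        key, value = match.group(1), float(match.group(2))
        return "%s: %g" % (key, value * factor)
    # Only touch the two learning-rate fields; leave step counts alone.
    return re.sub(
        r"(learning_rate_base|warmup_learning_rate):\s*([0-9.]+)",
        repl, config_text)

config = """
cosine_decay_learning_rate {
  learning_rate_base: .8
  total_steps: 50000
  warmup_learning_rate: 0.13333
  warmup_steps: 2000
}
"""
print(scale_learning_rates(config))
```

This is a plain text transform; for anything more involved you would parse the config with the pipeline protobuf instead of a regex.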
This is a wonderful question. I wonder why no one has commented on this? Any suggestions?
Did you get an answer for this query?
Hi @Annieliaquat, yes I did! They tried to implement a change to fix it, but it got rejected: https://github.com/tensorflow/models/pull/10531
The high learning rate is intended for training with TPUs. You can change it manually back to a lower learning rate if you're just training with a CPU or GPU.
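That matches the common linear-scaling heuristic: TPU configs are tuned for a much larger global batch size, so on a single GPU with a smaller batch you scale the learning rate down proportionally. A rough sketch (the batch sizes below are illustrative assumptions, not values taken from the config):

```python
def scale_lr_for_batch(reference_lr, reference_batch_size, local_batch_size):
    """Linear learning-rate scaling heuristic: scale the LR in
    proportion to the ratio of batch sizes."""
    return reference_lr * (local_batch_size / reference_batch_size)

# The TPU config uses learning_rate_base 0.8; assuming a TPU global
# batch of 512 and a single-GPU batch of ~51 (both hypothetical),
# the scaled rate lands near the FPNLite value of 0.08:
print(scale_lr_for_batch(0.8, 512, 51.2))
```

Whether to scale linearly (or use a square-root rule) is itself a judgment call, so treat the result as a starting point for tuning, not a guarantee.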
Okay, thanks a lot!