stackoverflow_lr trainer problem
I am using FedML to train stackoverflow_lr with the hyperparameters recommended in the original paper 'Adaptive Federated Optimization' (learning rate = 100, optimizer = SGD), but I cannot get the expected results. I wonder if the implementation of the trainer in FedML differs from TFF's. I noticed that you use clip_grad_norm_ to avoid NaN loss; otherwise the loss does not even drop. Is this operation optional, or is it also used in TFF? I would appreciate any advice on the training process.
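For context, this is roughly how I understand the clipping to be applied in a training step (a minimal sketch, not FedML's actual trainer; the model dimensions and max_norm value are my assumptions):

```python
import torch
import torch.nn as nn

# Placeholder multi-label logistic regression model for the Stack Overflow
# tag-prediction task; the input/output sizes here are assumed, not FedML's.
model = nn.Linear(10000, 500)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=100.0)

def train_step(x, y, max_norm=1.0):  # max_norm=1.0 is an assumed value
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # Rescale gradients so their global L2 norm is at most max_norm;
    # without this, lr=100 can blow up the weights and produce NaN loss.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item()
```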
@ZSL98 Have you resolved the problem? Our results here seem reasonable: https://doc.fedml.ai/simulation/benchmark/BENCHMARK_simulation.html
The issue has already been addressed.