FedNLP
Accuracy and loss didn't improve for FedAvg on 20news
I am trying to reproduce the FedAvg results on the 20news dataset, but FedAvg does not seem to work on this task. Unlike the centralized run, the FedAvg eval accuracy and loss did not improve after many rounds (eval acc 0.063, eval loss 2.969). The results and experiment settings are available here: https://wandb.ai/haofuml/fednlp_bl?workspace=user-haofuml
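For context, the reported numbers can be compared against what a model that has learned nothing would score. The sketch below assumes 20 roughly balanced classes and a standard cross-entropy loss; the helper name `near_chance` and its tolerances are made up for illustration and are not part of FedNLP.

```python
import math

NUM_CLASSES = 20  # 20news has 20 categories

# A model that predicts a uniform distribution over the classes gets
# accuracy 1/20 on balanced data and cross-entropy ln(20), about 2.996.
chance_accuracy = 1.0 / NUM_CLASSES
chance_loss = math.log(NUM_CLASSES)


def near_chance(acc, loss, acc_tol=0.05, loss_tol=0.1):
    """Return True if both metrics sit close to the uniform-guess baseline.

    The tolerances are arbitrary illustrative values, not tuned thresholds.
    """
    return (abs(acc - chance_accuracy) <= acc_tol
            and abs(loss - chance_loss) <= loss_tol)


# Values reported in this issue.
reported_acc, reported_loss = 0.063, 2.969
looks_untrained = near_chance(reported_acc, reported_loss)
```

With the values from this issue the check comes out true, which is consistent with a global model that is still close to its initial state rather than one that trained and then diverged.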
@haofuml I see. I will check this issue this weekend.
Hi, any progress on this? Thanks!
@shubham-malaviya Set the optimizer to "fedOPT"; it should work.
I'm facing the same issue. Is there a known reason for the lack of improvement with FedAvg? Do you have a workaround? @chaoyanghe
Same here: FedAvg shows no improvement in accuracy or loss. A possible reason is that the parameter update from the clients to the server gets stuck somewhere.
3335 2022-03-13,22:21:21.453 - {FedAVGAggregator.py (45)} - add_local_trained_result(): add_model. index = 1
3335 2022-03-13,22:21:21.453 - {FedAvgServerManager.py (52)} - handle_message_receive_model_from_client(): b_all_received = False
3335 2022-03-13,22:21:23.862 - {FedAVGAggregator.py (45)} - add_local_trained_result(): add_model. index = 2
3335 2022-03-13,22:21:23.870 - {FedAvgServerManager.py (52)} - handle_message_receive_model_from_client(): b_all_received = False
3335 2022-03-13,22:21:28.058 - {FedAVGAggregator.py (45)} - add_local_trained_result(): add_model. index = 3
3335 2022-03-13,22:21:28.064 - {FedAvgServerManager.py (52)} - handle_message_receive_model_from_client(): b_all_received = False
3335 2022-03-13,22:21:28.365 - {FedAVGAggregator.py (45)} - add_local_trained_result(): add_model. index = 4
3335 2022-03-13,22:21:28.372 - {FedAvgServerManager.py (52)} - handle_message_receive_model_from_client(): b_all_received = True
3335 2022-03-13,22:21:39.657 - {FedAVGAggregator.py (70)} - aggregate(): len of self.model_dict[idx] = 5
3335 2022-03-13,22:21:39.960 - {FedAVGAggregator.py (87)} - aggregate(): aggregate time cost: 11
3335 2022-03-13,22:21:39.961 - {tc_transformer_trainer.py (137)} - eval_model(): len(test_dl) = 942, n_batches = 942
indexes of clients: [37 26 78 91 49]
3335 2022-03-13,22:21:49.148 - {tc_transformer_trainer.py (180)} - eval_model(): best_accuracy = 0.009692
3335 2022-03-13,22:21:49.148 - {tc_transformer_trainer.py (188)} - eval_model(): {'mcc': -0.012776380246762163, 'tp': 0, 'tn': 0, 'fp': 0, 'fn': 0, 'acc': 0.009426447158789167, 'eval_loss': 3.0077073222259556}
3335 2022-03-13,22:21:49.149 - {FedAVGAggregator.py (97)} - client_sampling(): client_indexes = [37 26 78 91 49]
The log looks like the above. It seems that the model on the server never receives the updated parameters from the clients, which I think is why the results never change.
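One way to test this hypothesis is to measure how much the global parameters change across an aggregation step. The following is a minimal, dependency-free sketch of sample-weighted FedAvg plus such a check. It is not FedNLP's or FedML's actual aggregator: `fedavg`, `max_abs_change`, and the toy parameter dictionaries are hypothetical, and a real run would apply the same comparison to the tensors in the model's state dict.

```python
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]  # toy stand-in for a model state dict


def fedavg(client_updates: List[Tuple[int, Params]]) -> Params:
    """Average client parameters, weighted by each client's sample count."""
    total = sum(n for n, _ in client_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    reference = client_updates[0][1]
    averaged: Params = {}
    for name, values in reference.items():
        acc = [0.0] * len(values)
        for n, params in client_updates:
            weight = n / total
            for i, v in enumerate(params[name]):
                acc[i] += weight * v
        averaged[name] = acc
    return averaged


def max_abs_change(old: Params, new: Params) -> float:
    """Largest absolute per-element difference between two parameter sets."""
    return max(abs(a - b) for name in old for a, b in zip(old[name], new[name]))


global_params = {"w": [0.0, 0.0], "b": [0.0]}

# Case 1: clients actually trained, so the aggregate moves away from the old global model.
trained = [
    (10, {"w": [1.0, 2.0], "b": [0.5]}),
    (30, {"w": [3.0, 0.0], "b": [-0.5]}),
]
new_global = fedavg(trained)
change_trained = max_abs_change(global_params, new_global)

# Case 2: clients return the global parameters untouched (the suspected failure mode).
stale = [(10, global_params), (30, global_params)]
change_stale = max_abs_change(global_params, fedavg(stale))
```

If the equivalent of `change_stale` in a real run stays at zero round after round, the clients' local updates are either not being applied or not reaching the server; if it is clearly nonzero, the cause is more likely elsewhere, for example in the learning rate or in how the aggregated weights are loaded before evaluation.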