Zhenheng TANG

Results 21 comments of Zhenheng TANG

@hangxu0304 @AbdulMoqeet Hi, I found some possible causes of this bug. Please check this code: Original version: https://github.com/FedML-AI/FedML/blob/50d8a45d27675343a7b05a9b31279f6764d3f2ad/fedml_api/standalone/fedavg/fedavg_trainer.py#L45 Current version: https://github.com/FedML-AI/FedML/blob/8ccc24cf2c01b868988f5d5bd65f1666cf5526bc/fedml_api/standalone/fedavg/fedavg_api.py#L64 In the original version, the global model is deepcopied...

> This might be true for the standalone version. But in distributed version, each client only needs to update this global model and then upload it to the server. My...

@hangxu0304 For the distributed implementation, I found these differences: https://github.com/FedML-AI/FedML/blob/8ccc24cf2c01b868988f5d5bd65f1666cf5526bc/fedml_api/standalone/fedavg/my_model_trainer_classification.py#L44 https://github.com/FedML-AI/FedML/blob/50d8a45d27675343a7b05a9b31279f6764d3f2ad/fedml_api/distributed/fedavg/FedAVGTrainer.py#L29 The original version has no gradient clipping, whereas the current version clips gradients. This could...
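To make the gradient-clipping difference concrete, here is a minimal sketch of clipping by global L2 norm, written in plain Python so it runs without any framework. The function name `clip_grad_norm`, the flat list of gradients, and the `max_norm` value are assumptions for illustration only. They are not taken from the linked FedML files, which may use a different call or threshold (PyTorch offers `torch.nn.utils.clip_grad_norm_` for this purpose).

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale a flat list of gradients so their L2 norm is at most max_norm.

    Hypothetical helper for illustration; returns (new_grads, original_norm).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        # Shrink every component by the same factor, preserving direction.
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm


# A gradient with norm 5.0 is scaled down to norm 1.0.
clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=1.0)

# A gradient already inside the threshold is left untouched.
unchanged, _ = clip_grad_norm([0.3, 0.4], max_norm=1.0)
```

Because clipping changes the size of large updates, two training loops that differ only in this step can follow different trajectories, so it is a plausible thing to compare between the two versions.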

I have run some new experiments, and the results verify that the lack of a deepcopy of the global model in the standalone version does indeed induce bugs. But I cannot merge my code to...
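The aliasing problem described here can be sketched in plain Python. This is an illustration under stated assumptions, not the FedML code: `local_update`, `run_round`, the `deltas` list, and the dict of lists standing in for a model's weights are all hypothetical. The idea is that when every simulated client trains on the same global object, one client's update leaks into the next client's starting point.

```python
import copy


def local_update(weights, delta):
    """Hypothetical local training step: shifts every weight by delta, in place."""
    for name in weights:
        weights[name] = [w + delta for w in weights[name]]
    return weights


def run_round(global_weights, deltas, use_deepcopy):
    """Simulate one round in which each client starts from the global weights."""
    results = []
    for delta in deltas:
        # Without a deepcopy, every client mutates the same shared object.
        if use_deepcopy:
            start = copy.deepcopy(global_weights)
        else:
            start = global_weights
        results.append(local_update(start, delta))
    return results


# Shared object: the second client starts from the first client's result.
shared = run_round({"fc": [0.0, 0.0]}, [1.0, 2.0], use_deepcopy=False)

# Deepcopy: each client starts from the untouched global weights.
isolated = run_round({"fc": [0.0, 0.0]}, [1.0, 2.0], use_deepcopy=True)
```

In the shared case both entries of `shared` refer to one object holding the accumulated update, while `isolated` keeps one independent result per client.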

> @wizard1203 Can you please validate that this runs on [dev](https://github.com/FedML-AI/FedML/tree/dev/v0.7.0) after pulling the latest changes? @fedml-dimitris Hi, the dev/v0.7.0 branch has some changes to the core code. It may not...

Did you run the original project, or re-implement the algorithm with your own code? If you re-implemented it, can you provide the accuracy of FedAvg and the other baselines, and check if...

Got it. So FedAvg has similar accuracy to VHL in your case. Which neural network do you use? A simple CNN may not have enough capacity to...

Could you try lr=0.1 or lr=0.3? Also, check the total number of clients, the number of clients per round, the number of local epochs, and the non-IID degree of the datasets.

I see. The accuracy of FedAvg can be improved further. You can try lr=0.3, or lr=0.1 with momentum. Also, the first conv layer of resnet18 should be 3x3 instead of...

Hello, I have uploaded these two files. Thanks for your interest~