Zhenheng TANG


I also find that the current code for the Dirichlet partition method cannot generate balanced client datasets. This may make training harder.
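To illustrate (a minimal sketch assuming a NumPy label array, not FedML's exact loaders): because each class is split with independently drawn Dirichlet proportions, the per-client totals vary widely, especially for small alpha.

```python
# Minimal sketch of the usual Dirichlet partition; names are illustrative,
# not FedML's actual implementation.
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, class by class, with Dirichlet weights."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = np.where(labels == cls)[0]
        rng.shuffle(cls_idx)
        # Proportion of this class that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client, part in zip(client_indices, np.split(cls_idx, splits)):
            client.extend(part.tolist())
    return client_indices

labels = np.random.default_rng(0).integers(0, 10, size=50_000)
parts = dirichlet_partition(labels, num_clients=10, alpha=0.5)
print("client sizes:", [len(p) for p in parts])  # typically far from 5000 each
```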

> @wizard1203 could you help to check if this improves the performance?

Seems it can.

> BytePS is for data center-based distributed training, while FedML (e.g., FedAvg) is edge-based distributed training. The particular assumptions of FL include:
>
> 1. heterogeneous data distribution across devices...

> FedML supports multiple parameter servers for communication efficiency via hierarchical FL and decentralized FL.
> In hierarchical FL, there are group parameter servers that split the total...
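To make the two-level structure concrete, here is a rough sketch of hierarchical FedAvg aggregation; the function names and model representation (dicts of NumPy arrays) are my own assumptions, not FedML's API.

```python
# Rough sketch of two-level (hierarchical) FedAvg aggregation. Each model is
# assumed to be a dict of NumPy arrays; names are illustrative only.
import numpy as np

def weighted_average(models, weights):
    """FedAvg-style weighted average of a list of model state dicts."""
    total = sum(weights)
    return {
        key: sum(w * m[key] for w, m in zip(weights, models)) / total
        for key in models[0]
    }

def hierarchical_round(groups):
    """groups: list of (client_models, client_sample_counts), one per group server.

    Each group parameter server first averages its own clients; the global
    server then averages the group models, weighted by group sample counts.
    """
    group_models, group_sizes = [], []
    for client_models, counts in groups:
        group_models.append(weighted_average(client_models, counts))
        group_sizes.append(sum(counts))
    return weighted_average(group_models, group_sizes)
```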

@chaoyanghe Thanks for your detailed explanation. Maybe I can try to implement it myself, and when I finish it I would like to push it to your master branch.

> @wizard1203 Do you mean modifying based on this code?
> https://github.com/FedML-AI/FedML/tree/master/fedml_experiments/distributed/fedavg

@chaoyanghe No, it probably needs to be based on the code in fedml_core. Anyway, I may try to do...

I met the same problem when using simulation... there are always failures in each round.

![image](https://user-images.githubusercontent.com/22996426/189470503-ce2e1fe5-8086-464a-a818-db504ad14119.png)

I explicitly raise a RuntimeError in the fit() function, but only "5 failures" is printed, without the error log.

![image](https://user-images.githubusercontent.com/22996426/189470549-e6917a30-b5e5-4f32-b21c-20d0e4e29975.png)
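As a workaround (a sketch, not FedML code), one can wrap the trainer's fit() so the worker prints the full traceback itself before the failure is counted; `TracebackTrainer` and the wrapped `trainer` object are hypothetical names.

```python
# Sketch of a debugging workaround: wrap fit() so any exception in a
# simulated worker is printed with its full traceback before re-raising,
# instead of being silently counted as a failure.
import traceback

class TracebackTrainer:
    def __init__(self, trainer):
        self.trainer = trainer  # the real trainer object (hypothetical)

    def fit(self, *args, **kwargs):
        try:
            return self.trainer.fit(*args, **kwargs)
        except Exception:
            traceback.print_exc()  # surface the hidden error log
            raise
```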

@AbdulMoqeet @hangxu0304 Hi, could you try again with a smaller number of local epochs, e.g., E=1? A large number of local epochs usually makes training harder to converge.
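For reference, a minimal sketch of what E means here, assuming a standard PyTorch model and data loader (not FedML's trainer): E is the number of local passes over the client data between aggregations, so a larger E means more local SGD steps and more client drift on non-IID data.

```python
# Sketch of a FedAvg-style local update; `E` is the number of local epochs.
# More local steps between synchronizations let client models drift further
# from the global model on non-IID data, which can hurt convergence.
import torch

def local_update(model, loader, loss_fn, lr=0.01, E=1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(E):          # E local epochs before the next aggregation
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()   # sent to the server for averaging
```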