
(New data partition strategy) Extended Dirichlet strategy by combining Pathological heterogeneous setting and Practical heterogeneous setting in pFL.

Open liyipeng00 opened this issue 1 year ago • 5 comments

Recently, I found a new data partition strategy called the Extended Dirichlet strategy ~~~ ours :), which could be added to this repo.

It combines the two common partition strategies (i.e., quantity-based class imbalance and distribution-based class imbalance in Li et al. (2022), or the Pathological heterogeneous setting and the Practical heterogeneous setting in Zhang et al. (2023)) to generate arbitrarily heterogeneous data. The difference is an extra step of allocating classes (labels), which determines the number of classes per client (denoted by $C$), before allocating samples via the Dirichlet distribution (with concentration parameter $\alpha$).

This issue was originally posted to FedLab, and the implementation is in convergence. You can find more details in the paper "Convergence Analysis of Sequential Federated Learning on Heterogeneous Data". [Figure: partitions for $C \in \{2, 5, 10\}$ (rows) with $\alpha \in \{0.1, 1.0, 10.0\}$ (columns)]
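For readers who just want the idea in code, here is a minimal Python sketch of the two-step procedure described above (illustrative only, not the implementation in convergence or in the PR; the function name and defaults are made up):

```python
import numpy as np

def extended_dirichlet_partition(labels, num_clients, C, alpha, seed=0):
    """Sketch of the Extended Dirichlet (ExDir) partition.

    Step 1: randomly assign C classes to every client.
    Step 2: for each class, split its samples among the clients that hold it,
            with proportions drawn from Dir(alpha).
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1

    # Step 1: allocate C distinct classes to every client.
    clients_per_class = [[] for _ in range(num_classes)]
    for client in range(num_clients):
        for c in rng.choice(num_classes, size=C, replace=False):
            clients_per_class[c].append(client)

    # Step 2: Dirichlet split of each class's samples over its owner clients.
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        owners = clients_per_class[c]
        if not owners:  # class held by no client; a real implementation would redraw step 1
            continue
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        proportions = rng.dirichlet(alpha * np.ones(len(owners)))
        split_points = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for owner, part in zip(owners, np.split(idx_c, split_points)):
            client_indices[owner].extend(part.tolist())

    return client_indices
```

A small $C$ reproduces the pathological (quantity-based) setting, while $\alpha$ controls how evenly each class is spread across the clients that own it, which is exactly what the figure above varies.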

Li, Q., Diao, Y., Chen, Q., & He, B. (2022, May). Federated learning on non-iid data silos: An experimental study. In 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp. 965-978). IEEE.

Zhang, J., Hua, Y., Wang, H., Song, T., Xue, Z., Ma, R., & Guan, H. (2023, June). FedALA: Adaptive local aggregation for personalized federated learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 9, pp. 11237-11244).

liyipeng00 avatar Nov 03 '23 06:11 liyipeng00

You can contribute to our project by submitting a pull request that adds the Extended Dirichlet strategy. We may add it when we have free time.

TsingZ0 avatar Nov 04 '23 11:11 TsingZ0

Thanks for your approval. I'm happy to contribute to this repo. Since I'm not familiar with how to create pull requests, it may take some time. By the way, we found that the first implementation of the Dir-Partition comes from "Bayesian nonparametric federated learning of neural networks", which could be clarified in the README.md.

liyipeng00 avatar Nov 04 '23 12:11 liyipeng00

^o^/ I have added ExDir successfully. I have only added some code, so it is safe to merge this strategy into the original codebase.

One example: MNIST, num_clients=10, num_classes=10, C=5 and alpha=100.0

Note that here we set min_require_size_per_label = max(C * num_clients // num_classes // 5, 1), so it is expected that some clients end up with only 4 labels (fewer than C=5). You can set it larger to meet your requirements, though that may increase the search time in some cases.

[Image: partition result for the example above]
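For illustration, here is a hypothetical sketch of how such a constraint on the class-allocation step might be enforced (the PR's actual logic may differ): redraw the allocation until every label is held by at least min_require_size_per_label clients.

```python
import numpy as np

def allocate_classes(num_clients, num_classes, C, min_require_size_per_label, rng):
    # Redraw the step-1 allocation until every label is owned by enough clients;
    # a larger threshold means more redraws (the "search time" mentioned above).
    while True:
        clients_per_class = [[] for _ in range(num_classes)]
        for client in range(num_clients):
            for c in rng.choice(num_classes, size=C, replace=False):
                clients_per_class[c].append(client)
        if min(len(owners) for owners in clients_per_class) >= min_require_size_per_label:
            return clients_per_class

rng = np.random.default_rng(0)
num_clients, num_classes, C = 10, 10, 5
min_require_size_per_label = max(C * num_clients // num_classes // 5, 1)  # = 1 here
clients_per_class = allocate_classes(num_clients, num_classes, C,
                                     min_require_size_per_label, rng)
```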

liyipeng00 avatar Nov 06 '23 07:11 liyipeng00

Nice work! We will review it in a few weeks, after the CVPR deadline.

TsingZ0 avatar Nov 12 '23 05:11 TsingZ0

Best of luck with your CVPR paper!

liyipeng00 avatar Nov 12 '23 08:11 liyipeng00

Sorry for the late reply due to my busy schedule; I only have time to check PRs these days. Since PFLlib has moved forward with massive changes, your original PR can no longer be merged directly. Could you please update your PR to match the latest version? Thanks for your time!

TsingZ0 avatar Apr 18 '24 09:04 TsingZ0

Thanks for your approval. I have updated the pull request with the Extended Dirichlet strategy added. Feel free to change the code to match the style of PFLlib, and just let me know if any issues appear.

python generate_MNIST.py noniid - exdir

I would be very grateful if you could add some statements introducing exdir to the README.md.

One simple example

This strategy combines the popular Dirichlet-based data partition strategy with Quantity-based class imbalance.
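As a rough illustration of that combination (using the hypothetical sketch from earlier in this thread, with stand-in balanced labels rather than real MNIST data): C controls how many classes each client holds, and alpha controls how skewed the per-class sample shares are.

```python
import numpy as np

labels = np.repeat(np.arange(10), 6000)  # stand-in for 10 balanced classes
for C, alpha in [(2, 0.1), (5, 1.0), (10, 10.0)]:
    parts = extended_dirichlet_partition(labels, num_clients=10, C=C, alpha=alpha, seed=0)
    sizes = [len(p) for p in parts]
    print(f"C={C}, alpha={alpha}: client sizes = {sizes}")
```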

Thanks again for your approval.

liyipeng00 avatar Apr 19 '24 07:04 liyipeng00

Thank you for your update; I'll check it as soon as possible.

TsingZ0 avatar Apr 19 '24 09:04 TsingZ0

All done, please check it.

TsingZ0 avatar Apr 23 '24 08:04 TsingZ0

Thanks for your patience and kindness. I have checked it and have no further problems.

liyipeng00 avatar Apr 24 '24 03:04 liyipeng00