feat(baselines) Add FLANDERS baseline
Issue
Implements #2595
Description
Implements the FLANDERS baseline in Flower.
Thanks for the comments @jafermarq! It's still a WIP, so a lot of stuff still has to be adjusted :)
If you'd like to add some plots to your readme, please do so in a new directory named `_static` placed in the same directory as your README.
Got it!
We think it's better not to include the logs from the experiments (I see currently you are including the `outputs/` directory).
Yeah, first time using Hydra and I didn't realize it created output files. Now everything should be fine (I've disabled the creation of files from base.yaml).
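For reference, disabling Hydra's automatic output files from the config usually amounts to a few lines like the following (a sketch of the common pattern, not the actual contents of this PR's base.yaml; exact keys can vary by Hydra version):

```yaml
# base.yaml -- stop Hydra from writing .hydra/ metadata and from
# changing into a per-run output directory
hydra:
  output_subdir: null
  run:
    dir: .
```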
Ideally, all the dependencies needed are added to the `pyproject.toml` so the environment can be built easily by simply doing `poetry install`. Depending on what packages you use, this might actually get tricky to achieve. If that's the case, it's fine to provide additional instructions on how to install those packages after doing `poetry install`. Then you can install additional ones via the standard `pip install <>` method.
I've set up the environment in Poetry now. I will check that everything works once I've finished integrating the code, by creating a new environment from scratch.
Ping me once there are some minimal instructions on how to set up the Python environment and how to run an experiment. From there I can take a closer look at the code. Happy to discuss your baseline anytime.
Got it! Thanks again.
Hi @jafermarq, the baseline should be ready. Let me know if there are any issues or suggestions, I'll take care of them.
Thanks for the review @jafermarq!
* When I ran your code I saw quite a few messages of the type: `ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.` Is this expected?
We use only one local epoch, hence it fails to converge. That's normal.
* Don't forget to run the formatting script: (with your environment activated) `cd ..`, then `./dev/format-baseline.sh flanders`. This will automatically correct quite a few small issues and highlight many more. Generally these are easy to fix.
* Once the above is completed, you'll have to run the `./dev/test-baseline.sh flanders` command. Likely other issues will be highlighted (again, easy to fix for the most part). But do ping me if some of them are not so easy to fix.
I'm in the process of correcting these issues and will be ready for another review.
@jafermarq I see that here pylint doesn't pass (`flanders/utils.py:11:0: E0401: Unable to import 'natsort' (import-error)`), however locally it runs without warnings. Any clue?
I added natsort to pyproject.toml and now the test pass.
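The fix is a one-line dependency declaration in `pyproject.toml` (the version specifier shown here is illustrative, not taken from the actual PR):

```toml
[tool.poetry.dependencies]
natsort = "^8.4.0"  # needed by flanders/utils.py; picked up by `poetry install`
```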
Hey @edogab33,
I still think there is a lot of value that can be added to this baseline if you could add at least one other strategy. All Flower baselines compare to at least one other strategy or method. Both Bulyan and Krum are implemented in Flower, so adding them as part of the `--multirun` shouldn't be too hard. What often works best is to use the `defaults:` structure in Hydra configs and have the strategy-specific part of the config in its own directory under `conf/strategy`. You can see how FedNova did this.
Thanks for the insight! Let me see how I can do what you suggest.
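The `defaults:` layout described above typically looks like this (a sketch of the standard Hydra pattern; the file names and fields are illustrative, not the actual PR config):

```yaml
# conf/base.yaml
defaults:
  - strategy: flanders   # loads conf/strategy/flanders.yaml
  - _self_

num_rounds: 50

# conf/strategy/flanders.yaml (one file per strategy,
# e.g. also conf/strategy/krum.yaml, conf/strategy/bulyan.yaml)
name: flanders
```

With this layout, comparing strategies becomes a single sweep, e.g. `python -m flanders.main --multirun strategy=flanders,krum,bulyan`.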
Another thing: is it possible not to have the contents of `flanders/dataset_files` as part of the baseline (i.e. can these be downloaded/preprocessed by the person running the baseline)? Those files add up to ~8 MB, which is not so little.
Unfortunately I don't remember where I downloaded them, and the House dataset was the result of a notebook found somewhere... so there is no automatic way to download it. The best I can do is compress them and unzip them locally in main, so they should be much lighter in the repo.
Merry Christmas!
Merry Christmas to you and thanks again for your review!
Ok! Let's not worry about the datasets for now. About using another strategy as a point of comparison with FLANDERS: maybe let's wait a bit until the FLANDERS strategy API matches that of other strategies (like FedAvg). This is important, otherwise your custom server won't be compatible with other strategies.
Hi @jafermarq, happy new year!
I think the FLANDERS APIs are OK now. I left a comment about the only thing that differs with respect to other strategies.
I also split the config files to fit other strategies, as requested. I will run experiments with Krum and include them in the plots as soon as I have the time; in the meanwhile we can go ahead with other issues.
Let me know what you think. Thank you.
Hey @edogab33 , let me know if you'd like to discuss something related to the review above
Hey @jafermarq, sorry for the wait. We've been busy running new experiments and reworking most of the paper, which means we've also changed the code. We'll be updating the arXiv version soon so you can see our latest results. In the meantime, I'll get started on updating the code in this pull request.
@jafermarq I've updated the code, now it's aligned with the most recent version that I've written. Let me update the README and test everything before starting a code review.
Amazing! Let me know when I should review it.
Hi @edogab33, I fixed a couple of small formatting errors. All tests pass. Let me know if there is something pending you'd like to add to your baseline!
Thanks a lot @jafermarq. I'll add a new notebook and perform the experiments again.
Great! Please ping me when ready or if you'd like me to look into something. 🙌
What about this one @edogab33 ?
@jafermarq I'm working on it right now. Experiments are almost done, I have to update the notebook for the plots and the README. Things about this project changed a lot, here's the new version: https://arxiv.org/pdf/2303.16668
@jafermarq should I rename the title of the PR to "feat(baselines) Add FLANDERS"?
Yes please, maybe "feat(baselines) Add FLANDERS baseline"?
Hi @jafermarq, everything is ready. However, a test is not passing and I can't figure out why.
@jafermarq do you know what's the problem with the test?
All seems good now. I'll review soon.
Thanks! :)
@jafermarq Thanks for the review! Hope the results are pretty much the same as mine.
I've addressed your issues and realized that if the .sh script is run instead of launching batches of experiments individually, you'll get all results in one file, whereas the notebook expects each experiment to have its own (i.e. `all_results_[dataset]_[flanders/no_flanders].csv`).
Now the notebook should automatically organize results into the files it expects.
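That splitting step could look roughly like the following (a minimal sketch: the column names `dataset` and `strategy` are assumptions for illustration, not taken from the actual notebook):

```python
import pandas as pd


def split_results(combined_csv: str) -> list[str]:
    """Split a combined results file into one CSV per (dataset, defense)
    pair, following the all_results_[dataset]_[flanders/no_flanders].csv
    naming scheme. Returns the list of files written."""
    df = pd.read_csv(combined_csv)
    written = []
    for (dataset, strategy), group in df.groupby(["dataset", "strategy"]):
        tag = "flanders" if strategy == "flanders" else "no_flanders"
        out = f"all_results_{dataset}_{tag}.csv"
        group.to_csv(out, index=False)
        written.append(out)
    return written
```

Each per-experiment file can then be picked up by the plotting notebook without any manual reshuffling.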
Hope to hear from you soon.
Thanks! But to generate the results you show in the README, should I run the two commands in the Expected Results section, or those in run.sh instead? I'm doing the former, but please let me know.
Good point, that's an oversight from my side.
The best way, now that I've updated the notebook to divide the results automatically, is to run the bash script. For this reason, I removed the former way of running the experiments, to avoid mistakes. However, if you've already launched the experiments with that method it's perfectly fine; note that you should also run them for MNIST, though.
By the way, as you suggested, I updated main.py to create the output directory if not present.
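That fix is typically a one-liner near the top of main.py (the directory name below is illustrative, not the actual path used in the PR):

```python
import os


def ensure_output_dir(path: str = "results") -> str:
    """Create the output directory if it doesn't exist; a no-op if it does."""
    os.makedirs(path, exist_ok=True)
    return path
```

Using `exist_ok=True` makes the call idempotent, so re-running an experiment doesn't fail when the directory is already there.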
@jafermarq