feat(baselines) Add FLANDERS baseline
Issue
Implements #2595
Description
Implements the FLANDERS baseline in Flower.
Thanks for the comments @jafermarq! It's still a WIP, so a lot of stuff still has to be adjusted :)
If you'd like to add some plots to your readme, please do so in a new directory named `_static` placed in the same directory as your README.
Got it!
We think it's better not to include the logs from the experiments (I see currently you are including the `outputs/` directory).
Yeah, first time using Hydra and I didn't realize it created output files. Now everything should be fine (I've disabled the creation of files from base.yaml).
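For reference, disabling Hydra's automatic output files from the config usually amounts to a few lines like the following (a sketch of the common pattern, not the actual contents of this PR's base.yaml; exact keys can vary by Hydra version):

```yaml
# base.yaml -- stop Hydra from writing .hydra/ metadata and from
# changing into a per-run output directory
hydra:
  output_subdir: null
  run:
    dir: .
```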
Ideally, all the dependencies needed are added to the `pyproject.toml` so the environment can be built easily by simply doing `poetry install`. Depending on what packages you use, this might actually get tricky to achieve. If that's the case, it's fine to provide additional instructions on how to install those packages after doing `poetry install`. Then you can install additional ones via the standard `pip install <>` method.
I've set up the environment in Poetry now. I will check that everything works once I've finished integrating the code, by creating a new environment from scratch.
Ping me once there are some minimal instructions on how to set up the Python environment and how to run an experiment. From there I can take a closer look at the code. Happy to discuss your baseline anytime.
Got it! Thanks again.
Hi @jafermarq, the baseline should be ready. Let me know if there are any issues or suggestions, I'll take care of them.
Thanks for the review @jafermarq!
* When I ran your code I saw quite a few messages of the type: `ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.` Is this expected?
We use only one local epoch, hence it fails to converge. That's normal.
* Don't forget to run the formatting script: (with your environment activated) `cd ..`, then `./dev/format-baseline.sh flanders`. This will automatically correct quite a few small issues and highlight many more. Generally these are easy to fix.
* Once the above is completed, you'll have to run the `./dev/test-baseline.sh flanders` command. Likely other issues will be highlighted (again, easy to fix for the most part). But do ping me if some of them are not so easy to fix.
I'm in the process of correcting these issues and will be ready for another review.
@jafermarq I see that here pylint doesn't pass (`flanders/utils.py:11:0: E0401: Unable to import 'natsort' (import-error)`), however locally it runs without warnings. Any clue?
I added natsort to pyproject.toml and now the test pass.
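The fix is a one-line dependency declaration in `pyproject.toml` (the version specifier shown here is illustrative, not taken from the actual PR):

```toml
[tool.poetry.dependencies]
natsort = "^8.4.0"  # needed by flanders/utils.py; picked up by `poetry install`
```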
Hey @edogab33,
I still think there is a lot of value that can be added to this baseline if you could add at least one other strategy. All Flower baselines compare to at least one other strategy or method. Both Bulyan and Krum are implemented in Flower, so adding them as part of the `--multirun` shouldn't be too hard. What often works best is to use the `defaults:` structure in Hydra configs and have the strategy-specific part of the config in its own directory under `conf/strategy`. You can see how FedNova did this.
Thanks for the insight! Let me see how I can do what you suggest.
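The `defaults:` layout described above typically looks like this (a sketch of the standard Hydra pattern; the file names and fields are illustrative, not the actual PR config):

```yaml
# conf/base.yaml
defaults:
  - strategy: flanders   # loads conf/strategy/flanders.yaml
  - _self_

num_rounds: 50

# conf/strategy/flanders.yaml (one file per strategy,
# e.g. also conf/strategy/krum.yaml, conf/strategy/bulyan.yaml)
name: flanders
```

With this layout, comparing strategies becomes a single sweep, e.g. `python -m flanders.main --multirun strategy=flanders,krum,bulyan`.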
Another thing: is it possible not to have the contents of `flanders/dataset_files` as part of the baseline (i.e. can these be downloaded/preprocessed by the person running the baseline)? Those files add up to ~8 MB, which is not so little.
Unfortunately I don't remember where I downloaded them, and the House dataset was the result of a notebook found somewhere... so there is no automatic way to download it. The best I can do is compress them and unzip them locally in main, so they should be much lighter in the repo.
Merry Christmas!
Merry Christmas to you and thanks again for your review!
Ok! Let's not worry about the datasets for now. About using another strategy as a point of comparison with FLANDERS: maybe let's wait a bit until the FLANDERS strategy API matches that of other strategies (like FedAvg). This is important, otherwise your custom server won't be compatible with other strategies.
Hi @jafermarq, happy new year!
I think the FLANDERS APIs are OK now. I left a comment about the only thing that differs with respect to other strategies.
I also split the config files to fit other strategies, as requested. I will run experiments with Krum and include them in the plots as soon as I have the time; in the meanwhile we can go ahead with other issues.
Let me know what you think. Thank you.
Hey @edogab33 , let me know if you'd like to discuss something related to the review above
Hey @jafermarq, sorry for the wait. We've been busy running new experiments and reworking most of the paper, which means we've also changed the code. We'll be updating the arXiv version soon so you can see our latest results. In the meantime, I'll get started on updating the code in this pull request.
@jafermarq I've updated the code, now it's aligned with the most recent version that I've written. Let me update the README and test everything before starting a code review.
Amazing! Let me know when I should review it.
Hi @edogab33, I fixed a couple of small formatting errors. All tests pass. Let me know if there is something pending you'd like to add to your baseline!
Thanks a lot @jafermarq. I'll add a new notebook and perform the experiments again.
Great! Please ping me when ready or if you'd like me to look into something. 🙌
What about this one @edogab33 ?
@jafermarq I'm working on it right now. Experiments are almost done, I have to update the notebook for the plots and the README. Things about this project changed a lot, here's the new version: https://arxiv.org/pdf/2303.16668
@jafermarq should I rename the title of the PR to "feat(baselines) Add FLANDERS"?
Yes please, maybe "feat(baselines) Add FLANDERS baseline"?
Hi @jafermarq, everything is ready. However, a test is not passing and I can't figure out why.
@jafermarq do you know what's the problem with the test?
All seems good now. I'll review soon.
Thanks! :)
@jafermarq Thanks for the review! Hope the results are pretty much the same as mine.
I've addressed your issues and realized that if the .sh script is run instead of launching batches of experiments individually, you'll get all results in one file, whereas the notebook expects each experiment to have its own (i.e. `all_results_[dataset]_[flanders/no_flanders].csv`).
Now the notebook should automatically organize results into the files it expects.
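That splitting step could look roughly like the following (a minimal sketch: the column names `dataset` and `strategy` are assumptions for illustration, not taken from the actual notebook):

```python
import pandas as pd


def split_results(combined_csv: str) -> list[str]:
    """Split a combined results file into one CSV per (dataset, defense)
    pair, following the all_results_[dataset]_[flanders/no_flanders].csv
    naming scheme. Returns the list of files written."""
    df = pd.read_csv(combined_csv)
    written = []
    for (dataset, strategy), group in df.groupby(["dataset", "strategy"]):
        tag = "flanders" if strategy == "flanders" else "no_flanders"
        out = f"all_results_{dataset}_{tag}.csv"
        group.to_csv(out, index=False)
        written.append(out)
    return written
```

Each per-experiment file can then be picked up by the plotting notebook without any manual reshuffling.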
Hope to hear from you soon.
Thanks! But to generate the results you show in the README, should I run the two commands in the Expected Results section, or those in run.sh instead? I'm doing the former, but please let me know.
Good point, that's an oversight from my side.
The best way, now that I've updated the notebook to divide the results automatically, is to run the bash script. For this reason, I removed the former way of running the experiments, to avoid mistakes. However, if you've already launched the experiments with that method it's perfectly fine; note that you should also run them for MNIST, though.
By the way, as you suggested, I updated main.py to create the output directory if not present.
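That fix is typically a one-liner near the top of main.py (the directory name below is illustrative, not the actual path used in the PR):

```python
import os


def ensure_output_dir(path: str = "results") -> str:
    """Create the output directory if it doesn't exist; a no-op if it does."""
    os.makedirs(path, exist_ok=True)
    return path
```

Using `exist_ok=True` makes the call idempotent, so re-running an experiment doesn't fail when the directory is already there.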
@jafermarq