flower
Multiple clients control from one thread
Is it possible to control more than one client with a single fl.client.start_client() call, to avoid creating multiple threads?
For example for a VisionClassificationClient client:
clients = [VisionClassificationClient(client_setting.cid, model, xy_train, xy_test) for client in range(10)]
fl.client.start_client(server_address, clients)
Thanks for the question @vvv94. What you're suggesting isn't possible yet, but it could be a nice improvement.
Some background: Flower currently uses one process per client because it isolates clients from each other. That allows systems researchers to create custom scenarios, e.g., to limit the resources (network/compute/...) one client has over the others. This also isolates the ML framework (e.g., TensorFlow), which can cause issues if multiple clients run in the same process.
We are, however, thinking about ways to make it easier to start many clients at once, and your suggestion could be a good candidate. Are you mainly concerned about having too many processes/threads running at the same time? Or is it more about having a convenient way of starting multiple clients from Python?
Thank you for your interest @danieljanes. Running a large number of processes/threads at the same time can indeed cause many problems. For example, I am unable to run more than 20 clients on the same machine. I am trying to create an example that starts multiple clients using multiprocessing/threading, but it does not seem to work. I also believe that starting multiple clients from the same Python thread and running client training/evaluation in batches could be a very useful feature. For example, if we want to start 100 clients on one machine, it would be useful to train/evaluate them in batches of 10, with the server receiving the updates once all clients have completed training. That way, the hardware limitations of a single machine would not be an issue.
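To make the batching idea concrete, here is a minimal sketch of what I have in mind. It does not use any Flower API: run_in_batches and fake_local_training are hypothetical names, and fake_local_training is only a stand-in for one client's local training step. The point is just that at most batch_size clients are active at any time, and the combined results are returned only after every batch has finished.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def fake_local_training(cid: int) -> Dict[str, int]:
    # Hypothetical stand-in for one client's training round (not a Flower call).
    return {"cid": cid, "num_examples": 10 * (cid + 1)}


def run_in_batches(
    client_ids: Iterable[int],
    work: Callable[[int], Dict[str, int]],
    batch_size: int,
) -> List[Dict[str, int]]:
    """Run `work` for every client id, with at most `batch_size` running at once."""
    ids = list(client_ids)
    results: List[Dict[str, int]] = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # One worker thread per client in the current batch; leaving the
        # `with` block waits for the whole batch before the next one starts.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results.extend(pool.map(work, batch))
    # Only now would the collected updates be handed to the server.
    return results


updates = run_in_batches(range(100), fake_local_training, batch_size=10)
print(len(updates))
```

Because all clients share one process here, this does not give the per-client isolation mentioned above, so framework state (e.g., TensorFlow sessions) would have to be kept separate by the user.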
Thanks for providing more details @vvv94. I agree that a more resource-efficient way to start clients could be a great feature, especially for research and simulations.
We are thinking about different ways to do this:
- Stop clients which are not active entirely (i.e., kill the process) - that would mean that only clients which are used for training/evaluation are running in an active process, other clients would "awake" every now and then to connect to the server and check if there is work for them
- Start multiple clients in the same process, as you suggested (and perhaps even batch them). This could be a good way for experienced users to run even more clients at the same time. One challenge is the isolation of individual clients because it can cause issues when you have multiple TensorFlow/PyTorch models running in the same process, but that'll be something the user has to take care of.
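As a rough illustration of the first option, the sketch below keeps one OS process per client (so clients stay isolated) but caps how many client processes exist at the same time. It is only a sketch under assumptions: CLIENT_SNIPPET is a hypothetical stand-in for a real client entry point such as a client script, run_client_process and run_clients are made-up helper names, and nothing here connects to a Flower server or implements the "wake up and check for work" part.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Hypothetical stand-in for a real client program; it only prints its client id.
CLIENT_SNIPPET = "import sys; print('client ' + sys.argv[1] + ' done')"


def run_client_process(cid: int) -> str:
    """Run one client in its own Python interpreter process and return its output."""
    completed = subprocess.run(
        [sys.executable, "-c", CLIENT_SNIPPET, str(cid)],
        capture_output=True,
        text=True,
        check=True,  # raise if the client process exits with an error
    )
    return completed.stdout.strip()


def run_clients(num_clients: int, max_active: int) -> List[str]:
    """Run all clients, with at most `max_active` client processes alive at once."""
    # The threads only wait on subprocesses; the actual work happens in the
    # separate interpreter processes, which keeps the clients isolated.
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        return list(pool.map(run_client_process, range(num_clients)))


for line in run_clients(num_clients=6, max_active=2):
    print(line)
```

The trade-off compared with running several clients in one process is the start-up cost of each new process, which would include loading the ML framework every time a client becomes active.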
Would you be interested in contributing? The implementation of flwr.client.start_client should be fairly easy to read; perhaps it can provide a starting point.
I have a similar problem: when I try to start 40 clients at the same time on the same machine, only 25 of them work.