FedScale
Pull mechanism for client-server communication
FedScale is currently built on the assumption that the central server communicates with clients through a push mechanism, where the server initiates signals (handshake/training/notTraining/etc.) to available devices. Would it be possible to consider a pull-based system, where the device initiates the communication by periodically sending requests to the server to ask for its next-step actions? In a realistic setting, the device would periodically ping the server once its training criteria are met (enough data, sufficient battery, app open, etc.), and the server would respond with the model gradients for federated training if the client is selected.
Hello. Thanks for trying FedScale. We want to note that FedScale is in fact already a pull-based system and can support real deployments. For example, clients/executors periodically ping the server for their next-step actions, just as you describe. :)
Hi @fanlai0990, from my understanding, the server samples a selection of clients, then uses the client executors to push messages to those clients for signals like CLIENT_TRAIN and SHUT_DOWN. This seems like a push mechanism to me.
Some variable names may be misleading. In FedScale, each executor drives the execution of its client. The executor pings (pulls from) the aggregator for the next steps, and it may receive CLIENT_TRAIN if its client is selected, or DUMMY_MSG, in which case it does nothing. Essentially, the client is polling the aggregator.
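To make the polling pattern concrete, here is a minimal in-process sketch. It is not FedScale's actual code: the `Aggregator` class, its `poll` method, and `run_executor` are hypothetical names used only for illustration, and only the message names CLIENT_TRAIN, DUMMY_MSG, and SHUT_DOWN come from this thread. The point it shows is that the aggregator never contacts a client on its own; it only answers when an executor asks.

```python
from dataclasses import dataclass, field

# Message names taken from the discussion above.
CLIENT_TRAIN = "CLIENT_TRAIN"
DUMMY_MSG = "DUMMY_MSG"
SHUT_DOWN = "SHUT_DOWN"


@dataclass
class Aggregator:
    """Hypothetical aggregator: it only replies to polls, never pushes."""
    selected: set = field(default_factory=set)  # clients picked for this round
    done: bool = False                          # set once training has finished

    def poll(self, client_id):
        """Return the next-step message for the client that is asking."""
        if self.done:
            return SHUT_DOWN
        if client_id in self.selected:
            # Hand out the training task once, then fall back to DUMMY_MSG.
            self.selected.discard(client_id)
            return CLIENT_TRAIN
        return DUMMY_MSG


def run_executor(aggregator, client_id, max_polls):
    """Hypothetical executor loop: the client side initiates every exchange."""
    received = []
    for _ in range(max_polls):
        msg = aggregator.poll(client_id)
        received.append(msg)
        if msg == SHUT_DOWN:
            break
        # A real executor would train on CLIENT_TRAIN and sleep between polls.
    return received
```

In this sketch, a selected client sees CLIENT_TRAIN on its first poll and DUMMY_MSG afterwards, while an unselected client only ever sees DUMMY_MSG until the aggregator shuts down.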
Thanks for the clarification!
By the way, are you considering a new pull-based device check-in system?
@AmberLJC yes, we imagine that when a device is ready for training and has new data to train on, it will periodically check in with the server. The server then decides whether the device should train, based on its training history. This makes more sense in an async system.
@AmberLJC @fanlai0990 I think this is what we were discussing earlier about having a separate check-in server that decouples devices checking in from the selector picking them.
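One way the decoupling described above could look is sketched below. Everything here is an assumption made for illustration, not an existing FedScale component: `CheckinRegistry`, `select`, the `ttl_seconds` freshness window, and the "fewest past rounds first" policy are all hypothetical. The registry only records which devices have checked in recently, and a separate selector reads from it and applies the training history.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CheckinRegistry:
    """Hypothetical check-in server state: records when each device last checked in."""
    ttl_seconds: float = 60.0                      # assumed freshness window
    _last_seen: dict = field(default_factory=dict)

    def check_in(self, device_id, now=None):
        """Called by a device when it is ready to train and has new data."""
        self._last_seen[device_id] = time.time() if now is None else now

    def available(self, now=None):
        """Return devices whose last check-in is still within the TTL."""
        now = time.time() if now is None else now
        return sorted(
            d for d, seen in self._last_seen.items()
            if now - seen <= self.ttl_seconds
        )


def select(registry, rounds_trained, k, now=None):
    """Hypothetical selector: prefer devices that have trained the fewest rounds.

    `rounds_trained` maps device id to the number of past training rounds
    and stands in for the "training history" mentioned in the thread.
    """
    candidates = registry.available(now)
    candidates.sort(key=lambda d: (rounds_trained.get(d, 0), d))
    return candidates[:k]
```

Because the selector only reads the registry, devices can check in at any time without waiting for a selection round, which is the property that matters for an async system.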