Rosetta
Capability & Usability for Federated Learning
Our company is working on applying federated learning in blockchain scenarios. I did some investigation into Rosetta, and I am not sure of its capability and usability in federated learning settings.
Specifically, it would be great if you could help answer the following questions:
- To what extent does Rosetta support heterogeneous computation? As I understand it, Rosetta achieves great benchmark performance for horizontal FL. However, in vertical FL settings, where each party holds several dimensions of the dataset as well as the corresponding weights, is Rosetta capable of handling the training and evaluation process?
- Is it possible to perform real "local compute" with Rosetta? In common federated learning practice, homomorphic encryption and secure aggregation are widely used to secure the training process, so that no private data can leak. My understanding is that Rosetta uses MPC protocols rather than homomorphic encryption to secure multi-party computation. If we follow FL guidelines, local compute is crucial: no sensitive data is sent to a third party, and only computed sums, intermediate results, or gradients go over the communication channel. Have you considered this scenario for the MPC solution, or do you plan to support it?
- For blockchain scenarios, can we deploy Rosetta in a decentralized way? In the given examples, even if only 2 parties hold data, a 3rd party (coordinator) is required to run the example. From a blockchain point of view, this is a "centralized" component. Is it possible to remove the coordinator and enable a party-to-party setting? Also, to use FL in blockchain scenarios, we want to track the communication content of each party, to make sure no sensitive data is sent out directly. This guarantee would be more convincing if it came from a blockchain mechanism. However, to achieve this, we need to intercept the data flow and obtain the content of each communication. Is that possible in Rosetta?
Your questions seem to be very general. Besides, MPC and generic FL are quite different in some ways, while they are also closely related to each other. So I can only give some brief opinions here as far as I can.
- 1
To what extent does Rosetta support heterogeneous computation?
YES. Rosetta supports datasets that are arbitrarily partitioned. You can refer to the examples in our tutorials to see how we support the so-called VFL setting.
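To make the vertical (column-wise) partition concrete, here is a minimal plain-Python sketch. It is not Rosetta code: the data, the `split_columns` helper, and the party names are all hypothetical. It only shows that a linear score over the joint dataset decomposes into one partial score per party. In an MPC framework these values would be handled in secret-shared form rather than exchanged in the clear as done here.

```python
# Hypothetical toy data: 3 samples with 4 features each.
full_rows = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.5, 2.5, 3.5],
    [2.0, 0.0, 1.0, 1.0],
]
full_weights = [0.1, 0.2, 0.3, 0.4]


def split_columns(rows, cut):
    """Split every row at column index `cut` (vertical partition)."""
    left = [row[:cut] for row in rows]
    right = [row[cut:] for row in rows]
    return left, right


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


# Party A holds features 0-1 and their weights; party B holds features 2-3.
x_a, x_b = split_columns(full_rows, 2)
w_a, w_b = full_weights[:2], full_weights[2:]

# Each party can compute its partial score on its own columns.
partial_a = [dot(row, w_a) for row in x_a]
partial_b = [dot(row, w_b) for row in x_b]

# The joint score is the sum of the partial scores.
joint = [pa + pb for pa, pb in zip(partial_a, partial_b)]

# Reference: the same score computed on the unpartitioned data.
reference = [dot(row, full_weights) for row in full_rows]
```

The same decomposition is what makes "each party holds several dimensions and the corresponding weights" workable in principle; the security question is only how the partial results are combined.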
In typical MPC style, the private inputs are encoded into a ciphertext-like state from the beginning, and all subsequent computation is carried out on this ciphertext by the participants collaboratively and securely. There has also been a lot of research on specific ML algorithms and specific data-partition types, much of which will be integrated into Rosetta as needed.
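As an illustration of what a "ciphertext-like state" can look like, here is a minimal sketch of additive secret sharing, a common building block of MPC protocols. It is a generic textbook construction, not Rosetta's actual encoding; the ring size `MODULUS`, the function names, and the example values are assumptions made for this sketch.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size for this sketch


def share(value, n_parties=2):
    """Split `value` into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; any proper subset is uniformly random."""
    return sum(shares) % MODULUS


def add_shared(xs, ys):
    """Add two shared values; each party adds its own shares locally."""
    return [(x + y) % MODULUS for x, y in zip(xs, ys)]


# Two private inputs (hypothetical values), one per data owner.
input_a = 5200
input_b = 4800

shares_a = share(input_a)
shares_b = share(input_b)

# The sum is computed on shares only; no party sees the other's input.
shares_total = add_shared(shares_a, shares_b)
total = reconstruct(shares_total)
```

Only the final `reconstruct` step, which the parties must agree to run, turns the result back into plaintext.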
- 2
Is it possible to perform real "local compute" with Rosetta?
As I said above, all non-public data is kept in a ciphertext state during computation in MPC, so no sensitive data is sent to any party.
- 3
For blockchain scenarios, can we deploy Rosetta in a decentralized way?
In my opinion, blockchain and Rosetta are orthogonal and can be combined to support specific upper-layer applications.
Currently, in the 3-party protocol used in Rosetta, P2 is just a helper that accelerates the computation; there is nothing sensitive that this party alone can receive or derive during the whole computation. This is unlike the typical so-called coordinator in the FL setting, who can decrypt and view some raw plaintext.
Besides, there are also MPC protocols that work without such a helper, although their performance is lower. We are also working on integrating such protocols.
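The helper role described above can be sketched with a generic Beaver-triple multiplication. This is a textbook technique chosen for illustration, not the exact protocol Rosetta runs; `MODULUS` and all function names are assumptions. The point is that the helper only produces input-independent randomness and never receives anything derived from `x` or `y`.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size for this sketch


def share2(value):
    """Split `value` into two additive shares mod MODULUS."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS


def reconstruct2(shares):
    """Recombine a pair of additive shares."""
    return (shares[0] + shares[1]) % MODULUS


def helper_make_triple():
    """Helper (P2): sample random a, b, set c = a*b, and share all three.

    The helper never sees the private inputs x and y.
    """
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    c = (a * b) % MODULUS
    return share2(a), share2(b), share2(c)


def multiply_shared(x_sh, y_sh, triple):
    """P0 and P1 multiply shared x and y using the helper's triple."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    x0, x1 = x_sh
    y0, y1 = y_sh
    # P0 and P1 open only the masked values d = x - a and e = y - b.
    d = (x0 - a0 + x1 - a1) % MODULUS
    e = (y0 - b0 + y1 - b1) % MODULUS
    # x*y = c + d*b + e*a + d*e; the public d*e term is added by P0 only.
    z0 = (c0 + d * b0 + e * a0 + d * e) % MODULUS
    z1 = (c1 + d * b1 + e * a1) % MODULUS
    return z0, z1


x_shares = share2(6)
y_shares = share2(7)
product_shares = multiply_shared(x_shares, y_shares, helper_make_triple())
product = reconstruct2(product_shares)
```

Helper-free protocols have to generate such correlated randomness between the data-holding parties themselves, which is one reason they tend to be slower.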