Privacy preserving machine learning using MPC

Open tkmct opened this issue 1 year ago • 5 comments

Open Task RFP for Privacy preserving machine learning inference using MPC

Executive Summary

  • Project Overview: This project aims to assess the current state of privacy-preserving machine learning (PPML), especially MPC-based approaches, by building a real-world ML model with an existing MPC-ML framework.

Project Details

  • Scope of Work:
    • Decide which machine learning model to build
    • Build a PPML inference pipeline using an MPC-ML framework (with a pre-trained model)
  • Expected Outcomes: Source code of the software
  • Technical Requirements: EzPC

Qualifications

  • Skills Required: machine learning
  • Preferred Qualifications: MPC

Administrative Details

  • Grant Liaison(s): Name, GitHub, email of the person(s) responsible for evaluating and keeping track of this project.
  • Estimated Project Duration: 1 month
  • Project Complexity: Medium

Additional Information

  • Relevant Tags: MPC, ML
  • Reference Material: Links to any upstream issues, documentation, or other resources that provide more context.

Submission Details

  • Proposal Deadline: The deadline for submitting proposals is the end of this round of the Acceleration Program. Refer to current round
  • Submission Instructions: Please submit your proposal as an issue and link back to this issue in your proposal. Refer to proposal template for more details.

tkmct, Oct 04 '23 05:10

I just finished PSE's Summer ZK Fellowship program and I have some previous experience in ML.

I want to work on this task.

In the past I worked on Federated Brain Tumor Segmentation from a privacy enabled ML POV.

thogiti, Oct 06 '23 10:10

Hi @thogiti, kindly submit your proposal as an issue, following the template.

NOOMA-42, Oct 09 '23 13:10

Hey @thogiti, any update?

mitsu1124, Oct 19 '23 06:10

Hi @mitsu1124. Apologies for the delay; I got caught up in other work. I did make some notes while studying this project on my own. I will write them up as a proposal and post it here for your review and feedback within the next week.

Thank you, and apologies again for the delay.

thogiti, Oct 19 '23 17:10

Proposal: Privacy-Preserving Machine Learning Inference using MPC

Executive Summary

Project Name: Trustless MPC Inferences for Advanced Machine Learning Models

In this project, we aim to extend the capabilities of privacy-preserving machine learning (PPML) by implementing trustless Multi-Party Computation (MPC) inference on larger, more complex models such as Whisper, GPT-2, Mistral 7B, and Gemma 2B. Building on our experience with smaller models such as ResNet and CISER, we will use the CrypTen library and explore the newly developed mpz library to evaluate how well MPC preserves privacy without compromising model performance.

Project Overview

Our focus is to push the boundaries of PPML using MPC by applying it to advanced machine learning models. By ensuring privacy during the inference phase, we aim to enable secure and confidential use of state-of-the-art models in sensitive applications. The model itself is also kept encrypted during inference, which protects against weight leaks and white-box attacks.
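To make the idea of running inference on an "encrypted" model more concrete, here is a minimal sketch of two-party additive secret sharing with Beaver-triple multiplication, applied to a single dot product (the core of a linear layer). It is not CrypTen's implementation. The ring size `MOD`, all helper names, and the in-process dealer inside `beaver_triple` are assumptions made for illustration; a trustless deployment would have to generate triples without a single trusted party, and real-valued weights would need a fixed-point encoding.

```python
import secrets

MOD = 2**64  # ring size for shares (an assumption for this sketch)


def share(value, n_parties=2):
    """Split value into additive shares that sum to value mod MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares


def reconstruct(shares):
    """Recombine shares into the ring element they encode."""
    return sum(shares) % MOD


def decode(value):
    """Map a ring element back to a signed integer."""
    return value - MOD if value >= MOD // 2 else value


def beaver_triple(n_parties=2):
    """Hypothetical trusted dealer: shares of random a, b and of c = a * b."""
    a = secrets.randbelow(MOD)
    b = secrets.randbelow(MOD)
    return share(a, n_parties), share(b, n_parties), share(a * b % MOD, n_parties)


def secure_mul(x_sh, y_sh):
    """Multiply two shared values without revealing either one."""
    a_sh, b_sh, c_sh = beaver_triple(len(x_sh))
    # Parties open x - a and y - b; the random masks a and b hide x and y.
    d = reconstruct([(x - a) % MOD for x, a in zip(x_sh, a_sh)])
    e = reconstruct([(y - b) % MOD for y, b in zip(y_sh, b_sh)])
    z_sh = [(c + d * b + e * a) % MOD for a, b, c in zip(a_sh, b_sh, c_sh)]
    z_sh[0] = (z_sh[0] + d * e) % MOD  # the public term d * e is added once
    return z_sh


def secure_dot(w_sh, x_sh):
    """Dot product of two vectors given as lists of per-element shares."""
    acc = [0] * len(w_sh[0])
    for w, x in zip(w_sh, x_sh):
        prod = secure_mul(w, x)
        acc = [(p + q) % MOD for p, q in zip(acc, prod)]
    return acc


weights = [3, -2, 5]   # model owner's private weights
inputs = [4, 7, -1]    # client's private input
w_sh = [share(w % MOD) for w in weights]
x_sh = [share(x % MOD) for x in inputs]
result = decode(reconstruct(secure_dot(w_sh, x_sh)))
print(result)
```

Each party only ever holds uniformly random shares of the weights and inputs, which is the sense in which the model stays hidden during inference. Nonlinear layers (ReLU, softmax, layer norm) need additional protocols and dominate the cost for transformer models.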

Project Details

Scope of Work

  1. Model Selection: Choose larger, more complex models for MPC implementation, such as Whisper, GPT-2, Mistral 7B, and Gemma 2B.
  2. MPC Implementation: Extend our work on trustless MPC inference using the CrypTen library to the selected models.
  3. Library Exploration: Explore the mpz library as an early adopter and integrate it into our MPC implementations. Also consider other libraries such as MPCFormer and EzPC.
  4. Evaluation: Assess the performance and privacy-preserving capabilities of the MPC implementations on larger models.
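One factor that could inform model selection is how much memory each party needs just to hold its share of the weights. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes one 64-bit ring element per weight per party, and the parameter counts are approximate public figures for specific variants (GPT-2 small, Whisper large) chosen here for illustration; activations, preprocessing material, and communication buffers are ignored.

```python
BYTES_PER_SHARE = 8  # one 64-bit ring element per weight (an assumption)

# Approximate parameter counts; the chosen variants are illustrative.
APPROX_PARAMS = {
    "GPT-2 (small)": 124_000_000,
    "Whisper (large)": 1_550_000_000,
    "Gemma 2B": 2_500_000_000,
    "Mistral 7B": 7_200_000_000,
}


def share_bytes(n_params, bytes_per_share=BYTES_PER_SHARE):
    """Bytes one party needs to store its share of every weight."""
    return n_params * bytes_per_share


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


estimates = {name: to_gib(share_bytes(n)) for name, n in APPROX_PARAMS.items()}
for name, gib in estimates.items():
    print(f"{name}: ~{gib:.1f} GiB of weight shares per party")
```

Under these assumptions the 7B-parameter model needs tens of GiB per party for weights alone, which suggests starting the PoC with the smaller candidates.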

Milestones

Milestone 1: Model, Library Selection, and Preliminary Setup

  • Duration: 1 week
  • Deliverables:
    • Selection of suitable larger models for MPC implementation.
    • Apples-to-apples comparison of CrypTen, mpz, EzPC, etc.
    • Setup of the development environment and cloud infrastructure for the PoC.

Milestone 2: MPC Implementation on Selected Models

  • Duration: 2 weeks
  • Deliverables:
    • Implementation of trustless MPC inference on the selected models using the library chosen in Milestone 1.
    • Integration of the mpz library into the MPC implementations (if applicable).

Milestone 3: Evaluation and Documentation

  • Duration: 1 week
  • Deliverables:
    • Evaluation of the performance and privacy-preserving capabilities of the implemented MPC inferences.
    • Comprehensive documentation of the implementation process, challenges faced, and solutions adopted.
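The evaluation could be organized around a small harness that runs the same inputs through a plaintext function and its private counterpart, then reports numerical error and wall-clock time. The sketch below is one hypothetical shape for it. The "private" path here is only a stand-in that emulates fixed-point arithmetic with truncation (one source of accuracy loss in MPC frameworks); it performs no secret sharing, and the 16 fractional bits, the function names, and the sample data are all assumptions for illustration.

```python
import time

PRECISION_BITS = 16          # fractional bits (an assumption for this sketch)
SCALE = 1 << PRECISION_BITS


def encode(x):
    """Encode a float as a fixed-point integer."""
    return round(x * SCALE)


def decode(v):
    """Decode a fixed-point integer back to a float."""
    return v / SCALE


def plain_dot(ws, xs):
    """Reference floating-point dot product."""
    return sum(w * x for w, x in zip(ws, xs))


def fixed_point_dot(ws, xs):
    """Stand-in for the private path: fixed-point multiply, then truncate."""
    acc = 0
    for w, x in zip(ws, xs):
        acc += (encode(w) * encode(x)) >> PRECISION_BITS
    return decode(acc)


def evaluate(plain_fn, private_fn, weights, samples):
    """Run both functions on every sample; collect error and timing."""
    report = {"max_abs_error": 0.0, "plain_seconds": 0.0, "private_seconds": 0.0}
    for xs in samples:
        t0 = time.perf_counter()
        expected = plain_fn(weights, xs)
        t1 = time.perf_counter()
        actual = private_fn(weights, xs)
        t2 = time.perf_counter()
        report["plain_seconds"] += t1 - t0
        report["private_seconds"] += t2 - t1
        report["max_abs_error"] = max(report["max_abs_error"], abs(expected - actual))
    return report


weights = [0.25, -0.5, 0.125, 0.9]
samples = [
    [0.1, 0.2, 0.3, 0.4],
    [-0.7, 0.33, 0.05, -0.91],
    [1.0, -1.0, 0.5, 0.0],
]
report = evaluate(plain_dot, fixed_point_dot, weights, samples)
print(report)
```

In an actual evaluation, `private_fn` would wrap the MPC inference call of whichever library is selected, and the report would likely also track bytes exchanged between parties, since communication often dominates MPC latency.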

Team

Name | Email | GitHub
Gunit Malik | [email protected] | @guni7
Saurabh Chalke | [email protected] | @saurabhchalke

Team Experience

The team has been deeply involved in the zk space for over a year. We have previously built privacy-preserving zk proof delegation based on the zkSaaS paper, using the packed secret-sharing MPC primitive. The team also has prior experience in AI, having worked with computer vision, SVMs, language models, and PyTorch/TensorFlow.

Administrative Details

  • Estimated Project Duration: 1 month
  • Project Complexity: Medium

Current Progress

We have successfully implemented trustless MPC inference on smaller models such as ResNet, MNIST, and CISER using the CrypTen library. This experience lays the foundation for tackling larger and more complex models in this project.

@NOOMA-42

saurabhchalke, Mar 06 '24 03:03