Porting real-world ML model(s) into ZK

Open socathie opened this issue 2 years ago • 4 comments

Open Task RFP for Porting real-world ML model(s) into ZK

Executive Summary

  • Project Overview: Porting machine learning models that are actually used in real-world applications into ZK

Project Details

  • Scope of Work:
  1. Select an ML model that is small enough to be proven on a powerful laptop and that has been used in real-world applications (no toy problems such as MNIST or Boston House Price).
  2. Construct the model in ZK using any DSL/compiler of your choice.
  3. Create a minimal demo UI so that people can interact with the model and generate proofs.
  4. Measure the performance of the original version vs. the ZK (quantized) version.
  5. Document the process, including any difficulties you encounter.
  • Expected Outcomes: a demo app, a repo of the codebase, an article documenting the process
  • Technical Requirements: TensorFlow/PyTorch, any ZK DSL
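
As an illustration of step 4 of the scope, the comparison between the original and the quantized model could be sketched as below. This is a minimal sketch under stated assumptions, not a prescribed method: it assumes a plain linear model and a fixed-point scale of 2^8, and the names `SCALE`, `quantize`, `float_predict`, `quantized_predict`, and `max_abs_error` are hypothetical. A real port would use the quantization scheme of the chosen ZK DSL and the model's own accuracy metric.

```python
# Hypothetical sketch: compare a float model with a fixed-point (quantized) version.
# ZK circuits operate on integers, so inputs and weights are scaled and rounded.

SCALE = 2 ** 8  # assumed fixed-point scale factor


def quantize(x, scale=SCALE):
    """Round a float to the nearest fixed-point integer."""
    return round(x * scale)


def float_predict(weights, bias, features):
    """Reference prediction of a linear model in floating point."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def quantized_predict(weights, bias, features, scale=SCALE):
    """Integer-only prediction, rescaled to a float only at the very end."""
    qw = [quantize(w, scale) for w in weights]
    qx = [quantize(x, scale) for x in features]
    # Each product carries a factor of scale**2, so the bias is scaled to match.
    acc = sum(w * x for w, x in zip(qw, qx)) + quantize(bias, scale) * scale
    return acc / (scale * scale)


def max_abs_error(weights, bias, dataset, scale=SCALE):
    """Largest deviation between float and quantized outputs over a dataset."""
    return max(
        abs(float_predict(weights, bias, row)
            - quantized_predict(weights, bias, row, scale))
        for row in dataset
    )


if __name__ == "__main__":
    weights, bias = [0.5, -1.25, 2.0], 0.75
    dataset = [[1.0, 2.0, 3.0], [0.1, 0.2, 0.3]]
    print("max abs error:", max_abs_error(weights, bias, dataset))
```

Reporting this deviation (or the change in the model's accuracy metric) for several scale factors shows how much precision the ZK version gives up relative to the original.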

Qualifications

  • Skills Required: knowledge of TensorFlow/PyTorch, ability to port ML models into a ZK DSL of your choice, proficient writing skills
  • Preferred Qualifications: professional experience/advanced degree in machine learning is a plus

Administrative Details

  • Grant Liaison: Cathie So (@socathie, [email protected])
  • Estimated Project Duration: 120 hours
  • Project Complexity: Medium; the grantee is expected to conduct independent research

Additional Information

Submission Details

  • Proposal Deadline: The deadline for submitting proposals is the end of this round of the Acceleration Program. Refer to the current round.
  • Submission Instructions: Please submit your proposal as an issue and link back to this issue in your proposal. Refer to proposal template for more details.

socathie avatar Sep 25 '23 11:09 socathie

Dear Dr. Cathie @socathie and Paul @NOOMA-42,

I hope to work on this task. My current idea is to predict the probability distribution of hourly rainfall from polarimetric radar measurements. Radar is widely used to assess rainfall in agricultural production. Currently prevalent models include decision trees, random forests, and XGBoost. Proven assessment results can be combined with an on-chain oracle to enrich the data sources for on-chain decision-making. I think this scenario makes practical sense, and the model's relatively low complexity makes it feasible to prove on a laptop.

I would like to hear your opinions on this idea and look forward to your comments.

Have a nice day!

Best regards, Li

only4sim avatar Jan 09 '24 09:01 only4sim

Hi Li,

Sorry for missing your question. Having XGBoost would be really helpful, and the idea sounds valid to me. Would you be able to submit a proposal?

nooma-42 avatar Jan 24 '24 16:01 nooma-42

Hi Paul,

Thank you very much for your kind answer! I am thrilled that you like the idea. I will submit a proposal that explains the idea and the plan in more detail. I am very excited to have a chance to work with the PSE team and look forward to receiving your comments.

Best wishes, Li

only4sim avatar Jan 25 '24 07:01 only4sim

Hi @socathie and @NOOMA-42,

I have submitted my proposal, and I used the time in between to run preliminary experiments on pruning the model. You can find the relevant data in the Preliminary Results section. Here is a link to my proposal: https://github.com/privacy-scaling-explorations/acceleration-program/issues/39. I look forward to your comments.

Cheers, Li

only4sim avatar Feb 29 '24 00:02 only4sim