jellyfish
Make jellyfish wasm compatible
This issue aims to achieve wasm compatibility:
- [ ] finish #111
- [x] document slowdown (if any) and limitations in the wasm version.
- [x] add wasm compilation check in CI
- [ ] pay extra attention to entropy sampling (since `/dev/random` is no longer available, the choice of PRNG, seed length, etc. has to be more conservative)
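On the last point: on `wasm32-unknown-unknown` there is no `/dev/random`, so entropy must be injected by the host and the PRNG seeded explicitly. A minimal std-only sketch of that seeding flow (all names hypothetical; a real implementation would use a CSPRNG such as ChaCha, not this illustrative xorshift):

```rust
// Sketch only: shows why seeding must be explicit on wasm, where no OS
// entropy source exists. The host (e.g. JS via crypto.getRandomValues)
// must hand over the seed bytes.

fn xorshift64(state: &mut u64) -> u64 {
    // Toy PRNG step; stands in for a real CSPRNG.
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Hypothetical helper: fold a caller-provided 32-byte seed into PRNG state.
fn seeded_state(seed: &[u8; 32]) -> u64 {
    let mut s: u64 = 0x9E3779B97F4A7C15; // non-zero init constant
    for chunk in seed.chunks(8) {
        let mut buf = [0u8; 8];
        buf[..chunk.len()].copy_from_slice(chunk);
        s = s.rotate_left(7) ^ u64::from_le_bytes(buf);
    }
    if s == 0 {
        s = 1; // xorshift must never have an all-zero state
    }
    s
}

fn main() {
    // In wasm this seed must come from the host, not from the library.
    let seed = [7u8; 32];
    let mut a_state = seeded_state(&seed);
    let mut b_state = seeded_state(&seed);
    let a = xorshift64(&mut a_state);
    let b = xorshift64(&mut b_state);
    assert_eq!(a, b); // same seed => same stream, fully deterministic
    println!("deterministic: {}", a == b); // prints "deterministic: true"
}
```

The takeaway is that the same seed always yields the same stream, so seed quality and length become entirely the caller's responsibility — hence the need to be more conservative here.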
While working on #318, I ran the benchmarks on my local M1.
native benchmark:

```
$ cargo flamegraph --root --bench=plonk-benches --features test-srs
proving time for Bls12_381, PlonkType::TurboPlonk: 52120 ns/gate
proving time for Bls12_377, PlonkType::TurboPlonk: 53111 ns/gate
proving time for Bn254, PlonkType::TurboPlonk: 44123 ns/gate
proving time for BW6_761, PlonkType::TurboPlonk: 215765 ns/gate
proving time for Bls12_381, PlonkType::UltraPlonk: 79688 ns/gate
proving time for Bls12_377, PlonkType::UltraPlonk: 78137 ns/gate
proving time for Bn254, PlonkType::UltraPlonk: 69546 ns/gate
proving time for BW6_761, PlonkType::UltraPlonk: 341935 ns/gate
verifying time for Bls12_381, PlonkType::TurboPlonk: 2504441 ns
verifying time for Bls12_377, PlonkType::TurboPlonk: 2597979 ns
verifying time for Bn254, PlonkType::TurboPlonk: 3948429 ns
verifying time for BW6_761, PlonkType::TurboPlonk: 8686679 ns
verifying time for Bls12_381, PlonkType::UltraPlonk: 2658983 ns
verifying time for Bls12_377, PlonkType::UltraPlonk: 2570262 ns
verifying time for Bn254, PlonkType::UltraPlonk: 1644525 ns
verifying time for BW6_761, PlonkType::UltraPlonk: 9560779 ns
batch verifying time for Bls12_381, PlonkType::TurboPlonk, 1000 proofs: 15505 ns/proof
batch verifying time for Bls12_377, PlonkType::TurboPlonk, 1000 proofs: 16257 ns/proof
batch verifying time for Bn254, PlonkType::TurboPlonk, 1000 proofs: 16841 ns/proof
batch verifying time for BW6_761, PlonkType::TurboPlonk, 1000 proofs: 31831 ns/proof
batch verifying time for Bls12_381, PlonkType::UltraPlonk, 1000 proofs: 17502 ns/proof
batch verifying time for Bls12_377, PlonkType::UltraPlonk, 1000 proofs: 18538 ns/proof
batch verifying time for Bn254, PlonkType::UltraPlonk, 1000 proofs: 16461 ns/proof
batch verifying time for BW6_761, PlonkType::UltraPlonk, 1000 proofs: 37605 ns/proof
```
targeting wasm:

```
$ cargo flamegraph --root -o wasm-flamegraph.svg --bench=plonk-benches --no-default-features --features test-srs -- --target=wasm32-unknown-unknown
proving time for Bls12_381, PlonkType::TurboPlonk: 228241 ns/gate
proving time for Bls12_377, PlonkType::TurboPlonk: 226454 ns/gate
proving time for Bn254, PlonkType::TurboPlonk: 173796 ns/gate
proving time for BW6_761, PlonkType::TurboPlonk: 925769 ns/gate
proving time for Bls12_381, PlonkType::UltraPlonk: 349392 ns/gate
proving time for Bls12_377, PlonkType::UltraPlonk: 338768 ns/gate
proving time for Bn254, PlonkType::UltraPlonk: 275681 ns/gate
proving time for BW6_761, PlonkType::UltraPlonk: 1245619 ns/gate
verifying time for Bls12_381, PlonkType::TurboPlonk: 2516504 ns
verifying time for Bls12_377, PlonkType::TurboPlonk: 2743795 ns
verifying time for Bn254, PlonkType::TurboPlonk: 1684900 ns
verifying time for BW6_761, PlonkType::TurboPlonk: 11598887 ns
verifying time for Bls12_381, PlonkType::UltraPlonk: 2768262 ns
verifying time for Bls12_377, PlonkType::UltraPlonk: 3081320 ns
verifying time for Bn254, PlonkType::UltraPlonk: 1837012 ns
verifying time for BW6_761, PlonkType::UltraPlonk: 12966108 ns
batch verifying time for Bls12_381, PlonkType::TurboPlonk, 1000 proofs: 52437 ns/proof
batch verifying time for Bls12_377, PlonkType::TurboPlonk, 1000 proofs: 52356 ns/proof
batch verifying time for Bn254, PlonkType::TurboPlonk, 1000 proofs: 47650 ns/proof
batch verifying time for BW6_761, PlonkType::TurboPlonk, 1000 proofs: 91287 ns/proof
batch verifying time for Bls12_381, PlonkType::UltraPlonk, 1000 proofs: 61210 ns/proof
batch verifying time for Bls12_377, PlonkType::UltraPlonk, 1000 proofs: 60756 ns/proof
batch verifying time for Bn254, PlonkType::UltraPlonk, 1000 proofs: 55146 ns/proof
batch verifying time for BW6_761, PlonkType::UltraPlonk, 1000 proofs: 106205 ns/proof
```
The caveat is that I'm not using any wasm runtime like wasmtime, so I'm not sure how accurate the wasm-target benchmark is; I will investigate more later. But on the surface, it seems that wasm is 2~2.5x slower.
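One cheap way to confirm the benchmark artifact is really wasm (my suggestion, not something established in this thread): wasm modules begin with the magic bytes `\0asm`, so checking the file header distinguishes them from a native binary. A small std-only Rust sketch:

```rust
// Sketch: detect a wasm module by its 4-byte magic number (`\0asm`),
// per the WebAssembly binary format. A native Mach-O/ELF binary has a
// different header, so this catches a silently-ignored --target flag.

fn is_wasm(bytes: &[u8]) -> bool {
    bytes.starts_with(b"\0asm")
}

fn main() {
    // First 8 bytes of a real wasm module: magic + version 1.
    let wasm_header = [0x00u8, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
    // An ELF binary instead starts with 0x7f 'E' 'L' 'F'.
    let elf_header = [0x7fu8, b'E', b'L', b'F', 2, 1, 1, 0];
    assert!(is_wasm(&wasm_header));
    assert!(!is_wasm(&elf_header));
    println!("header check ok");
}
```

In practice you would `std::fs::read` the file under `target/wasm32-unknown-unknown/release/` and pass its bytes to `is_wasm`, or just run `file` on it.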
Are you sure that you actually ran wasm code? If you don't provide any runtime, how would the wasm code be executed? My guess is that `--target wasm32-unknown-unknown` was ignored and you ran the native bench, but by specifying `--no-default-features` you disabled the `parallel` feature, and this is the cause of the worse performance. From my experiments, it seems that wasm performance is ~10x worse compared to native without rayon.
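For context, a `parallel` feature in arkworks-style crates is typically wired roughly like this (a hypothetical manifest sketch, not jellyfish's actual `Cargo.toml`), which is why `--no-default-features` silently drops rayon:

```toml
# Hypothetical sketch: how a `parallel` feature commonly gates rayon.
[features]
default = ["parallel"]   # parallelism on by default...
parallel = ["rayon"]     # ...but --no-default-features turns it off

[dependencies]
rayon = { version = "1", optional = true }
```

With this wiring, the "wasm" run above was really a single-threaded native run unless the wasm target was actually honored.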
Yeah, I suspect you are right. So we really need to run inside wasmtime to get an accurate benchmark? @mike1729
I believe so. What I did to get perf data that you can then feed to flamegraph was:

- `cargo build --release --target wasm32-unknown-unknown`
- `perf record -k mono wasmtime --profile=jitdump prover.wasm`
- `perf inject --jit --input perf.data --output perf.jit.data`

Then you can either inspect `perf.jit.data` with `perf report --input perf.jit.data` or use some frontend.
Note, however, that the above method works without the `parallel` feature, as it is hard to get `rayon` working in wasm.