halo2
halo2 copied to clipboard
[Discussion] Peak memory estimation
Recently @han0110 did an analysis of peak memory estimation for halo2 proving, with a script implementation that takes a ConstraintSystem
and estimates the peak memory of the prover. I ported that script to the zkevm-circuits and it can be seen here: https://github.com/privacy-scaling-explorations/zkevm-circuits/blob/3bbc757a20d9d9f7759db47c096d5395361bee74/zkevm-circuits/src/bin/stats/halo2_stats.rs#L232
Recently I ran some benchmarks with this test circuit in https://github.com/privacy-scaling-explorations/halo2/blob/1650a0b07fbbfeebab71f5650ccb62ccd20de0ff/halo2_proofs/tests/frontend_backend_split.rs to measure the peak memory and verify how accurate the estimation is, and I'm happy to report that the estimation is very accurate!
These are the characteristics of the circuit (where wf
is the WIDTH_FACTOR
and can be parametrized):
- max_degree = 6
- num_fixed_columns = 8 * wf
- num_advice_columns = 4 * wf
- num_instance_columns = 1 * wf
- num_permutation_columns = 4 * wf
- num_lookups = 1 * wf
These are the results for k = 10,12,14,16,18
and wf = 1,2,3,4,5,6
(memory values in MiB):
k = 10, wf = 1, max_mem_estimate = 7
k = 10, wf = 1, max_mem = 7
k = 10, wf = 2, max_mem_estimate = 13
k = 10, wf = 2, max_mem = 13
k = 10, wf = 3, max_mem_estimate = 18
k = 10, wf = 3, max_mem = 19
k = 10, wf = 4, max_mem_estimate = 24
k = 10, wf = 4, max_mem = 24
k = 10, wf = 5, max_mem_estimate = 29
k = 10, wf = 5, max_mem = 30
k = 10, wf = 6, max_mem_estimate = 35
k = 10, wf = 6, max_mem = 36
k = 12, wf = 1, max_mem_estimate = 29
k = 12, wf = 1, max_mem = 30
k = 12, wf = 2, max_mem_estimate = 52
k = 12, wf = 2, max_mem = 53
k = 12, wf = 3, max_mem_estimate = 74
k = 12, wf = 3, max_mem = 75
k = 12, wf = 4, max_mem_estimate = 96
k = 12, wf = 4, max_mem = 98
k = 12, wf = 5, max_mem_estimate = 118
k = 12, wf = 5, max_mem = 120
k = 12, wf = 6, max_mem_estimate = 141
k = 12, wf = 6, max_mem = 143
k = 14, wf = 1, max_mem_estimate = 119
k = 14, wf = 1, max_mem = 122
k = 14, wf = 2, max_mem_estimate = 208
k = 14, wf = 2, max_mem = 212
k = 14, wf = 3, max_mem_estimate = 297
k = 14, wf = 3, max_mem = 302
k = 14, wf = 4, max_mem_estimate = 386
k = 14, wf = 4, max_mem = 392
k = 14, wf = 5, max_mem_estimate = 475
k = 14, wf = 5, max_mem = 482
k = 14, wf = 6, max_mem_estimate = 564
k = 14, wf = 6, max_mem = 572
k = 16, wf = 1, max_mem_estimate = 478
k = 16, wf = 1, max_mem = 490
k = 16, wf = 2, max_mem_estimate = 834
k = 16, wf = 2, max_mem = 850
k = 16, wf = 3, max_mem_estimate = 1190
k = 16, wf = 3, max_mem = 1210
k = 16, wf = 4, max_mem_estimate = 1546
k = 16, wf = 4, max_mem = 1570
k = 16, wf = 5, max_mem_estimate = 1902
k = 16, wf = 5, max_mem = 1930
k = 16, wf = 6, max_mem_estimate = 2258
k = 16, wf = 6, max_mem = 2290
k = 18, wf = 1, max_mem_estimate = 1912
k = 18, wf = 1, max_mem = 1960
k = 18, wf = 2, max_mem_estimate = 3336
k = 18, wf = 2, max_mem = 3400
k = 18, wf = 3, max_mem_estimate = 4760
k = 18, wf = 3, max_mem = 4840
k = 18, wf = 4, max_mem_estimate = 6184
k = 18, wf = 4, max_mem = 6280
k = 18, wf = 5, max_mem_estimate = 7608
k = 18, wf = 5, max_mem = 7720
k = 18, wf = 6, max_mem_estimate = 9032
k = 18, wf = 6, max_mem = 9160
The estimation is ~98% of the measured peak memory. You can reproduce these tests from this branch https://github.com/privacy-scaling-explorations/halo2/tree/test/bench-mem-peak using this script https://github.com/privacy-scaling-explorations/halo2/blob/test/bench-mem-peak/halo2_proofs/bench.sh
Discussion
I think it would be nice to add this estimation into the code. Maybe it can be a halo2_backend
function that receives a ConstraintSystemMid
and returns the estimated memory? This way different backends can report different memory estimations.
Another point of discussion is that it would be great to figure out the new memory estimation formula from https://github.com/privacy-scaling-explorations/halo2/pull/274
One important thing aside of https://github.com/privacy-scaling-explorations/halo2/pull/274 would be to check in Halo2 if we can drop polynomials earlier than fn scope ends and drop gets triggered.
On this way we might be able (if the compiler is not already doing so) to reduce it a bit.