feat: some optimisations to circuits
So far:
- Found a bug in array filtering; fixing it required adding constraints (doh!)
- Shifted the key validation requests array instead of filtering it, which saves 10k constraints. This relies on the assumption that key validation requests are processed in order, taking the first N requests from the left of the array each time. Is that a valid assumption?
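To make the question above concrete, here is a minimal sketch of the two approaches in plain Python rather than circuit code. The function names, the `None` padding, and the fixed-length list are all hypothetical stand-ins for illustration and are not taken from the kernel circuits. It shows that shifting matches filtering only when the processed requests are exactly the first N entries.

```python
from typing import List, Optional

Request = Optional[str]  # None stands in for an empty slot (hypothetical encoding)


def filter_processed(requests: List[Request], processed: List[bool]) -> List[Request]:
    """General approach: drop every flagged entry, wherever it sits, then re-pad."""
    kept = [r for r, done in zip(requests, processed) if not done]
    return kept + [None] * (len(requests) - len(kept))


def shift_left(requests: List[Request], num_processed: int) -> List[Request]:
    """Cheaper approach: assume the first num_processed entries were consumed."""
    return requests[num_processed:] + [None] * num_processed


requests = ["a", "b", "c", None]

# In-order processing: both approaches agree.
in_order = filter_processed(requests, [True, True, False, False])
shifted = shift_left(requests, 2)

# Out-of-order processing ("a" and "c" consumed): the results diverge,
# because shifting drops "a" and "b" instead.
out_of_order = filter_processed(requests, [True, False, True, False])
```

If requests can ever be consumed out of order, the shifted array would silently keep the wrong entries, which is why the ordering assumption needs confirming.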
Changes to circuit sizes
Generated at commit: f85a41f0b6a7e784bd09fc3ff9d16b0a4e09929b, compared to commit: e29f042ccfb02f22ef63b3b82f43be2e4388902d
🧾 Summary (100% most significant diffs)
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| public_kernel_tail | +445 ❌ | +0.04% | +445 ❌ | +0.00% |
| private_kernel_reset | -9,744 ✅ | -8.94% | -34,365 ✅ | -7.02% |
| private_kernel_reset_big | -7,152 ✅ | -8.76% | -20,101 ✅ | -7.16% |
| private_kernel_reset_medium | -5,856 ✅ | -8.61% | -12,969 ✅ | -7.36% |
| private_kernel_reset_small | -5,208 ✅ | -8.52% | -9,403 ✅ | -7.57% |
| private_kernel_reset_tiny | -4,884 ✅ | -8.46% | -7,620 ✅ | -7.76% |
Full diff report 👇
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| public_kernel_tail | 1,006,778 (+445) | +0.04% | 9,128,428 (+445) | +0.00% |
| private_kernel_reset | 99,198 (-9,744) | -8.94% | 455,063 (-34,365) | -7.02% |
| private_kernel_reset_big | 74,486 (-7,152) | -8.76% | 260,535 (-20,101) | -7.16% |
| private_kernel_reset_medium | 62,129 (-5,856) | -8.61% | 163,270 (-12,969) | -7.36% |
| private_kernel_reset_small | 55,950 (-5,208) | -8.52% | 114,741 (-9,403) | -7.57% |
| private_kernel_reset_tiny | 52,862 (-4,884) | -8.46% | 90,566 (-7,620) | -7.76% |
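As a sanity check on the tables above, the percentage columns appear to be the delta divided by the value at the base commit (the new value minus the delta). The sketch below is an assumption about how the report derives them, not the report tool's actual code, and `percent_change` is a hypothetical helper name.

```python
def percent_change(new_value: int, delta: int) -> float:
    """Percentage change relative to the base value (new_value - delta)."""
    base = new_value - delta
    return round(100 * delta / base, 2)


# private_kernel_reset: ACIR opcodes and circuit size
reset_opcodes = percent_change(99_198, -9_744)
reset_size = percent_change(455_063, -34_365)

# public_kernel_tail: ACIR opcodes
tail_opcodes = percent_change(1_006_778, 445)
```

Under this assumption the computed values match the reported -8.94%, -7.02%, and +0.04%.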
Benchmark results
Metrics with a significant change:
- proof_construction_time_sha256_100_ms (16 threads): 5,668 (-20%)
- avm_simulation_time_ms (FeeJuice:set_portal): 14.8 (+51%)
Detailed results
All benchmarks run txs against the Benchmarking contract in the repository. Each tx consists of a batch call to create_note and increment_balance, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and a public storage read and write.
The source data for this benchmark is available in JSON format on S3.
Proof generation
Each column represents the number of threads used in proof generation.
| Metric | 1 thread | 4 threads | 16 threads | 32 threads | 64 threads |
|---|---|---|---|---|---|
| proof_construction_time_sha256_ms | 5,798 | 1,563 | 706 (-1%) | 746 (-3%) | 773 (+1%) |
| proof_construction_time_sha256_30_ms | 11,896 (-1%) | 3,186 (-1%) | 1,409 (-1%) | 1,425 (-1%) | 1,458 (-2%) |
| proof_construction_time_sha256_100_ms | 44,344 (-1%) | 12,292 (-3%) | :warning: 5,668 (-20%) | 5,567 (-2%) | 5,511 (-1%) |
| proof_construction_time_poseidon_hash_ms | 79.0 | 34.0 | 34.0 | 59.0 | 88.0 (+1%) |
| proof_construction_time_poseidon_hash_30_ms | 1,538 | 421 | 201 | 231 (+2%) | 269 (-1%) |
| proof_construction_time_poseidon_hash_100_ms | 5,680 | 1,538 | 676 (-1%) | 735 | 759 |
L2 block published to L1
Each column represents the number of txs on an L2 block published to L1.
| Metric | 4 txs | 8 txs | 16 txs |
|---|---|---|---|
| l1_rollup_calldata_size_in_bytes | 4,324 | 7,844 | 14,852 |
| l1_rollup_calldata_gas | 49,672 | 92,406 | 177,716 |
| l1_rollup_execution_gas | 1,294,335 | 2,042,037 | 3,869,386 |
| l2_block_processing_time_in_ms | 245 (-5%) | 468 (+7%) | 807 |
| l2_block_building_time_in_ms | 9,587 (-1%) | 18,666 | 37,320 (-1%) |
| l2_block_rollup_simulation_time_in_ms | 9,586 (-1%) | 18,665 | 37,320 (-1%) |
| l2_block_public_tx_process_time_in_ms | 7,976 | 16,832 | 35,388 (-1%) |
L2 chain processing
Each column represents the number of blocks on the L2 chain where each block has 8 txs.
| Metric | 3 blocks | 5 blocks |
|---|---|---|
| node_history_sync_time_in_ms | 3,135 (+7%) | 4,008 (+7%) |
| node_database_size_in_bytes | 12,714,064 | 16,846,928 |
| pxe_database_size_in_bytes | 16,254 | 26,813 |
Circuits stats
Stats on running time and I/O sizes collected for every kernel circuit run across all benchmarks.
| Circuit | simulation_time_in_ms | witness_generation_time_in_ms | input_size_in_bytes | output_size_in_bytes | proving_time_in_ms | proof_size_in_bytes | num_public_inputs | size_in_gates |
|---|---|---|---|---|---|---|---|---|
| private-kernel-init | 101 (+3%) | 404 (+3%) | 21,846 | 44,858 | N/A | N/A | N/A | N/A |
| private-kernel-inner | 168 (-5%) | 700 (-3%) | 72,545 | 45,005 | N/A | N/A | N/A | N/A |
| private-kernel-tail | 664 | 746 (-1%) | 52,710 | 52,256 | N/A | N/A | N/A | N/A |
| base-parity | 6.00 (-3%) | 601 (+3%) | 160 | 96.0 | 1,127 (-1%) | 13,188 | 19.0 | 65,536 |
| root-parity | 109 (-2%) | 156 (+24%) | 69,084 | 96.0 | 30,960 (-5%) | 13,188 | 19.0 | 4,194,304 |
| base-rollup | 2,964 (-1%) | 5,502 (-3%) | 187,824 | 664 | 99,481 (-2%) | 14,020 | 45.0 | 16,777,216 |
| root-rollup | 96.7 (-1%) | 117 (-2%) | 54,525 | 716 | 28,424 (-5%) | 13,988 | 44.0 | 4,194,304 |
| public-kernel-setup | 89.9 (-3%) | 2,802 (-2%) | 104,025 | 71,222 | 17,942 | 129,220 | 3,645 | 2,097,152 |
| public-kernel-app-logic | 105 (-1%) | 4,083 (+2%) | 104,025 | 71,222 | 10,382 (-1%) | 129,220 | 3,645 | 1,048,576 |
| public-kernel-tail | 569 (-1%) | 27,975 (-10%) | 409,190 | 16,414 | 107,094 (-4%) | 34,308 | 679 | 16,777,216 |
| private-kernel-reset-tiny | 183 (-6%) | 765 (-15%) | 67,646 (-1%) | 44,750 | N/A | N/A | N/A | N/A |
| private-kernel-tail-to-public | 2,625 (+11%) | 1,516 (-3%) | 931,564 | 1,697 | N/A | N/A | N/A | N/A |
| public-kernel-teardown | 85.3 (-2%) | 4,150 (-2%) | 104,025 | 71,222 | 18,860 (-10%) | 129,220 | 3,645 | 2,097,152 |
| merge-rollup | 59.5 (-1%) | N/A | 35,742 | 664 | N/A | N/A | N/A | N/A |
| undefined | N/A | N/A | N/A | N/A | 69,217 (-4%) | N/A | N/A | N/A |
Stats on running time collected for app circuits
| Function | input_size_in_bytes | output_size_in_bytes | witness_generation_time_in_ms | proof_size_in_bytes | proving_time_in_ms |
|---|---|---|---|---|---|
| ContractClassRegisterer:register | 1,344 | 11,731 | 340 (-1%) | N/A | N/A |
| ContractInstanceDeployer:deploy | 1,408 | 11,731 | 18.1 (-1%) | N/A | N/A |
| MultiCallEntrypoint:entrypoint | 1,920 | 11,731 | 473 (-1%) | N/A | N/A |
| FeeJuice:deploy | 1,376 | 11,731 | 387 | N/A | N/A |
| SchnorrAccount:constructor | 1,312 | 11,731 | 177 (-1%) | N/A | N/A |
| SchnorrAccount:entrypoint | 2,304 | 11,731 | 533 | N/A | N/A |
| Token:privately_mint_private_note | 1,280 | 11,731 | 215 (-3%) | N/A | N/A |
| FPC:fee_entrypoint_public | 1,344 | 11,731 | 25.4 (+3%) | N/A | N/A |
| Token:transfer | 1,312 | 11,731 | 404 (-2%) | N/A | N/A |
| AuthRegistry:set_authorized (avm) | 18,491 | N/A | N/A | 147,296 | 2,563 |
| FPC:prepare_fee (avm) | 22,958 | N/A | N/A | 147,360 | 3,262 (+1%) |
| Token:transfer_public (avm) | 61,614 | N/A | N/A | 147,360 | 18,255 (+2%) |
| AuthRegistry:consume (avm) | 41,719 | N/A | N/A | 147,328 | 7,730 (+2%) |
| FPC:pay_refund (avm) | 26,227 | N/A | N/A | 147,328 | 7,601 (+1%) |
| Benchmarking:create_note | 1,344 | 11,731 | 169 | N/A | N/A |
| SchnorrAccount:verify_private_authwit | 1,280 | 11,731 | 27.5 | N/A | N/A |
| Token:unshield | 1,376 | 11,731 | 636 (-1%) | N/A | N/A |
| FPC:fee_entrypoint_private | 1,376 | 11,731 | 830 (-1%) | N/A | N/A |
AVM Simulation
Time to simulate various public functions in the AVM.
| Function | time_ms | bytecode_size_in_bytes |
|---|---|---|
| FeeJuice:_increase_public_balance | 93.0 (-3%) | 8,139 |
| FeeJuice:set_portal | :warning: 14.8 (+51%) | 2,362 |
| Token:constructor | 125 (-2%) | 31,107 |
| FPC:constructor | 94.3 (-4%) | 22,380 |
| FeeJuice:mint_public | 79.2 (-1%) | 6,150 |
| Token:mint_public | 83.0 (-3%) | 11,720 |
| Token:assert_minter_and_mint | 98.3 (+8%) | 8,028 |
| AuthRegistry:set_authorized | 8.21 (+2%) | 4,537 |
| FPC:prepare_fee | 287 (+7%) | 8,812 |
| Token:transfer_public | 32.8 (-14%) | 47,374 |
| FPC:pay_refund | 56.5 (-4%) | 12,114 |
| Benchmarking:increment_balance | 1,013 (-1%) | 7,450 |
| Token:_increase_public_balance | 10.2 (-16%) | 8,960 |
| FPC:pay_refund_with_shielded_rebate | 136 (-4%) | 12,663 |
Public DB Access
Time to access various public DBs.
| Function | time_ms |
|---|---|
| get-nullifier-index | 0.166 |
Tree insertion stats
The duration to insert a fixed batch of leaves into each tree type.
| Metric | 1 leaf | 16 leaves | 64 leaves | 128 leaves | 256 leaves | 512 leaves | 1024 leaves |
|---|---|---|---|---|---|---|---|
| batch_insert_into_append_only_tree_16_depth_ms | 2.18 (+1%) | 4.00 (+5%) | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_count | 16.8 | 31.7 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_ms | 0.112 (+1%) | 0.113 (+5%) | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_32_depth_ms | N/A | N/A | 11.3 (-1%) | 18.2 (+3%) | 31.0 (+1%) | 61.7 (+5%) | 115 (+2%) |
| batch_insert_into_append_only_tree_32_depth_hash_count | N/A | N/A | 95.9 | 159 | 287 | 543 | 1,055 |
| batch_insert_into_append_only_tree_32_depth_hash_ms | N/A | N/A | 0.108 (-2%) | 0.105 (+2%) | 0.100 (+1%) | 0.106 (+4%) | 0.102 (+3%) |
| batch_insert_into_indexed_tree_20_depth_ms | N/A | N/A | 14.3 (+1%) | 26.4 (+4%) | 43.5 | 89.3 (+8%) | 164 (+1%) |
| batch_insert_into_indexed_tree_20_depth_hash_count | N/A | N/A | 109 | 207 | 355 | 691 | 1,363 |
| batch_insert_into_indexed_tree_20_depth_hash_ms | N/A | N/A | 0.108 | 0.107 (+4%) | 0.105 (+5%) | 0.112 (+9%) | 0.104 (+1%) |
| batch_insert_into_indexed_tree_40_depth_ms | N/A | N/A | 16.8 (+3%) | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_count | N/A | N/A | 132 | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_ms | N/A | N/A | 0.108 (+2%) | N/A | N/A | N/A | N/A |
Miscellaneous
Transaction sizes based on how many contract classes are registered in the tx.
| Metric | 0 registered classes | 1 registered class |
|---|---|---|
| tx_size_in_bytes | 64,756 | 668,997 |
Transaction size based on fee payment method
| Metric | |
|---|---|