zebra
zebra copied to clipboard
Move CPU-heavy proof preparation into the batch cryptography thread
Motivation
Zebra's sync can be slow or fail, because proof batch preparation involves some CPU-heavy cryptography.
So we want to move proof preparation into the existing batch rayon
CPU thread pool scope
.
API Reference
Blocking threads: https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html
CPU-heavy threads: https://docs.rs/rayon/latest/rayon/fn.scope_fifo.html
Detailed Analysis
See https://github.com/ZcashFoundation/zebra/issues/4583#issuecomment-1186662862
Designs
When queueing batch items, store them raw, then do preparation inside the rayon
scope, for the following verifiers:
- groth16
- halo2
Existing code for rayon
in tokio
blocking threads:
https://github.com/ZcashFoundation/zebra/blob/81727d7df0ee57e6d4f2172c287e52e414c4e6a0/zebra-consensus/src/primitives/ed25519.rs#L121-L134
Related Work
Note that we wanted to move proof decoding inside tx deserialization: https://github.com/ZcashFoundation/zebra/issues/3179
I don't think this changes anything since the proposed approach in this ticket is probably easier, but I thought it was worth it pointing out (i.e. another solution would be to do #3179, and move the entire deserialization to a thread, which I think we'll do anyway in another ticket).
If we follow the proposed approach in this ticket then we should probably close #3179
Good point @conradoplg. I can take this ticket, but if we move the proof preparation to rayon
, shouldn't we still keep in mind that we should use bellman::Proof
in Groth16Proof
instead of a byte array?
Yep we need to decide which approach to use:
- Leave the byte array, and decode inside the batch verifier, as suggested in the ticket
- Replace the byte array, such that the decoding happens when deserializing, and then https://github.com/ZcashFoundation/zebra/issues/4787 will automatically solve this issue. The downside is that replacing the array requires a lot of changes (not that much, but it seems it will be a bit annoying) as described in #3179 so we may not want to do this right now.
(Thinking about it, if we do the first, we can keep #3179 open as long as we add a note to it to remember to revert the decoding inside the batch verifier)
I think I'd lean towards the second option in terms of future maintenance since it seems a bit cleaner.
I'd be interested in comparing the performance differences between the two different options.
I think at the moment we're looking for a quick fix, so we can re-test Zebra sync, and find any bugs or excessive CPU usage.
I'd be happy to change https://github.com/zcash-hackworks/zcash-test-vectors/ to generate structurally valid (but still unverifiable) Groth16 proofs, which seems to be one of the main blockers for #3179. I won't have time to do this before Zcon, though.
Right now, we need to do the quick fix of moving proof decoding into the batch rayon
threads.
After Zebra is syncing correctly, we can explore cleanup/simplification tasks like #3179.
We're also seeing stalls during checkpoint verification, which doesn't use the batch verifier code. So we might want to review the priority of this ticket.
We could check how much time proof preparation is actually taking? I'll try to open a PR with some easy-to-use functions today.
Conrado did some testing, and proof preparation takes 1-25ms.
So we don't need to do this ticket right now, but we should do it eventually, to handle large aggregated Orchard proofs.
@upbqdn how much progress have you made on this ticket so far? It's ok to finish it off and submit a PR if you want.
@upbqdn how much progress have you made on this ticket so far?
I have no implementation ready to be pushed yet.
@upbqdn how much progress have you made on this ticket so far?
I have no implementation ready to be pushed yet.
Feel free to finish it off, or to leave it for now, and move on to something that's more urgent.
It looks like no-one is working on this right now, and we've finished the epic it was a part of.
Performance improvements are not a priority for Zebra right now.