Batch verification for range proofs
Overview
This adds batch verification for any number of range proofs. The API allows mixing proofs with different sizes (in terms of both range size n and aggregation size m), and checks that the generators provided are enough to cover max(n*m).
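For concreteness, a call site for such a batch API could look like the sketch below; RangeProof::verify_batch, Generators, VerificationError, and the exact argument shapes are assumptions for illustration, not necessarily the PR's actual signatures.

```rust
use curve25519_dalek::ristretto::CompressedRistretto;
use rand::thread_rng;

// Hypothetical call-site sketch: proofs with different range sizes `n` and
// aggregation sizes `m` go into one batch, and the generators only need to
// cover the largest n * m in the batch.
fn check_incoming_proofs(
    small: (&RangeProof, &[CompressedRistretto]), // n = 32, m = small.1.len()
    large: (&RangeProof, &[CompressedRistretto]), // n = 64, m = large.1.len()
    gens: &Generators,                            // must have capacity >= max(n * m)
) -> Result<(), VerificationError> {
    RangeProof::verify_batch(
        &[(small.0, small.1, 32usize), (large.0, large.1, 64usize)],
        gens,
        &mut thread_rng(),
    )
}
```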
Addresses #22 and replaces older PR #27.
Performance
Performance depends on the Pippenger implementation (not yet merged): https://github.com/dalek-cryptography/curve25519-dalek/pull/129
This PR does not have benchmarks for the batched verification yet.
The existing verify method is now implemented via generalized batch verification to avoid code duplication. This should add negligible overhead, but I haven't run the benchmarks since the previous version #27 (where I did not find any statistically meaningful difference).
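Using the same hypothetical shapes as the sketch above, the single-proof verify would just be a one-element batch:

```rust
// Sketch: single-proof `verify` as a batch of one, so there is only one
// verification implementation to maintain. The only extra work is the
// random batching weight, which should be negligible.
impl RangeProof {
    pub fn verify(
        &self,
        commitments: &[CompressedRistretto],
        n: usize,
        gens: &Generators,
    ) -> Result<(), VerificationError> {
        RangeProof::verify_batch(&[(self, commitments, n)], gens, &mut rand::thread_rng())
    }
}
```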
Instead of moving to a new verification module, I think it would be better to keep all of this as part of the rangeproof.
Currently, the Verification struct holds a bunch of temporary data. I'm wondering whether we can get away with not holding any temporaries, and have the verification struct act as a view into the proof data, computing things on the fly.
Related to the above is something we delayed thinking about earlier: whether the proof data should have compressed points, so that we don't have to decompress during deser and then recompress to hash during verification. This could impact whether we have to allocate temporaries, but I don't think we made a ticket for this.
@hdevalence Does it depend on upstream support for deserializing compressed points? Do we need to get creative and have a DecompressedRistretto type that holds both the compressed bytes (for fast compression) and the decompressed point, so that we can verify point validity during deserialization and also avoid extra work?
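As a sketch of that idea (the type name and methods are hypothetical, not something curve25519-dalek provides today):

```rust
use curve25519_dalek::ristretto::{CompressedRistretto, RistrettoPoint};

// Hypothetical: a point that remembers its wire encoding, so deserialization
// validates once and hashing never has to re-compress.
pub struct DecompressedRistretto {
    compressed: CompressedRistretto,
    point: RistrettoPoint,
}

impl DecompressedRistretto {
    pub fn from_bytes(bytes: &[u8; 32]) -> Option<Self> {
        let compressed = CompressedRistretto(*bytes);
        // `decompress` returns None for invalid encodings, so validity is
        // checked exactly once, at deserialization time.
        let point = compressed.decompress()?;
        Some(DecompressedRistretto { compressed, point })
    }

    pub fn as_compressed(&self) -> &CompressedRistretto {
        &self.compressed
    }

    pub fn as_point(&self) -> &RistrettoPoint {
        &self.point
    }
}
```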
We could just use compressed points in the struct, and call .map(|p| p.decompress()), with some extra iterator combinators to handle the None case.
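A minimal sketch of that, using only the existing CompressedRistretto::decompress API and Option's FromIterator short-circuiting:

```rust
use curve25519_dalek::ristretto::{CompressedRistretto, RistrettoPoint};

// Sketch: decompress a slice of compressed points, short-circuiting to None
// on the first invalid encoding.
fn decompress_all(points: &[CompressedRistretto]) -> Option<Vec<RistrettoPoint>> {
    points.iter().map(|p| p.decompress()).collect()
}
```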
I rewrote the verification so that the API does not expose the helper object with temporaries.
Just wondering, what's the motivation for making the batch verification API mix all kinds of proofs in a single batch, instead of requiring that the batch contains only the same type of proof?
I'm worried that there's additional complexity and room for mistakes (both in the implementation and in the code that uses it) in mixing proof types, and I'm also not sure what kind of use case that would be important for. Even supposing you had a protocol with different kinds of proofs, you could alternatively create a batch of proofs of type 1, a batch of proofs of type 2, etc.
The code looks good, but I am also wondering the same as above ^ - if we don't mix the different proof sizes, it seems like some of the harder-to-follow logic can be removed (e.g. figuring out the max generator sizes, the split of n*m into n*m, 1, padding proofs).
I think the mix of differently-sized (m) aggregated proofs is very realistic, while a mix of differently-sized ranges (n) is very unlikely.
Why varying m is very likely: bulletproofs create an incentive to aggregate proofs, both for better privacy (coinjoin) and for minimized cost. The process is interactive, and we can expect all sorts of differently sized aggregations in the wild. Since all we care about is n*m, supporting varying ns comes at no cost if we support varying ms. Batching all proofs into one giant expression, instead of grouping them into smaller batches, should give a non-trivial performance advantage, since Pippenger scales sub-linearly and something hypothetical like "verification on a GPU" wants a large amount of work and has higher-than-trivial setup latency (per proof).
That said, I think the PR wouldn't be noticeably simpler if we fixed m and n. We'd still have to collect verification scalars in vecs, and verification objects in one single vector, because m wouldn't be statically defined anyway and we need to iterate over per-proof data several times: (1) combining scalars on the static bases and (2, 3) concatenating the dynamic bases and scalars.
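For concreteness, here is a minimal sketch of that folding pattern, assuming each proof has already been reduced to per-proof scalars and points; PerProofTerms and its fields are hypothetical and do not reflect the PR's internal layout:

```rust
use curve25519_dalek::ristretto::RistrettoPoint;
use curve25519_dalek::scalar::Scalar;
use curve25519_dalek::traits::{Identity, VartimeMultiscalarMul};
use rand::rngs::OsRng;

// Hypothetical per-proof data: every proof reduces to a linear combination
// over the shared (static) generators plus its own (dynamic) points, which
// must equal the identity.
struct PerProofTerms {
    static_scalars: Vec<Scalar>,
    dynamic_scalars: Vec<Scalar>,
    dynamic_points: Vec<RistrettoPoint>,
}

// Fold every proof's check into one multiscalar multiplication, weighting
// each proof by an independent random scalar so terms from one proof cannot
// cancel against another's.
fn verify_batch(proofs: &[PerProofTerms], static_points: &[RistrettoPoint]) -> bool {
    let mut rng = OsRng;
    let mut static_scalars = vec![Scalar::from(0u8); static_points.len()];
    let mut dynamic_scalars = Vec::new();
    let mut dynamic_points = Vec::new();

    for proof in proofs {
        let weight = Scalar::random(&mut rng);
        // (1) combine scalars on the static bases
        for (acc, s) in static_scalars.iter_mut().zip(&proof.static_scalars) {
            *acc += weight * s;
        }
        // (2, 3) concatenate the dynamic bases and scalars
        dynamic_scalars.extend(proof.dynamic_scalars.iter().map(|s| weight * s));
        dynamic_points.extend_from_slice(&proof.dynamic_points);
    }

    // One big Pippenger-friendly expression for the whole batch.
    RistrettoPoint::vartime_multiscalar_mul(
        static_scalars.iter().chain(dynamic_scalars.iter()),
        static_points.iter().chain(dynamic_points.iter()),
    ) == RistrettoPoint::identity()
}
```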
Re: n*m, 1 - maybe we should simplify the generators API to take a single usize, which will be filled in by n*m in the RP protocol?
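If the generators API went that way, a stand-in could be as small as the sketch below; the struct, its fields, and the derivation are assumptions for illustration, not the crate's actual design.

```rust
use curve25519_dalek::ristretto::RistrettoPoint;
use sha2::Sha512;

// Stand-in for a simplified generators API: a single flat capacity, which
// the range-proof protocol would fill in as n * m. The derivation below is
// purely illustrative and not the crate's actual scheme.
pub struct Generators {
    pub g_vec: Vec<RistrettoPoint>,
    pub h_vec: Vec<RistrettoPoint>,
}

impl Generators {
    pub fn new(capacity: usize) -> Self {
        let gen = |label: &str, i: usize| {
            RistrettoPoint::hash_from_bytes::<Sha512>(format!("{}-{}", label, i).as_bytes())
        };
        Generators {
            g_vec: (0..capacity).map(|i| gen("G", i)).collect(),
            h_vec: (0..capacity).map(|i| gen("H", i)).collect(),
        }
    }
}
```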
The point about varying aggregation sizes (m) is convincing, and that seems like something we should support in the future. But as things are now, I think it actually violates the contract of the Generators struct, which contains generators for aggregated range proofs of size (m, n). That API isn't designed for varying m, and I think the awkwardness about doing ad-hoc m*n's etc. is really a symptom of trying to use the Generators in a way that it wasn't designed for. If we would like to support varying m, I think we should redesign the Generators API around the idea that m can vary, and then update the rest of the code accordingly.
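If the Generators API were redesigned around varying m, one hypothetical shape (reusing the stand-in struct from the sketch above) is a method that hands each proof a view over just the generator pairs it needs:

```rust
// Hypothetical: each proof in a batch borrows the first n * m generator
// pairs, so one allocation serves proofs with varying m (and n).
impl Generators {
    pub fn share(&self, n: usize, m: usize) -> Option<(&[RistrettoPoint], &[RistrettoPoint])> {
        let size = n * m;
        if size > self.g_vec.len() {
            return None; // the allocated generators don't cover this proof
        }
        Some((&self.g_vec[..size], &self.h_vec[..size]))
    }
}
```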
Hi @oleganza, what's the status of this PR? One suggestion is to have two APIs: one for batches with the same m, and one for variable m.