jellyfish
jellyfish copied to clipboard
ADVZ constructor take erasure_code_rate instead of payload_chunk_size
Blocking #393 ?
Current constructor: https://github.com/EspressoSystems/jellyfish/blob/8d071e2df74e5093d2b3c35c32773b09370c2452/primitives/src/vid/advz.rs#L98-L102
payload_chunk_size
= degree of polynomials. One could argue this is an implementation detail that should not be exposed to the user. Indeed, in the future we might wish to keep control over this argument for ourselves. Example: the optimization of #393 seems to require us to control this argument.
Proposal
Replace payload_chunk_size
with erasure_code_rate
. How best to specify it? Options:
-
usize
: set ton
means the rate is1/n
. (So ifn
is 3 thennum_storage_nodes
is 3x polynomial degree.) This captures most common choices but it does limit use to rates of the form1/n
- arbitrary rational number: captures all plausible choices but adds complexity. eg. How to specify the rational? Do we take 2 args (numerator & denominator)? Use an external crate such as num_rational - Rust? (ick.) This is probably overkill. I suspect restricting to 1/n rates is fine.
Let us sync with consensus before we move forward with this task.
Took the first steps here: https://github.com/EspressoSystems/jellyfish/pull/533
The PR draft replaces payload_chunk_size
with erasure_code_rate
and adapts test.
The approach is described above; the real rate is the reciprocal of erasure_code_rate
(usize: n means that r = 1/n).
Still unsure about the value of this change in itself (seems to be just another way of inputting the same parameters). Also, it requires changes downstream (Hotshot).
We could consider (here or in another issue) computing the multiplicity inside the initializer (instead of taking this as a parameter). Seems like we should be able to identify the right multiplicity (incl. distinguishing between GPU/CPU).
@ggutoski: You are welcome to continue working on this PR/branch or merely use it for inspiration.
Change of plans.
tldr;
We will not take a erasure_code_rate
arg. Instead we'll make only a minor tweak.
New proposal
Offer two constructors:
-
new
simple/generic: any VID scheme should takenum_storage_nodes
andnum_shares_needed_for_recovery
(I suggest we call itrecovery_threshold
). We choosemultiplicity
for you. -
with_multiplicity
complex/implementation-specific: allow caller to specifymultiplicity
.
Side request from @mrain : remove the existing limit on multiplicity
to better facilitate GPU.
Explanation
Currently payload_chunk_size
must be a power of 2 (until we get KZG in eval form). The original proposal is to compute payload_chunk_size = num_storage_nodes / erasure_code_rate
, which won't always give a power of 2. The easiest solution is to round that quotient to a power of 2 until this restriction is lifted in the future. But there's a problem: the user won't (easily) know how many shares are needed for recovery!
This problem highlights the difficulty with a constructor based on erasure_code_rate
. Starting to think VID scheme should take num_storage_nodes
, num_shares_needed_for_recovery
(aka payload_chunk_size
, recovery_threshold
). No objections in zulip discussion.