chiapos icon indicating copy to clipboard operation
chiapos copied to clipboard

PROPOSAL: Reorder Deltas and Stubs to reduce Disk I/O

Open mgraczyk opened this issue 4 years ago • 2 comments

Currently, the plot file format serializes line point parks as a sequence of stubs, followed by encoded deltas.

[size, stub0, stub1, ..., deltas_size, deltas, padding]

Optimized provers can read each a line point in the park with a single ~8500 byte read starting at the beginning of the park. However, only half of the stubs are actually used on average.

It is possible to read only the necessary stubs and deltas using two disk reads, the first of ~3700 bytes on average and the second of 800-1000, depending on the table. However, this requires two reads instead of one and increases proving latency.

If the park format were reordered to include the variable-sized deltas before the fixed-size stubs, then the prover could perform one ~4700 byte read on average, per line point.

[size, deltas_size, deltas..., stub0, stub1, ..., padding]

This works because the stubs have a known size, and the position is known when reading the park from disk.

This reduces proving latency, especially on slow disks or managed storage services.

mgraczyk avatar Jun 17 '21 18:06 mgraczyk

'This issue has been flagged as stale as there has been no activity on it in 14 days. If this issue is still affecting you and in need of review, please update it to keep it open.'

github-actions[bot] avatar Aug 12 '21 11:08 github-actions[bot]

This is a great idea! Changing the plot format is not easy due to backwards compatibility, but maybe we can get this in for the next version of the plot format.

mariano54 avatar Jan 24 '22 18:01 mariano54

We do have a new plot format coming out, but this particular thing is not applicable to that format, so I'll close this for now.

wallentx avatar Dec 01 '22 17:12 wallentx