
Add a binary format that supports partial reading and self-verification as a storage format

Open darkskygit opened this issue 2 years ago • 2 comments

ybinary v1 is a binary format optimized for one-time network transmission.

It supports only whole-document reads, and there is no way to tell whether the binary is damaged until the read itself fails.

For a detailed analysis, please refer to this review:

https://github.com/toeverything/OctoBase/issues/383#issuecomment-1513577058

We need to design a binary format that supports partial reading and self-verification so that CRDT state can be stored permanently and robustly.
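To make the two requirements concrete, here is a minimal sketch of one possible shape for such a format. Everything in it is an assumption for illustration: the record layout, the names `writeRecords` and `readRecord`, and the use of FNV-1a as a simple stand-in for a real checksum. It is not a proposed y-octo format.

```javascript
// Hypothetical container: a sequence of records, each laid out as
// [4-byte big-endian payload length][4-byte checksum][payload].
// FNV-1a is only a stand-in here; a real format would pick a proper checksum.
function fnv1a (bytes) {
  let h = 0x811C9DC5
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i]
    h = Math.imul(h, 0x01000193)
  }
  return h >>> 0
}

// Serializes an array of Uint8Array payloads into one buffer.
function writeRecords (payloads) {
  const total = payloads.reduce((n, p) => n + 8 + p.length, 0)
  const out = new Uint8Array(total)
  const view = new DataView(out.buffer)
  let pos = 0
  for (const p of payloads) {
    view.setUint32(pos, p.length)
    view.setUint32(pos + 4, fnv1a(p))
    out.set(p, pos + 8)
    pos += 8 + p.length
  }
  return out
}

// Reads only the record at `index`: earlier payloads are skipped via their
// length headers, and only the requested payload is verified.
function readRecord (file, index) {
  const view = new DataView(file.buffer, file.byteOffset, file.byteLength)
  let pos = 0
  for (let i = 0; ; i++) {
    if (pos + 8 > file.length) throw new Error('record not found or header truncated')
    const len = view.getUint32(pos)
    if (i === index) {
      const payload = file.subarray(pos + 8, pos + 8 + len)
      if (payload.length !== len || view.getUint32(pos + 4) !== fnv1a(payload)) {
        throw new Error('record ' + index + ' is damaged')
      }
      return payload
    }
    pos += 8 + len
  }
}
```

With this layout a damaged payload in one record does not prevent reading the others. The length headers themselves are unprotected in this sketch, so a real design would also need to cover them, for example with a header checksum or an index.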

darkskygit avatar Aug 23 '23 08:08 darkskygit

Following the advice from @dmonad, we can store the checksum info in the y-binary itself.

Brooooooklyn avatar Aug 30 '23 14:08 Brooooooklyn

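You can create a new (custom) binary "v1-with-checksum" by concatenating the checksum and the binary update. For example:

import * as encoding from 'lib0/encoding'

// ChecksumType (a one-byte tag) and checksum() are placeholders that you define yourself.
doc.on('update', update => {
  const v1UpdateWithChecksum = encoding.encode(encoder => {
    encoding.writeUint8(encoder, ChecksumType) // which checksum algorithm was used
    encoding.writeVarUint8Array(encoder, checksum(update)) // length-prefixed checksum bytes
    encoding.writeVarUint8Array(encoder, update) // length-prefixed v1 update
  })
})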

I imagine that most users don't want to verify every single update and re-request the data from another source if an update has been manipulated. So maybe you could store an error-correcting CRC checksum instead of something like SHA or Rabin.
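As a rough illustration of the read side of the "v1-with-checksum" idea, the sketch below wraps an update with a CRC-32 and verifies it on read. The fixed byte layout, the tag value `CHECKSUM_CRC32`, and the names `wrapUpdate` and `unwrapUpdate` are assumptions made for this example (it does not use the lib0 encoder), and it only detects damage; it does not attempt to correct it.

```javascript
// Sketch only: the layout and names are assumptions, not a y-octo or Yjs format.
const CHECKSUM_CRC32 = 1 // hypothetical tag value meaning "CRC-32"

// Lookup table for the standard reflected CRC-32 (polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256)
for (let n = 0; n < 256; n++) {
  let c = n
  for (let k = 0; k < 8; k++) c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1)
  CRC_TABLE[n] = c >>> 0
}

function crc32 (bytes) {
  let crc = 0xFFFFFFFF
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8)
  }
  return (crc ^ 0xFFFFFFFF) >>> 0
}

// Layout: [1-byte checksum type][4-byte big-endian CRC-32][update bytes]
function wrapUpdate (update) {
  const out = new Uint8Array(5 + update.length)
  out[0] = CHECKSUM_CRC32
  new DataView(out.buffer).setUint32(1, crc32(update))
  out.set(update, 5)
  return out
}

// Returns the original update bytes, or throws if the record is damaged.
function unwrapUpdate (record) {
  if (record.length < 5 || record[0] !== CHECKSUM_CRC32) {
    throw new Error('unknown or truncated record')
  }
  const view = new DataView(record.buffer, record.byteOffset, record.byteLength)
  const update = record.subarray(5)
  if (view.getUint32(1) !== crc32(update)) throw new Error('checksum mismatch')
  return update
}
```

A caller would pass the result of `unwrapUpdate` to the normal update-applying path only when no error is thrown, so damage is noticed before decoding starts rather than partway through it.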

dmonad avatar Aug 30 '23 17:08 dmonad