
Add support for chunking of blobs, using SHA256TREE

Open EdSchouten opened this issue 2 years ago • 2 comments

Buildbarn has invested heavily in virtual file systems: on both the worker and the client side, it is possible to lazily fault in data from the CAS. Because Buildbarn performs checksum verification where needed, random access to large files can be slow, as the entire file must be read to validate its digest. To address this, this change adds support for composing and decomposing CAS objects, using the newly added ConcatenateBlobs() and SplitBlobs() operations.

If implemented naively (e.g., using plain SHA-256), these operations would not be verifiable: given only the checksums of the smaller objects, there is no way to compute the checksum of their concatenation. This is why we suggest that these operations be used only in combination with SHA256TREE (see #235).
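To illustrate the argument above, here is a small sketch. The `tree_hash` function is an illustrative Merkle-style construction, not the actual SHA256TREE algorithm: it only demonstrates why a tree hash makes concatenation verifiable from child digests alone, while a flat hash does not.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

a, b = b"hello ", b"world"

# With a flat hash, the digest of a concatenation cannot be derived
# from the digests of its parts:
assert sha256(a + b) != sha256(sha256(a) + sha256(b))

# A Merkle-tree hash (illustrative stand-in for SHA256TREE) hashes each
# chunk as a leaf and derives each parent from its children's digests.
def tree_hash(chunks):
    nodes = [sha256(c) for c in chunks]
    while len(nodes) > 1:
        nodes = [
            sha256(nodes[i] + nodes[i + 1]) if i + 1 < len(nodes) else nodes[i]
            for i in range(0, len(nodes), 2)
        ]
    return nodes[0]

# A server can therefore validate ConcatenateBlobs() given only the
# digests of the parts, without reading the underlying data:
assert tree_hash([a, b]) == sha256(sha256(a) + sha256(b))
```

This is the property that plain SHA-256 lacks, and what makes client- and server-side validation of these requests cheap under a tree digest function.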

With these new operations present, there is no strict need to use the ByteStream protocol any longer. Writes can be performed by uploading smaller parts through BatchUpdateBlobs(), followed by calling ConcatenateBlobs(). Conversely, reads of large objects can be performed by calling SplitBlobs() and downloading the individual parts through BatchReadBlobs(). For compatibility, we still permit the ByteStream protocol to be used. This is a decision we can revisit in REv3.
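The read/write flow described above can be sketched with a toy in-memory CAS. The function names mirror the RPCs, but the bodies are hypothetical stand-ins for server behavior, not the actual protocol (digests are plain SHA-256 hex strings here for brevity):

```python
import hashlib

CHUNK = 4  # tiny chunk size for illustration; real parts would be MiB-scale

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

cas = {}  # stand-in for the CAS, keyed by digest

def batch_update_blobs(parts):
    # Upload small parts individually.
    for p in parts:
        cas[digest(p)] = p
    return [digest(p) for p in parts]

def concatenate_blobs(part_digests):
    # Server-side: compose a large object from already-uploaded parts.
    data = b"".join(cas[d] for d in part_digests)
    cas[digest(data)] = data
    return digest(data)

def split_blobs(d, size):
    # Server-side: decompose a large object into parts of at most `size`.
    data = cas[d]
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    return batch_update_blobs(parts)

# Write path: BatchUpdateBlobs() the parts, then ConcatenateBlobs().
blob = b"some large blob"
parts = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
whole = concatenate_blobs(batch_update_blobs(parts))
assert cas[whole] == blob

# Read path: SplitBlobs(), then fetch the parts (BatchReadBlobs()).
assert b"".join(cas[d] for d in split_blobs(whole, CHUNK)) == blob
```

Note that in this toy model concatenation is unverifiable, which is exactly why the real operations should be paired with SHA256TREE.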

Fixes: #178

EdSchouten avatar Nov 06 '22 22:11 EdSchouten

As #235 and #236 are in my opinion close to a state in which they can be merged, I have gone ahead and reimplemented this PR on top of #235. Changes to the previous version are as follows:

  • As suggested by @EricBurnett + @bergsieker, I have moved the {Concatenate,Split}Blobs() capabilities into a separate message. In theory you could use it in combination with any digest function. The downside of using anything other than SHA256TREE is obviously that client/server-side validation of these requests is either impossible or prohibitively expensive.
  • Related to the above, {Concatenate,Split}Blobs() now take/return hashes of small objects, instead of BLAKE3-style chaining values.
  • FindMissingBlobsRequest now also has a new split_sizes_bytes field. This permits clients to extend the lifetime of objects returned by SplitBlobs() more efficiently.

PTAL, ignoring the first commit in this PR. That one is part of #235.

EdSchouten avatar Dec 26 '22 17:12 EdSchouten

I think this idea is neat! I'm excited about combining this with the idea of transmitting variable size chunks.

I'd like to propose a slight modification: instead of repeated string small_hashes = 2, if we used repeated Digest messages, we could allow multiple digests of different sizes to be joined into a contiguous chunk. It's also a more natural fit -- small chunks that have already been uploaded to the CAS return a digest.

Separately, what do you think about returning a ConcatenateBlobsResponse from the ConcatenateBlobs call? Though the response matches BatchUpdateBlobsResponse today, it may not in the future, and it's much easier to change this now before clients are using it.

Finally, would it make sense to remove the chunk size in SplitBlobsRequest and let the server decide this value?

tylerwilliams avatar Feb 09 '23 00:02 tylerwilliams