
Support of large files for grpc_kvstore

Open zlobober opened this issue 1 year ago • 1 comments

This issue concerns the grpc_kvstore driver, which is not currently available in vanilla tensorstore but was discussed in https://github.com/google/tensorstore/pull/134 and is going to become public.

ML teams at my company use a privately patched version of tensorstore with the grpc_kvstore driver, whose backend is implemented by YTsaurus. The current gRPC protocol has the following problem: both Read and Write requests are one-shot, which limits the length of a value to 2 GiB, the fundamental upper limit on Protobuf message size. More generally, reading or writing a large blob within a single request is a bad idea, because it is not fault-tolerant.

Our proposal is to make Read a server-side streaming method and Write a client-side streaming method, limiting the size of a single message to something reasonable like 32 MiB. It seems that this change can be made in a backward-incompatible manner, as the grpc_kvstore interface is not yet public or stable.
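
To make the idea concrete, here is a minimal sketch of the chunking that a streaming Write (or Read) would rely on. It is plain Python and does not use the real grpc_kvstore protocol; the function names `split_into_chunks` and `reassemble` are hypothetical, and sending a single empty chunk for an empty value is an assumption of this sketch, not part of any existing interface.

```python
from typing import Iterable, Iterator

# 32 MiB, the per-message limit suggested in this proposal.
DEFAULT_CHUNK_SIZE = 32 * 1024 * 1024


def split_into_chunks(value: bytes,
                      chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive pieces of `value`, each at most `chunk_size` bytes.

    Hypothetical helper: a streaming sender would wrap each piece in one
    stream message instead of sending the whole value in a single request.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not value:
        # Assumption: an empty value still produces one (empty) message.
        yield b""
        return
    view = memoryview(value)  # slice without copying the whole value
    for offset in range(0, len(view), chunk_size):
        yield bytes(view[offset:offset + chunk_size])


def reassemble(chunks: Iterable[bytes]) -> bytes:
    """Concatenate received chunks back into the original value."""
    return b"".join(chunks)


if __name__ == "__main__":
    pieces = list(split_into_chunks(b"abcdefghij", chunk_size=4))
    print(pieces)
    print(reassemble(pieces))
```

With this shape, no single message exceeds the chosen limit regardless of the total value size, so the 2 GiB Protobuf ceiling no longer bounds the value.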

If you are OK with this proposal, we would be glad to submit a PR implementing this idea.

zlobober avatar Feb 16 '24 14:02 zlobober

I concur that read and write need to be streaming gRPC APIs; the gcs_grpc driver is likely a reasonable model for the kvstore driver code.

Note that the default value of GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH on a gRPC server is 4 MB, so the chunk size should be configurable.
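
As a rough illustration of why configurability matters, the sketch below clamps a requested chunk size to what a peer with a given receive limit can accept. The helper name `effective_chunk_size` and the 64 KiB reserved for message framing and metadata are assumptions made for this example only; the 4 MiB constant reflects the gRPC default receive limit mentioned above.

```python
# Default gRPC maximum receive message length (4 MiB).
GRPC_DEFAULT_MAX_RECEIVE_BYTES = 4 * 1024 * 1024


def effective_chunk_size(requested: int,
                         max_receive_bytes: int = GRPC_DEFAULT_MAX_RECEIVE_BYTES,
                         reserved_bytes: int = 64 * 1024) -> int:
    """Return the largest usable chunk size not exceeding `requested`.

    `reserved_bytes` is a hypothetical allowance for non-payload fields in
    each message; the real overhead depends on the message definition.
    """
    if requested <= 0:
        raise ValueError("requested chunk size must be positive")
    budget = max_receive_bytes - reserved_bytes
    if budget <= 0:
        raise ValueError("receive limit too small for the reserved overhead")
    return min(requested, budget)


if __name__ == "__main__":
    # A 32 MiB request does not fit under the default server limit.
    print(effective_chunk_size(32 * 1024 * 1024))
    # Raising the server limit to 64 MiB lets the full 32 MiB through.
    print(effective_chunk_size(32 * 1024 * 1024,
                               max_receive_bytes=64 * 1024 * 1024))
```

Under the default limit, a 32 MiB chunk would be rejected by the receiver, so either the chunk size has to shrink or the server's receive limit has to be raised alongside it.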

laramiel avatar Feb 17 '24 23:02 laramiel