greptimedb
greptimedb copied to clipboard
feat: add a size limit on a record batch
What problem does the new feature solve?
Currently, there is no size limit set on a record batch, which has become problematic due to the constraints imposed by the Tonic crate on message sizes for both receiving and sending. Tonic employs a default size limit of 4MB for received messages, defined as the DEFAULT_MAX_RECV_MESSAGE_SIZE
constant, and a default maximum send message size of 2^64 - 1
bytes, defined as the DEFAULT_MAX_SEND_MESSAGE_SIZE
constant.
To address this issue, it's crucial to set a size limit on a record batch to prevent it from exceeding 4MB.
What does the feature do?
Introduce a size limit on a record batch.
Implementation challenges
- Set a default maximum size for a record batch, which could be either a fixed constant or a configurable option.
- Utilize this size limit across the codebase to prevent any operations that could lead to the expansion of a record batch beyond the established threshold.
- Develop a set of unit tests to verify the functionality of the size limit.
Since the unit of transmission is a flight data which may contain at least one record batch, the size limit shall be much more conservative than the default maximum size set by Tonic.
I'm interested in this feature.
Firstly, I have a question to ask, is this feature for server security by limiting the grpc request/response size?
If that's the case. I can see 2 ways to achieve this:
- Limit http2 frame size (transport level)
- Limit grpc service encoding/decoding message size (service level)
@MichaelScofield PTAL
@TheWaWaR Thx for your interest. It's not about security. It's a configuration on request/response's size limit in Tonic that should be considered.
I think the service level limitation is enough.
By searching the codebase I find max_decoding_message_size
already applied to grpc clients. So if I understand correctly the rest tasks should include:
- Make grpc client
max_encoding_message_size
configurable - Make grpc server
max_decoding_message_size
/max_encoding_message_size
configurable - Add tests for both client/server to prove the configuration works
@TheWaWaR You are right.
@TheWaWaR thx, I've merged your PR. However, to close this issue, I think it's important to set a limit on RecordBatches, too. The RecordBatches are popped from DataFusion, so it requires additional efforts to investigate how to restrict their size. Are you willing to continue to do that?
@MichaelScofield
Are you willing to continue to do that?
Yes, why not.
So, I have some new questions:
First, how to define the size of a RecordBatches
? There is no Serialize, Deserialize
implementation for RecordBatches
, but HttpRecordsOutput
have Serialize, Deserialize
implementation. And serialization can have multiple target formats such as json/msgpack/bincode. So the first question should be proper answered.
Second, what's the intension to limit the size of a RecordBatches
? Wherever RecordBatches
serialized/deserialized, currently it only transferred use grpc, why limit the grpc message size is not enough? Are there other places RecordBatches
used and the size should be limited (other than grpc)? Or let me ask the other way: what are the cases when RecordBatches
size not limited become a problem?
@TheWaWaR sorry for the late reply. The size of a RecordBatch is indeed hard to estimate. The best guess might be sum up its vectors' memory_size
.
As to the second question, now our grpc interface does have the size limit. However, what if we feed a huge RecordBatch to the grpc interface, will it segment the input inside to fit its size limit? A quick googling suggests that Tonic does not have this type of feature. So I guess it would be a problem if the underlying query engine pops a huge RecordBatch that is larger than the Tonic's size limit. It would be simply failed.
That said, limiting the size of RecordBatch (or segment it) is hard. I think a more proper place to do it is in FlightRecordBatchStream
. When the encoded flight data is over the grpc size limit, segment it. This way we avoid calculating the size of RecordBatch. And since FlightData
is a plain protobuf message, segmenting it should be easier.