kafka-protocol-rs

Improving efficiency of encoding ProduceRequests

Open eblocha opened this issue 10 months ago • 5 comments

Hello!

I'm wondering if there's any way to avoid a double copy I'm currently doing when encoding produce requests. Right now I create a new BytesMut to pass to RecordBatchEncoder::encode, then pass that into PartitionProduceData. At the TCP sink, those bytes are then copied again into the sink's internal buffer.

let mut records = BytesMut::new();

RecordBatchEncoder::encode(
    &mut records,
    prepared_records.iter(), // Vec<Record>
    ...
);

// error handling...

// this gets added to the ProduceRequest
let partition_data = PartitionProduceData::default()
    .with_index(partition)
    .with_records(Some(records.into()));

However, when this request is sent to the IO sink, it gets encoded into the internal buffer of a FramedWrite (from tokio_util). After that encode, the BytesMut I made is dropped, and I need to allocate a new one for the next request.

I'm wondering if there'd be a way to avoid this initial encode step into the intermediate byte array in PartitionProduceData, and do RecordBatchEncoder::encode on the buffer for FramedWrite, skipping the extra allocation.
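To make the idea concrete, here is a rough sketch of encoding straight into the destination buffer. It does not use the kafka-protocol-rs API: `encode_records` and `encode_length_prefixed` are hypothetical helpers, a plain `Vec<u8>` stands in for the FramedWrite buffer, and it assumes the records field is framed with a fixed 4-byte big-endian length prefix (a varint-length framing would need a different approach, since the prefix size is not known up front).

```rust
/// Hypothetical stand-in for a record batch encoder.
/// This is NOT RecordBatchEncoder::encode; it just appends raw bytes.
fn encode_records(dst: &mut Vec<u8>, records: &[&str]) {
    for record in records {
        dst.extend_from_slice(record.as_bytes());
    }
}

/// Hypothetical helper: writes a 4-byte big-endian length prefix followed by
/// the payload, with the payload encoded directly into `dst`.
/// No intermediate buffer is allocated for the payload.
fn encode_length_prefixed<F>(dst: &mut Vec<u8>, encode_payload: F)
where
    F: FnOnce(&mut Vec<u8>),
{
    // Reserve space for the length; the real value is not known yet.
    let len_pos = dst.len();
    dst.extend_from_slice(&[0u8; 4]);

    // Encode the payload in place, right after the placeholder.
    let start = dst.len();
    encode_payload(dst);

    // Backpatch the placeholder with the number of payload bytes written.
    let len = (dst.len() - start) as i32;
    dst[len_pos..start].copy_from_slice(&len.to_be_bytes());
}
```

Called as `encode_length_prefixed(&mut frame, |buf| encode_records(buf, &["ab", "cde"]))`, this appends the 4-byte length and then the record bytes to whatever `frame` already holds, so the records are written once and never copied out of a temporary buffer.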

eblocha, Jan 28 '25 23:01