ORC-1711: [C++] Introduce a memory block size parameter for writer option
What changes were proposed in this pull request?
- Add a memory block size parameter to the writer options, which sets the block size used to initialize the compressed stream's input buffer
- The compressed stream keeps accumulating data in the input buffer until the input buffer size reaches the compression block size, so the compressed stream can start with a minimal initial memory footprint.
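
The buffering behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual ORC C++ implementation: the class name BlockedInputBuffer, its methods, and the choice to reuse already allocated blocks after a flush are all assumptions made for the example, and the compression step itself is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch only; the names here are not taken from the ORC code base.
class BlockedInputBuffer {
 public:
  BlockedInputBuffer(std::size_t compressionBlockSize, std::size_t memoryBlockSize)
      : compressionBlockSize_(compressionBlockSize), memoryBlockSize_(memoryBlockSize) {}

  // Appends len bytes. A new memory block is allocated only when the
  // buffered data spills past the blocks allocated so far. Returns how many
  // times the buffer reached the compression block size and would have been
  // handed to the compressor (omitted in this sketch).
  std::size_t write(const std::uint8_t* data, std::size_t len) {
    std::size_t flushes = 0;
    while (len > 0) {
      const std::size_t blockIndex = size_ / memoryBlockSize_;
      const std::size_t offset = size_ % memoryBlockSize_;
      if (blockIndex == blocks_.size()) {
        blocks_.emplace_back(memoryBlockSize_);  // grow by one memory block
      }
      // Copy no more than fits in this block, in the compression block,
      // or in the remaining input.
      const std::size_t room =
          std::min({memoryBlockSize_ - offset, compressionBlockSize_ - size_, len});
      std::memcpy(blocks_[blockIndex].data() + offset, data, room);
      size_ += room;
      data += room;
      len -= room;
      if (size_ == compressionBlockSize_) {
        ++flushes;  // a full compression block is ready
        size_ = 0;  // assumption: allocated blocks are kept and reused
      }
    }
    return flushes;
  }

  std::size_t size() const { return size_; }
  std::size_t allocatedBytes() const { return blocks_.size() * memoryBlockSize_; }

 private:
  std::size_t compressionBlockSize_;
  std::size_t memoryBlockSize_;
  std::size_t size_ = 0;  // bytes currently buffered
  std::vector<std::vector<std::uint8_t>> blocks_;
};
```

With a compression block size of 8 and a memory block size of 4 (toy values), writing 3 bytes allocates only one 4-byte block instead of the full 8 bytes.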
Why are the changes needed?
This change separates the input buffer size from the compression block size, so the input buffer no longer has to start at the full compression block size.
How was this patch tested?
The unit tests in TestCompression.cc and TestWriter.cc cover this patch.
@luffy-zh Thank you for making the PR. Could you explain why we need outputStream->getRawInputBufferSize() to record the position? It seems that flushedSize and bufferPosition weren't changed.
When we call recordPosition(), we need to record both the output buffer length and the input buffer length. The input buffer can now hold multiple blocks, and bufferPosition only records the effective length of the last block. That is why we need to use getRawInputBufferSize() from the outputStream.
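
To illustrate the point, here is a rough sketch of why the last block's position alone is not enough. The struct StreamState, its fields fullBlocks and blockSize, and the free function recordPosition are hypothetical stand-ins for this example; only the names flushedSize, bufferPosition, and the idea behind getRawInputBufferSize() come from the discussion above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the stream state; not the actual ORC classes.
struct StreamState {
  std::uint64_t flushedSize;     // compressed bytes already written to the output
  std::uint64_t blockSize;       // size of one input memory block (assumed name)
  std::uint64_t fullBlocks;      // input blocks that are completely filled (assumed name)
  std::uint64_t bufferPosition;  // valid bytes in the last, partially filled block

  // Total uncompressed bytes waiting in the input buffer, across all blocks.
  std::uint64_t rawInputBufferSize() const {
    return fullBlocks * blockSize + bufferPosition;
  }
};

// Records the output length and the total input length. Using bufferPosition
// alone would drop the bytes held in the earlier, full blocks.
std::vector<std::uint64_t> recordPosition(const StreamState& s) {
  return {s.flushedSize, s.rawInputBufferSize()};
}
```

For example, with two full 4-byte blocks and 3 bytes in the last block, the recorded input length is 11, whereas bufferPosition alone would give 3.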
Makes sense.
@luffy-zh Thank you.