orc
orc copied to clipboard
ORC-998: Sharing compression output buffer among treeWriter: Refactoring within OutStream for portability
What changes were proposed in this pull request?
There's individual instance of OutStream
within each TreeWriter created by WriterContext#createStream
method. Within OutStream
, there are totally 3 buffers:
- current: the regular input buffer holding uncompressed, unencrypted bytes.
- compress: the output buffer holding compressed bytes
- overflow: same as "compress" but only used when the last compression output is larger than remaining capacity of compress buffer.
Potentially the compress and overflow buffer don't have to be allocated individually within each OutStream
object, but shared across all of them so to save memory allocation.
This PR is the first step for sharing the compression output buffer, which refactors internal of OutStream
and make the relevant object bundled together since they are logically related(and details of dealing with overflow doesn't have to be visible). This refactoring makes it easier to share the compression output buffer as a pass-in arguments in the follow-up PR.
Why are the changes needed?
For the context of ORC-997, this change makes the compression output buffer, as a single entity, easier to be shared and passed in from caller.
How was this patch tested?
There's no functional changes from this PR so passing all existed unit tests. Also added a new test in TestOutStream
to make sure that, the scenario where codec.compress()
returns false (meaning the compression output is larger than original input) is also covered in the unit test.
Thank you for making a PR, @autumnust .
LGTM, Nit: TestOutStream.java line 82 has an extra space
Gentle ping, @autumnust .
Gentle ping, @autumnust .
Is there interest in reviving this PR? Most work is done, just needs some absorb the comments. I can look into it
You can take over while keeping his authorship as the first commit in your PR, @mystic-lama .
@dongjoon-hyun Sent a PR #1633, not sure if it's the correct way. All commits are still owned by autumnust. Let me know it that's the correct way.