[Enhancement] Add compression support for PlanFragment AttachmentRequest
Why I'm doing:
#63697
What I'm doing:
CREATE DATABASE IF NOT EXISTS compression_test;
USE compression_test;
CREATE TABLE IF NOT EXISTS wide_table (
c1 VARCHAR(255),
c2 VARCHAR(255),
c3 VARCHAR(255),
c4 VARCHAR(255),
c5 VARCHAR(255),
c6 VARCHAR(255),
c7 VARCHAR(255),
c8 VARCHAR(255),
c9 VARCHAR(255),
c10 VARCHAR(255),
c11 VARCHAR(255),
c12 VARCHAR(255),
c13 VARCHAR(255),
c14 VARCHAR(255),
c15 VARCHAR(255),
c16 VARCHAR(255),
c17 VARCHAR(255),
c18 VARCHAR(255),
c19 VARCHAR(255),
c20 VARCHAR(255),
c21 VARCHAR(255),
c22 VARCHAR(255),
c23 VARCHAR(255),
c24 VARCHAR(255),
c25 VARCHAR(255),
c26 VARCHAR(255),
c27 VARCHAR(255),
c28 VARCHAR(255),
c29 VARCHAR(255),
c30 VARCHAR(255)
) DISTRIBUTED BY HASH(c1) BUCKETS 3;
INSERT INTO wide_table VALUES
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd'),
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd'),
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd');
SET enable_profile = true;
SET enable_async_profile = false;
LZ4 compression
ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_algorithm" = "lz4");
ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_threshold_bytes" = "5000");
ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_ratio_threshold" = "1.0");
SELECT * FROM compression_test.wide_table WHERE c1 = 'a';
SHOW PROFILELIST LIMIT 1;
ZSTD compression
ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_algorithm" = "zstd");
SELECT * FROM compression_test.wide_table WHERE c1 = 'a';
SHOW PROFILELIST LIMIT 1;
No compression (threshold set to a very high value so that compression never triggers)
ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_threshold_bytes" = "1000000");
SELECT * FROM compression_test.wide_table WHERE c1 = 'a';
SHOW PROFILELIST LIMIT 1;
Then I took query plans from http://FE_IP:8030/query_profile?query_id=QUERY_ID
LZ4 compression applied plan: 019af503-52a9-7cbc-bc00-15060371816cprofile.txt
ZSTD compression applied plan: 019af503-db41-7018-b4bd-7f660670a19bprofile.txt
No compression applied plan (original behaviour): 019af504-3331-7629-9384-277c85176f0bprofile.txt
Here is the comparison of the three profiles:
| Metric | No Compression | LZ4 | ZSTD | Change (LZ4) | Change (ZSTD) |
|---|---|---|---|---|---|
| DeployDataSize | 14,780 bytes | 3,014 bytes | 1,797 bytes | -79.6% | -87.8% |
| DeployCompressTime | N/A (0ms) | 0ms | 1ms | - | +1ms overhead |
| DeployWaitTime | 0ms | 1ms | 3ms | +1ms | +3ms |
| Total Deploy Time | 2ms | 4ms | 6ms | +2ms | +4ms |
| Compression Ratio | 1.0x | 4.9x | 8.2x | - | - |
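The ratio and size-reduction figures in the table can be re-derived from the three DeployDataSize values. The short check below is only a sanity calculation on the numbers reported above; the helper names are illustrative and not part of the PR.

```python
# DeployDataSize values copied from the table above (bytes).
SIZES = {"none": 14_780, "lz4": 3_014, "zstd": 1_797}


def ratio(algo: str) -> float:
    """Uncompressed size divided by compressed size."""
    return SIZES["none"] / SIZES[algo]


def reduction_pct(algo: str) -> float:
    """Percentage of bytes saved relative to the uncompressed payload."""
    return (1 - SIZES[algo] / SIZES["none"]) * 100


for algo in ("lz4", "zstd"):
    print(f"{algo}: {ratio(algo):.1f}x, -{reduction_pct(algo):.1f}%")
# prints:
# lz4: 4.9x, -79.6%
# zstd: 8.2x, -87.8%
```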
Fixes #63697
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [x] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [ ] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- [ ] 3.3
[!NOTE] Adds optional LZ4/ZSTD compression for Thrift plan fragment/short-circuit attachments with BE-side decompression and FE configs for threshold/ratio/algorithm.
- RPC/FE:
  - Add request attachment compression in `AttachmentRequest` with `lz4`/`zstd`, threshold- and ratio-based enablement; propagate `attachment_compression_type` and `uncompressed_size` via `PExecPlanFragmentRequest` and `PExecShortCircuitRequest`.
  - Apply compression in send paths: `BackendServiceClient.execPlanFragmentAsync(...)` and `ShortCircuitHybridExecutor`.
- RPC/BE:
  - Implement `decompress_attachment(...)` and use it in `_exec_plan_fragment`, `_exec_batch_plan_fragments`, and `_exec_short_circuit` to inflate attachments before Thrift deserialization.
- Protocol:
  - Extend `gensrc/proto/internal_service.proto` to include `attachment_compression_type` and `uncompressed_size` in `PExecPlanFragmentRequest`, `PExecBatchPlanFragmentsRequest`, and `PExecShortCircuitRequest`.
- Config (FE):
  - New tunables: `thrift_plan_fragment_compression_threshold_bytes`, `thrift_plan_fragment_compression_ratio_threshold`, `thrift_plan_fragment_compression_algorithm`.
- Build:
  - Add `com.github.luben:zstd-jni` dependency in Gradle/Maven.

Written by Cursor Bugbot for commit 60c53849c0219987c15f7f8005df6abfabae64c4. This will update automatically on new commits.
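The threshold- and ratio-based enablement summarized above could be sketched roughly as follows. This is not the PR's Java/C++ code: it is a minimal Python illustration that uses `zlib` from the standard library as a stand-in for LZ4/ZSTD, and it assumes that the ratio is uncompressed size divided by compressed size and that compression is kept only when the ratio exceeds the configured threshold. The `Attachment` type and `maybe_compress` helper are hypothetical; only the field and config names in the comments come from the summary.

```python
import zlib
from dataclasses import dataclass

# Hypothetical stand-ins for the FE tunables named in the summary.
THRESHOLD_BYTES = 5000   # thrift_plan_fragment_compression_threshold_bytes
RATIO_THRESHOLD = 1.0    # thrift_plan_fragment_compression_ratio_threshold


@dataclass
class Attachment:
    payload: bytes
    compression_type: str    # mirrors attachment_compression_type
    uncompressed_size: int   # mirrors uncompressed_size


def maybe_compress(serialized: bytes,
                   threshold_bytes: int = THRESHOLD_BYTES,
                   ratio_threshold: float = RATIO_THRESHOLD) -> Attachment:
    """Sender side: compress only large payloads that actually shrink."""
    size = len(serialized)
    if size < threshold_bytes:
        # Small payloads are sent as-is (assumed threshold semantics).
        return Attachment(serialized, "none", size)
    compressed = zlib.compress(serialized)  # stand-in for lz4/zstd
    if size / len(compressed) <= ratio_threshold:
        # Compression did not help enough; fall back to the raw bytes.
        return Attachment(serialized, "none", size)
    return Attachment(compressed, "zlib", size)


def decompress_attachment(att: Attachment) -> bytes:
    """Receiver side: inflate before deserializing the plan fragment."""
    if att.compression_type == "none":
        return att.payload
    data = zlib.decompress(att.payload)
    if len(data) != att.uncompressed_size:
        raise ValueError("uncompressed size mismatch")
    return data
```

Carrying the uncompressed size alongside the payload lets the receiver validate the result (and, with block-format codecs such as LZ4, size its output buffer) before handing the bytes to the deserializer.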
🧪 CI Insights
Here's what we observed from your CI run for 60c53849.
🟢 All jobs passed!
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:x: fail : 0 / 17 (00.00%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | src/service/internal_service.cpp | 0 | 17 | 00.00% | [101, 104, 105, 106, 107, 108, 113, 116, 117, 118, 119, 369, 370, 374, 484, 485, 486] |
@cursor review
@Mesut-Doner,
- The enable/disable switch should be a session variable rather than an fe.conf setting.
- The variable names can start with plan_fragment_compression.
- We already have TRANSMISSION_COMPRESSION_TYPE, so to keep naming consistent the type variable can be plan_fragment_compression_type.
| Metric | No Compression | LZ4 | ZSTD | Change (LZ4) | Change (ZSTD) |
|---|---|---|---|---|---|
| DeployCompressTime | N/A (0ms) | 0ms | 1ms | - | +1ms overhead |
| DeployWaitTime | 0ms | 1ms | 3ms | +1ms | +3ms |
| Total Deploy Time | 2ms | 4ms | 6ms | +2ms | +4ms |
| Compression Ratio | 1.0x | 4.9x | 8.2x | - | - |
Based on your test results, we should disable this feature by default.
@Mesut-Doner I would suggest testing a more extensive query, such as one involving a table with numerous partitions or tablets. This way, the query plan is more likely to exceed 100KB.
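One possible way to set up such a test is to generate the DDL for a table with many range partitions. The sketch below is only an illustration: the table name, column layout, partition count, and bucket count are assumptions, and the emitted `PARTITION BY RANGE ... VALUES LESS THAN` syntax should be checked against the target StarRocks version before use.

```python
from datetime import date, timedelta


def build_partitioned_ddl(table: str = "compression_test.partitioned_table",
                          start: date = date(2024, 1, 1),
                          num_partitions: int = 365,
                          buckets: int = 16) -> str:
    """Build a CREATE TABLE statement with one range partition per day."""
    parts = []
    for i in range(num_partitions):
        day = start + timedelta(days=i)
        upper = day + timedelta(days=1)  # exclusive upper bound of the partition
        parts.append(
            f'    PARTITION p{day:%Y%m%d} VALUES LESS THAN ("{upper.isoformat()}")'
        )
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "    dt DATE,\n"
        "    c1 VARCHAR(255)\n"
        ")\n"
        "PARTITION BY RANGE(dt) (\n"
        + ",\n".join(parts)
        + "\n)\n"
        f"DISTRIBUTED BY HASH(c1) BUCKETS {buckets};"
    )


if __name__ == "__main__":
    # 365 partitions x 16 buckets gives 5,840 tablets for the plan to reference.
    print(build_partitioned_ddl())
```

Running a full scan over such a table with profiling enabled, as in the original test steps, would show whether DeployDataSize grows past the 100KB mark mentioned above.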
