[Enhancement] Add compression support for PlanFragment AttachmentRequest

Open Mesut-Doner opened this issue 3 weeks ago • 8 comments

Why I'm doing:

#63697

What I'm doing:

CREATE DATABASE IF NOT EXISTS compression_test;
USE compression_test;

CREATE TABLE IF NOT EXISTS wide_table (
    c1 VARCHAR(255),
    c2 VARCHAR(255),
    c3 VARCHAR(255),
    c4 VARCHAR(255),
    c5 VARCHAR(255),
    c6 VARCHAR(255),
    c7 VARCHAR(255),
    c8 VARCHAR(255),
    c9 VARCHAR(255),
    c10 VARCHAR(255),
    c11 VARCHAR(255),
    c12 VARCHAR(255),
    c13 VARCHAR(255),
    c14 VARCHAR(255),
    c15 VARCHAR(255),
    c16 VARCHAR(255),
    c17 VARCHAR(255),
    c18 VARCHAR(255),
    c19 VARCHAR(255),
    c20 VARCHAR(255),
    c21 VARCHAR(255),
    c22 VARCHAR(255),
    c23 VARCHAR(255),
    c24 VARCHAR(255),
    c25 VARCHAR(255),
    c26 VARCHAR(255),
    c27 VARCHAR(255),
    c28 VARCHAR(255),
    c29 VARCHAR(255),
    c30 VARCHAR(255)
) DISTRIBUTED BY HASH(c1) BUCKETS 3;

INSERT INTO wide_table VALUES 
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd'),
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd'),
('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','aa','bb','cc','dd');

SET enable_profile = true;
SET enable_async_profile = false;

LZ4 compression

ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_algorithm" = "lz4");

ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_threshold_bytes" = "5000");

ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_ratio_threshold" = "1.0");

SELECT * FROM compression_test.wide_table WHERE c1 = 'a';

SHOW PROFILELIST LIMIT 1;

ZSTD compression

ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_algorithm" = "zstd");

SELECT * FROM compression_test.wide_table WHERE c1 = 'a';

SHOW PROFILELIST LIMIT 1;

No compression (threshold set very high so that compression is never triggered)


ADMIN SET FRONTEND CONFIG ("thrift_plan_fragment_compression_threshold_bytes" = "1000000");

SELECT * FROM compression_test.wide_table WHERE c1 = 'a';

SHOW PROFILELIST LIMIT 1;

I then fetched the query profile for each run from http://FE_IP:8030/query_profile?query_id=QUERY_ID

LZ4 compression applied: 019af503-52a9-7cbc-bc00-15060371816cprofile.txt

ZSTD compression applied: 019af503-db41-7018-b4bd-7f660670a19bprofile.txt

No compression applied (original behaviour): 019af504-3331-7629-9384-277c85176f0bprofile.txt

Comparison of the three profiles:

Metric | No Compression | LZ4 | ZSTD | Improvement (LZ4) | Improvement (ZSTD)
-- | -- | -- | -- | -- | --
DeployDataSize | 14,780 bytes | 3,014 bytes | 1,797 bytes | -79.6% | -87.8%
DeployCompressTime | N/A (0ms) | 0ms | 1ms | - | +1ms overhead
DeployWaitTime | 0ms | 1ms | 3ms | +1ms | +3ms
Total Deploy Time | 2ms | 4ms | 6ms | +2ms | +4ms
Compression Ratio | 1.0x | 4.9x | 8.2x | - | -
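
As a quick sanity check, the size reductions and compression ratios in the comparison can be recomputed from the three DeployDataSize values. The snippet below is only illustrative arithmetic in Python; the variable and function names are made up for this check and are not part of the PR.

```python
# Recompute the size change and compression ratio from the DeployDataSize
# values reported in the comparison (bytes).
deploy_data_size = {"none": 14_780, "lz4": 3_014, "zstd": 1_797}


def size_change_percent(baseline: int, compressed: int) -> float:
    """Relative size change versus the baseline, in percent (negative = smaller)."""
    return round((compressed - baseline) / baseline * 100, 1)


def compression_ratio(baseline: int, compressed: int) -> float:
    """Uncompressed size divided by compressed size."""
    return round(baseline / compressed, 1)


baseline = deploy_data_size["none"]
summary = {
    algo: (size_change_percent(baseline, size), compression_ratio(baseline, size))
    for algo, size in deploy_data_size.items()
    if algo != "none"
}
print(summary)
```

The recomputed values agree with the reported -79.6% / 4.9x for LZ4 and -87.8% / 8.2x for ZSTD.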

Fixes #63697

What type of PR is this:

  • [ ] BugFix
  • [ ] Feature
  • [x] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [ ] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 4.0
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

[!NOTE] Adds optional LZ4/ZSTD compression for Thrift plan fragment/short-circuit attachments with BE-side decompression and FE configs for threshold/ratio/algorithm.

  • RPC/FE:
    • Add request attachment compression in AttachmentRequest with lz4/zstd, threshold- and ratio-based enablement; propagate attachment_compression_type and uncompressed_size via PExecPlanFragmentRequest and PExecShortCircuitRequest.
    • Apply compression in send paths: BackendServiceClient.execPlanFragmentAsync(...) and ShortCircuitHybridExecutor.
  • RPC/BE:
    • Implement decompress_attachment(...) and use it in _exec_plan_fragment, _exec_batch_plan_fragments, and _exec_short_circuit to inflate attachments before Thrift deserialization.
  • Protocol:
    • Extend gensrc/proto/internal_service.proto to include attachment_compression_type and uncompressed_size in PExecPlanFragmentRequest, PExecBatchPlanFragmentsRequest, and PExecShortCircuitRequest.
  • Config (FE):
    • New tunables: thrift_plan_fragment_compression_threshold_bytes, thrift_plan_fragment_compression_ratio_threshold, thrift_plan_fragment_compression_algorithm.
  • Build:
    • Add com.github.luben:zstd-jni dependency in Gradle/Maven.

Written by Cursor Bugbot for commit 60c53849c0219987c15f7f8005df6abfabae64c4.
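
The threshold- and ratio-based enablement summarized above can be sketched roughly as follows. This is a minimal Python illustration, not the PR's Java/C++ code, and it rests on several assumptions: `zlib` from the standard library stands in for LZ4/ZSTD, the ratio is taken to mean uncompressed size divided by compressed size, and the names `Attachment`, `maybe_compress`, and `restore` are hypothetical. The default values mirror the FE settings used in the test script (`5000` bytes, ratio `1.0`).

```python
import zlib
from dataclasses import dataclass


@dataclass
class Attachment:
    payload: bytes
    compression_type: str  # "none" or the codec name
    uncompressed_size: int


def maybe_compress(data: bytes, threshold_bytes: int = 5000,
                   ratio_threshold: float = 1.0) -> Attachment:
    """Compress only if the payload is large enough and shrinks enough."""
    if len(data) < threshold_bytes:
        return Attachment(data, "none", len(data))
    compressed = zlib.compress(data)  # stand-in codec (assumption)
    ratio = len(data) / len(compressed)  # assumed meaning of "ratio"
    if ratio < ratio_threshold:
        # Not worth it: send the original bytes unchanged.
        return Attachment(data, "none", len(data))
    return Attachment(compressed, "zlib", len(data))


def restore(att: Attachment) -> bytes:
    """Receiver side: inflate before deserialization, then check the size."""
    if att.compression_type == "none":
        return att.payload
    data = zlib.decompress(att.payload)
    if len(data) != att.uncompressed_size:
        raise ValueError("uncompressed size mismatch")
    return data


plan = b"c1 VARCHAR(255), " * 1000  # repetitive payload, compresses well
sent = maybe_compress(plan)
assert restore(sent) == plan
```

Carrying the compression type and the uncompressed size alongside the payload lets the receiver decide whether to inflate at all and verify the result before handing the bytes to the deserializer.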

Mesut-Doner, Dec 07 '25 08:12

🧪 CI Insights

Here's what we observed from your CI run for 60c53849.

🟢 All jobs passed!

But CI Insights is watching 👀

mergify[bot], Dec 07 '25 08:12

Quality Gate failed

Failed conditions
14.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

sonarqubecloud[bot], Dec 07 '25 08:12

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot], Dec 07 '25 08:12

[BE Incremental Coverage Report]

:x: fail : 0 / 17 (00.00%)

file detail

path | covered_line | new_line | coverage | not_covered_line_detail
-- | -- | -- | -- | --
:large_blue_circle: src/service/internal_service.cpp | 0 | 17 | 00.00% | [101, 104, 105, 106, 107, 108, 113, 116, 117, 118, 119, 369, 370, 374, 484, 485, 486]

github-actions[bot], Dec 07 '25 08:12

@cursor review

alvin-celerdata, Dec 07 '25 18:12

@Mesut-Doner,

  1. The enable/disable switch should be a session variable rather than an fe.conf setting.
  2. The variable names can start with plan_fragment_compression.
  3. We already have TRANSMISSION_COMPRESSION_TYPE, so to keep naming consistent this one can be plan_fragment_compression_type.

alvin-celerdata, Dec 07 '25 18:12


Metric | No Compression | LZ4 | ZSTD | Improvement (LZ4) | Improvement (ZSTD)
-- | -- | -- | -- | -- | --
DeployCompressTime | N/A (0ms) | 0ms | 1ms | - | +1ms overhead
DeployWaitTime | 0ms | 1ms | 3ms | +1ms | +3ms
Total Deploy Time | 2ms | 4ms | 6ms | +2ms | +4ms
Compression Ratio | 1.0x | 4.9x | 8.2x | - | -


Based on your test results, this feature should be disabled by default.

kangkaisen, Dec 11 '25 05:12
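
One rough way to read the reported numbers is to ask at what network bandwidth the bytes saved would pay for the extra deploy time. The sketch below is a back-of-the-envelope estimate only. It assumes the added Total Deploy Time is entirely compression-related overhead and that transfer time scales linearly with payload size; the measurements come from a single run at millisecond resolution, so the result is indicative at best. The names are hypothetical.

```python
# Figures from the PR's comparison: payload size (bytes) and total deploy time (ms).
runs = {
    "none": {"bytes": 14_780, "deploy_ms": 2},
    "lz4": {"bytes": 3_014, "deploy_ms": 4},
    "zstd": {"bytes": 1_797, "deploy_ms": 6},
}


def break_even_mbit_per_s(algo: str) -> float:
    """Bandwidth below which the saved bytes outweigh the extra deploy time."""
    saved_bytes = runs["none"]["bytes"] - runs[algo]["bytes"]
    extra_seconds = (runs[algo]["deploy_ms"] - runs["none"]["deploy_ms"]) / 1000
    return round(saved_bytes * 8 / extra_seconds / 1_000_000, 1)


break_even = {algo: break_even_mbit_per_s(algo) for algo in ("lz4", "zstd")}
print(break_even)
```

Under these assumptions compression only breaks even on links slower than a few tens of Mbit/s for a plan of this size, which is consistent with keeping the feature off by default and with re-testing on much larger plans.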

@Mesut-Doner I would suggest testing a more extensive query, such as one involving a table with numerous partitions or tablets. This way, the query plan is more likely to exceed 100KB.

murphyatwork, Dec 11 '25 06:12