starrocks
[Enhancement] Optimize ingestion performance for table with materialized index STEP 1
What type of PR is this:
- [ ] bug
- [ ] feature
- [x] enhancement
- [ ] refactor
- [ ] others
Which issues does this PR fix:
Fixes #7717
Problem Summary (Required):
When a table has multiple materialized views, StarRocks sends the chunks to the tablets of every materialized view as well, which increases network traffic and adds unnecessary data serialization and transmission overhead. With colocate mv (#7451), a materialized view and its base table use the same partition column and are distributed on the same node, so we can reuse the chunk data of the base table.
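The idea can be illustrated with a minimal sketch. It is not the BE implementation: `Index`, `serialized_size`, `bytes_sent_per_index`, `bytes_sent_with_reuse`, and `project` are hypothetical names, the wire size is approximated by string lengths, and it assumes the sender previously shipped each index only the columns that index stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    """Hypothetical stand-in for the base table or one materialized index."""
    name: str
    columns: tuple  # column names stored by this index


def serialized_size(chunk, columns):
    # Rough proxy for wire size: total string length of the selected values.
    return sum(len(str(row[c])) for row in chunk for c in columns)


def bytes_sent_per_index(chunk, base, indexes):
    # Behaviour described in the summary: data is serialized and sent once
    # for the base table and once more for every materialized index.
    return sum(serialized_size(chunk, idx.columns) for idx in [base, *indexes])


def bytes_sent_with_reuse(chunk, base, indexes):
    # With colocated indexes only the base chunk crosses the network; the
    # receiving node derives each index's data locally (see project()).
    return serialized_size(chunk, base.columns)


def project(chunk, index):
    # Receiver side: pick the index's columns out of the base chunk.
    return [{c: row[c] for c in index.columns} for row in chunk]


chunk = [
    {"id": 1, "repo": "starrocks", "type": "PushEvent"},
    {"id": 2, "repo": "starrocks", "type": "ForkEvent"},
]
base = Index("base", ("id", "repo", "type"))
mv = Index("mv_repo_type", ("repo", "type"))

print(bytes_sent_per_index(chunk, base, [mv]))
print(bytes_sent_with_reuse(chunk, base, [mv]))
print(project(chunk, mv))
```

In this toy model the reused path sends the same number of bytes no matter how many indexes exist, which is consistent with the nearly flat throughput of the optimize branch in the benchmark.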
| materialized index num | main branch | optimize branch | SpeedUp |
|---|---|---|---|
| 0 | 1314MB/s | 1310MB/s | |
| 1 | 780MB/s | 1274MB/s | 1.6x |
| 3 | 420MB/s | 1241MB/s | 2.9x |
| 5 | 286MB/s | 1213MB/s | 4.2x |
Benchmark setup: github_events data, load parallelism of 16, and 3 BEs with 1.5GB/s network bandwidth. Each materialized view is created on two columns, so the materialized view computation is relatively small and the bottleneck is network bandwidth and serialization.
NOTE: We will enable this optimization after the colocate mv index is implemented.
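As a quick check of the SpeedUp column, the ratios can be recomputed from the throughput figures in the table, reading all values as MB/s. The helper `speedup` is a hypothetical name, and truncating (rather than rounding) to one decimal place is an assumption that happens to match the reported values.

```python
import math

# Throughput in MB/s taken from the table:
# materialized index num -> (main branch, optimize branch)
throughput = {
    1: (780, 1274),
    3: (420, 1241),
    5: (286, 1213),
}


def speedup(main, optimized):
    # Ratio of optimized to main throughput, truncated to one decimal place.
    return math.floor(optimized / main * 10) / 10


for num_indexes, (main, optimized) in throughput.items():
    print(f"{num_indexes} indexes: {speedup(main, optimized)}x")
```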
[FE PR Coverage check]
:heart_eyes: pass : 0 / 0 (0%)
run starrocks_be_unittest
run starrocks_fe_unittest
@mergify rebase
rebase
✅ Branch has been successfully rebased
run starrocks_fe_unittest
@mergify rebase
rebase
✅ Branch has been successfully rebased
LGTM for the LoadChannel and TabletsChannel parts.
run starrocks_admit_test
@mergify rebase
rebase
✅ Branch has been successfully rebased
run starrocks_admit_test
run starrocks_fe_unittest