[Enhancement](schema_scanner) Optimize the performance of reading information schema tables
Proposed changes
Issue Number: close #xxx
Problem summary
- Fill the block in batches
- Batch the RPC calls to FE that fetch table descriptions
With about 340,000 columns (34w):
SELECT COUNT(*) FROM information_schema.columns;
Time: 10.3s --> 0.4s
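The batching idea in the summary can be sketched as follows. This is a minimal illustration and not the code in this PR: `FakeFrontend`, `describe_tables`, and `scan_batched` are hypothetical names, and the "RPC" is an in-process call that only counts round trips.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the FE service; it only counts round trips.
struct FakeFrontend {
    std::size_t rpc_calls = 0;

    // One round trip returns a description for every table in the request.
    std::vector<std::string> describe_tables(const std::vector<std::string>& tables) {
        ++rpc_calls;
        std::vector<std::string> descs;
        descs.reserve(tables.size());
        for (const auto& t : tables) {
            descs.push_back("desc(" + t + ")");
        }
        return descs;
    }
};

// Fetch table descriptions with at most batch_size tables per RPC and
// append each response to the output block in one bulk insert.
std::vector<std::string> scan_batched(FakeFrontend& fe,
                                      const std::vector<std::string>& tables,
                                      std::size_t batch_size) {
    assert(batch_size > 0);
    std::vector<std::string> block;
    block.reserve(tables.size());
    for (std::size_t begin = 0; begin < tables.size(); begin += batch_size) {
        const std::size_t end = std::min(begin + batch_size, tables.size());
        const auto first = tables.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last = tables.begin() + static_cast<std::ptrdiff_t>(end);
        const std::vector<std::string> chunk(first, last);
        const std::vector<std::string> descs = fe.describe_tables(chunk);
        block.insert(block.end(), descs.begin(), descs.end());
    }
    return block;
}
```

Calling `scan_batched` with `batch_size` set to 1 reproduces the one-RPC-per-table pattern, so the same function can be used to compare round-trip counts between the two approaches.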
Checklist(Required)
- [ ] Does it affect the original behavior
- [ ] Has unit tests been added
- [ ] Has document been added or modified
- [ ] Does it need to update dependencies
- [ ] Does this PR support rollback (If NO, please explain WHY)
Further comments
If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...
run buildall
TeamCity pipeline, clickbench performance test result:
- the sum of best hot time: 33.63 seconds
- stream load tsv: 470 seconds loaded 74807831229 Bytes, about 151 MB/s
- stream load json: 37 seconds loaded 2358488459 Bytes, about 60 MB/s
- stream load orc: 67 seconds loaded 1101869774 Bytes, about 15 MB/s
- stream load parquet: 28 seconds loaded 861443392 Bytes, about 29 MB/s

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230304174752_clickbench_pr_108538.html
./run p0
./run buildall
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
run p0