feat: improve performance of count_data

Open GehaFearless opened this issue 3 years ago • 1 comments

What problem does this PR solve?

issue: https://github.com/apache/incubator-pegasus/issues/1090
same job as before: https://github.com/apache/incubator-pegasus/pull/728

Precisely counting the data in a large table can take minutes or hours. However, returning the key-values from the server to the client is unnecessary for this.

What is changed and how does it work?

We only need the count of the data, so the server only has to transfer the count to the client, not the detailed data. To use this, input "count_data -c -o" in pegasus_shell. In my test on onebox, it is 2x faster than before.
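To illustrate the idea, here is a toy sketch in Python. It is not Pegasus code: `FakePartition`, `scan_rows`, `scan_count_only`, and the 8-byte response size are all hypothetical names and assumptions made up for this example. It only models why returning a count instead of the rows reduces what goes over the wire.

```python
# Toy model only: FakePartition and its methods are hypothetical, not Pegasus APIs.
from typing import Dict, List, Tuple


class FakePartition:
    """Stands in for one server-side partition holding key-value pairs."""

    def __init__(self, rows: Dict[bytes, bytes]) -> None:
        self._rows = rows

    def scan_rows(self) -> List[Tuple[bytes, bytes]]:
        # Old behaviour: ship every key-value pair back to the client.
        return list(self._rows.items())

    def scan_count_only(self) -> int:
        # New behaviour: count on the server and ship back a single integer.
        return len(self._rows)


def count_by_fetching(partitions: List[FakePartition]) -> Tuple[int, int]:
    """Count rows on the client; returns (row_count, payload_bytes_transferred)."""
    count = 0
    transferred = 0
    for partition in partitions:
        for key, value in partition.scan_rows():
            count += 1
            transferred += len(key) + len(value)
    return count, transferred


def count_on_server(partitions: List[FakePartition]) -> Tuple[int, int]:
    """Sum per-partition counts; returns (row_count, payload_bytes_transferred)."""
    count = 0
    transferred = 0
    for partition in partitions:
        count += partition.scan_count_only()
        transferred += 8  # assumption: one 8-byte integer per partition response
    return count, transferred
```

Both functions return the same row count, but the second one transfers a fixed number of bytes per partition instead of an amount proportional to the data size. The real scan is batched, so the actual saving depends on batch size and value length.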

Tests
  • Unit test
  • Manual test (add detailed scripts or steps below)
Related changes
  • Need to update the documentation
  • Need to be included in the release note

GehaFearless avatar Aug 01 '22 08:08 GehaFearless

Could you give some performance comparison with the old version?

acelyc111 avatar Aug 12 '22 07:08 acelyc111