[C++][Parquet] Thrift: generate template method to accelerate reading thrift

Open mapleFU opened this issue 1 year ago • 1 comments

Describe the enhancement requested

The Thrift C++ code generator can emit templated read/write methods for the generated Thrift structs.

Pro: this avoids many virtual function calls during deserialization. Con: more generated methods.
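
To illustrate the trade-off, here is a minimal sketch of virtual versus templated reads. All names in it (`VirtualProtocol`, `BufferProtocol`, `PageLocation`, `readVirtual`, `readTemplated`) are hypothetical stand-ins, not Thrift's real API or its generated code. The point is only that a templated reader sees the concrete protocol type, so the compiler can resolve each field read statically instead of through a vtable.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a protocol base class with virtual read methods.
struct VirtualProtocol {
  virtual ~VirtualProtocol() = default;
  virtual std::uint32_t readI32(std::int32_t& out) = 0;
};

// Toy in-memory protocol: hands out int32 values from a buffer, in order.
class BufferProtocol final : public VirtualProtocol {
 public:
  explicit BufferProtocol(std::vector<std::int32_t> values)
      : values_(std::move(values)) {}

  std::uint32_t readI32(std::int32_t& out) override {
    out = values_.at(pos_++);  // throws std::out_of_range if exhausted
    return sizeof(std::int32_t);
  }

 private:
  std::vector<std::int32_t> values_;
  std::size_t pos_ = 0;
};

// Hypothetical "generated" struct with both flavours of reader.
struct PageLocation {
  std::int32_t offset = 0;
  std::int32_t size = 0;

  // Non-templated: the static type is the base class, so each field read
  // is a virtual call unless the optimizer manages to devirtualize it.
  std::uint32_t readVirtual(VirtualProtocol* prot) {
    std::uint32_t n = prot->readI32(offset);
    n += prot->readI32(size);
    return n;
  }

  // Templated: instantiated once per concrete protocol type. With a final
  // class such as BufferProtocol the calls can be bound statically and
  // inlined, at the cost of one extra instantiation per protocol type.
  template <class Protocol>
  std::uint32_t readTemplated(Protocol* prot) {
    std::uint32_t n = prot->readI32(offset);
    n += prot->readI32(size);
    return n;
  }
};
```

Both readers fill in the same fields and return the same byte count; only the dispatch mechanism differs. The extra instantiations are the "more generated methods" cost mentioned above.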

Component(s)

C++, Parquet

mapleFU, May 17 '24 06:05

After:

BM_ReadOffsetIndex/num_pages:8                          669 ns          657 ns      1065579 bytes_per_second=135.025M/s items_per_second=12.1793M/s
BM_ReadOffsetIndex/num_pages:64                        2898 ns         2821 ns       248112 bytes_per_second=258.966M/s items_per_second=22.6879M/s
BM_ReadOffsetIndex/num_pages:512                      19916 ns        19726 ns        35852 bytes_per_second=316.668M/s items_per_second=25.9557M/s
BM_ReadOffsetIndex/num_pages:1024                     38858 ns        38746 ns        17122 bytes_per_second=325.143M/s items_per_second=26.4285M/s
BM_ReadColumnIndex<Int64Type>/num_pages:8              1053 ns         1035 ns       682301 bytes_per_second=157.502M/s items_per_second=7.72643M/s
BM_ReadColumnIndex<Int64Type>/num_pages:64             3746 ns         3729 ns       185681 bytes_per_second=331.2M/s items_per_second=17.1633M/s
BM_ReadColumnIndex<Int64Type>/num_pages:512           25504 ns        23972 ns        29741 bytes_per_second=408.125M/s items_per_second=21.3579M/s
BM_ReadColumnIndex<Int64Type>/num_pages:1024          46253 ns        46073 ns        15157 bytes_per_second=424.313M/s items_per_second=22.2256M/s
BM_ReadColumnIndex<DoubleType>/num_pages:8             1048 ns         1037 ns       676113 bytes_per_second=157.265M/s items_per_second=7.71483M/s
BM_ReadColumnIndex<DoubleType>/num_pages:64            4462 ns         3900 ns       177688 bytes_per_second=316.707M/s items_per_second=16.4122M/s
BM_ReadColumnIndex<DoubleType>/num_pages:512          24274 ns        23614 ns        29629 bytes_per_second=414.318M/s items_per_second=21.682M/s
BM_ReadColumnIndex<DoubleType>/num_pages:1024         46635 ns        46293 ns        15047 bytes_per_second=422.297M/s items_per_second=22.12M/s
BM_ReadColumnIndex<FLBAType>/num_pages:8               1060 ns         1055 ns       658359 bytes_per_second=154.538M/s items_per_second=7.58105M/s
BM_ReadColumnIndex<FLBAType>/num_pages:64              4200 ns         3860 ns       182470 bytes_per_second=319.921M/s items_per_second=16.5788M/s
BM_ReadColumnIndex<FLBAType>/num_pages:512            23811 ns        23545 ns        28779 bytes_per_second=415.526M/s items_per_second=21.7452M/s
BM_ReadColumnIndex<FLBAType>/num_pages:1024           45753 ns        45554 ns        15271 bytes_per_second=429.147M/s items_per_second=22.4788M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:8           983 ns          976 ns       708366 bytes_per_second=167.154M/s items_per_second=8.19994M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:64         3389 ns         3299 ns       214331 bytes_per_second=374.309M/s items_per_second=19.3972M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:512       25683 ns        21360 ns        33753 bytes_per_second=458.051M/s items_per_second=23.9706M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:1024      40311 ns        39316 ns        17610 bytes_per_second=497.237M/s items_per_second=26.0454M/s

Before:

BM_ReadOffsetIndex/num_pages:8                          980 ns          836 ns       820749 bytes_per_second=106.102M/s items_per_second=9.57045M/s
BM_ReadOffsetIndex/num_pages:64                        3735 ns         3546 ns       198467 bytes_per_second=206.025M/s items_per_second=18.0497M/s
BM_ReadOffsetIndex/num_pages:512                      31427 ns        26145 ns        28486 bytes_per_second=238.919M/s items_per_second=19.583M/s
BM_ReadOffsetIndex/num_pages:1024                     48456 ns        47966 ns        14038 bytes_per_second=262.643M/s items_per_second=21.3483M/s
BM_ReadColumnIndex<Int64Type>/num_pages:8              1224 ns         1173 ns       625894 bytes_per_second=139.003M/s items_per_second=6.81895M/s
BM_ReadColumnIndex<Int64Type>/num_pages:64             3920 ns         3892 ns       176412 bytes_per_second=317.285M/s items_per_second=16.4422M/s
BM_ReadColumnIndex<Int64Type>/num_pages:512           25308 ns        24824 ns        28486 bytes_per_second=394.119M/s items_per_second=20.6249M/s
BM_ReadColumnIndex<Int64Type>/num_pages:1024          49556 ns        47995 ns        14693 bytes_per_second=407.321M/s items_per_second=21.3356M/s
BM_ReadColumnIndex<DoubleType>/num_pages:8             1234 ns         1160 ns       650715 bytes_per_second=140.642M/s items_per_second=6.89934M/s
BM_ReadColumnIndex<DoubleType>/num_pages:64            4716 ns         4138 ns       178391 bytes_per_second=298.467M/s items_per_second=15.467M/s
BM_ReadColumnIndex<DoubleType>/num_pages:512          28596 ns        25773 ns        25998 bytes_per_second=379.605M/s items_per_second=19.8654M/s
BM_ReadColumnIndex<DoubleType>/num_pages:1024         48784 ns        47841 ns        14408 bytes_per_second=408.636M/s items_per_second=21.4045M/s
BM_ReadColumnIndex<FLBAType>/num_pages:8               1169 ns         1156 ns       598352 bytes_per_second=141.084M/s items_per_second=6.92104M/s
BM_ReadColumnIndex<FLBAType>/num_pages:64              4052 ns         3924 ns       176630 bytes_per_second=314.697M/s items_per_second=16.3081M/s
BM_ReadColumnIndex<FLBAType>/num_pages:512            27219 ns        25592 ns        29032 bytes_per_second=382.301M/s items_per_second=20.0064M/s
BM_ReadColumnIndex<FLBAType>/num_pages:1024           46800 ns        46665 ns        14965 bytes_per_second=418.933M/s items_per_second=21.9438M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:8          1084 ns         1074 ns       642573 bytes_per_second=151.872M/s items_per_second=7.45026M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:64         3464 ns         3420 ns       201861 bytes_per_second=361.132M/s items_per_second=18.7144M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:512       21252 ns        20953 ns        33105 bytes_per_second=466.948M/s items_per_second=24.4362M/s
BM_ReadColumnIndex<ByteArrayType>/num_pages:1024      41540 ns        41089 ns        17490 bytes_per_second=475.776M/s items_per_second=24.9213M/s

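BM_ReadOffsetIndex is about 20% faster by CPU time (19-25% across page counts). The BM_ReadColumnIndex gains are smaller, mostly in the 2-12% range, and ByteArrayType at num_pages:512 is slightly slower.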

mapleFU, May 17 '24 06:05

Issue resolved by pull request 41703 https://github.com/apache/arrow/pull/41703

pitrou, May 22 '24 17:05