mwish

Results 249 comments of mwish

I guess this might benefit Parquet dictionary encoding, which is widely used in Parquet. If nobody moves this forward, I'll give it a try next month.

I don't know why the R CI build failed; some help is needed...

@cyb70289 I may be asking a stupid question here:
```
/arrow/cpp/src/arrow/util/byte_stream_split_internal.h:161:69: error: no matching function for call to 'xsimd::batch::batch(xsimd::batch)'
  161 | static_cast(stage[2][i + 4])));
```
On arm64 this error is raised,...

I believe I'm hitting this problem: https://github.com/apache/arrow/pull/40335#issuecomment-1982724398 because of: https://github.com/xtensor-stack/xsimd/issues/735 Should I disable neon64 first? Or should I upgrade xsimd first? Or can I use some other workaround? @pitrou @cyb70289

(I also found a fix: https://github.com/xtensor-stack/xsimd/commit/836b4c359edbb34e4c4448cccd9bb4fee5e34c89 , but it has not been released yet.) And zip_lo for neon64 was merged here: https://github.com/xtensor-stack/xsimd/commit/ead07427834c82aac105d36b8671abbe915c441c I'll disable neon64 first.

@pitrou What do you think of the problems here: https://github.com/apache/arrow/pull/40335#issuecomment-1983644942 ? I found that 12.1.1 contains some bugs here...

After (on my AMD 3800X, compiled with gcc 11.4; WSL and CLion don't work well with lldb, I'll upgrade it later):
```
BM_ByteStreamSplitDecode_Float_Sse2/1024 268 ns 268 ns 2597941 bytes_per_second=14.2391Gi/s
...
```

MacOS M1 Pro, compiler using LLVM-17
```
BM_ByteStreamSplitDecode_Float_Neon/1024 393 ns 393 ns 1781393 bytes_per_second=9.7103G/s
BM_ByteStreamSplitDecode_Float_Neon/4096 1523 ns 1522 ns 459550 bytes_per_second=10.0244G/s
BM_ByteStreamSplitDecode_Float_Neon/32768 13254 ns 13251 ns 52771 bytes_per_second=9.21235G/s
BM_ByteStreamSplitDecode_Float_Neon/65536 26862...
```

> Can you also post the _Scalar numbers for comparison?

Done

My AMD 3800X Scalar code benchmark:
```
BM_ByteStreamSplitDecode_Float_Scalar/1024 1321 ns 1321 ns 655835 bytes_per_second=2.88745Gi/s
BM_ByteStreamSplitDecode_Float_Scalar/4096 4252 ns 4252 ns 163571 bytes_per_second=3.58879Gi/s
BM_ByteStreamSplitDecode_Float_Scalar/32768 40957 ns 40957 ns 16015 bytes_per_second=2.98046Gi/s
BM_ByteStreamSplitDecode_Float_Scalar/65536 92735...
```