Antoine Pitrou comments

Results: 823 comments of Antoine Pitrou

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

@mapleFU Can you update to the latest git main?

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

Here are the benchmark results (Ubuntu 22.04, AMD Zen 2 CPU):
```
Non-regressions: (24)
benchmark                                                                            baseline       contender      change %  counters
CompressionInputNonZeroCopyBenchmarkDirectRead/InputBytes:65536/PerReadBytes:65536   1.102 GiB/sec  1.186 GiB/sec  7.628     {'family_index':...
```

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

Slightly faster (see the percentages).

GH-20213: [C++] Implement cast to/from halffloat

For the Docker-based builds (i.e. all Linux builds), you can take a look at https://arrow.apache.org/docs/developers/continuous_integration/docker.html

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

@ursabot please benchmark lang=R,Python

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

@ursabot please benchmark lang=Python

GH-39377: [C++] IO: Reuse same buffer in CompressedInputStream

@ursabot please benchmark lang=R

GH-40415: [C++] Extract the primitive operations implementing Take and make them instantiable in multiple scenarios

Ran the benchmarks locally (gcc 12.3.0, AMD Ryzen 9 3900X). There are a number of improved benchmarks and also a couple of sizable regressions:
```
Non-regressions: (368)
benchmark...
```

GH-40415: [C++] Extract the primitive operations implementing Take and make them instantiable in multiple scenarios

Ok, there's a lot of C++ to decipher here but I'd rather see an explanation of how this is different from what already exists. Apart from calling the primitive "gather"...

GH-40415: [C++] Extract the primitive operations implementing Take and make them instantiable in multiple scenarios

My intuition is that this is primarily avoiding the cost of `bit_util::SetBitsTo` for the no-nulls case, which is good, but does not require a wholesale refactor.
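
To make the no-nulls point concrete, here is a minimal sketch of the general idea, not Arrow's actual Take code. `SimpleArray`, `TakeSketch`, `GetBit` and `SetBit` are hypothetical names invented for illustration, and an empty validity bitmap is assumed to mean "no nulls". When the source has no nulls, the gather skips all validity-bitmap work instead of filling an output bitmap with set bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an array: values plus an optional validity
// bitmap. An empty bitmap means "no nulls". Not Arrow's real data structures.
struct SimpleArray {
  std::vector<int32_t> values;
  std::vector<uint8_t> validity;  // one bit per value, least-significant bit first
};

inline bool GetBit(const std::vector<uint8_t>& bits, int64_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

inline void SetBit(std::vector<uint8_t>& bits, int64_t i) {
  bits[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
}

// Gather src.values[indices[i]] into a new array.
SimpleArray TakeSketch(const SimpleArray& src,
                       const std::vector<int64_t>& indices) {
  SimpleArray out;
  out.values.reserve(indices.size());
  for (int64_t idx : indices) {
    assert(idx >= 0 && idx < static_cast<int64_t>(src.values.size()));
    out.values.push_back(src.values[idx]);
  }

  if (src.validity.empty()) {
    // Fast path: the source has no nulls, so the output has none either.
    // No output bitmap is allocated or filled.
    return out;
  }

  // Slow path: start from an all-null bitmap and set the bit for every
  // gathered value that is valid in the source.
  out.validity.assign((indices.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    if (GetBit(src.validity, indices[i])) {
      SetBit(out.validity, static_cast<int64_t>(i));
    }
  }
  return out;
}
```

The fast path is a single branch taken once per call, which is the sense in which the saving would not need a wholesale refactor.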