Yunsong Wang

178 comments by Yunsong Wang

> INT96 has been long deprecated, so I don't see much use in trying to get statistics to work with them. In fact, the parquet thrift file specifically [says](https://github.com/apache/parquet-format/blob/e91ab5e892391bd179d3c16b7c1cf4fbaeebbfe7/src/main/thrift/parquet.thrift#L972) the...

Yeah, using `std::optional` to properly handle optional field reading/writing was on my TODO list, but then I got distracted by other tasks.
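For illustration, here is a minimal sketch of what `std::optional`-based handling of optional fields could look like. The struct, the field ids, and the toy reader/writer are all hypothetical and are not taken from the actual Parquet reader/writer code; the point is only that an unset field is skipped on write and stays `std::nullopt` on read, so "absent" is distinguishable from a real value such as `0`.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical metadata struct; the field names are illustrative only.
struct column_stats {
  std::optional<std::int64_t> null_count;
  std::optional<std::int64_t> distinct_count;
};

// Toy serialized form: a (field id, value) pair per field that is present.
struct encoded_field {
  int id;
  std::int64_t value;
};

// Writer: emits a field only when the optional actually holds a value.
std::vector<encoded_field> write_stats(column_stats const& stats)
{
  std::vector<encoded_field> out;
  if (stats.null_count.has_value()) { out.push_back({1, *stats.null_count}); }
  if (stats.distinct_count.has_value()) { out.push_back({2, *stats.distinct_count}); }
  return out;
}

// Reader: fields missing from the input remain std::nullopt.
column_stats read_stats(std::vector<encoded_field> const& fields)
{
  column_stats stats;
  for (auto const& f : fields) {
    if (f.id == 1) {
      stats.null_count = f.value;
    } else if (f.id == 2) {
      stats.distinct_count = f.value;
    }
  }
  return stats;
}
```

With this shape, a `null_count` of `0` survives a write/read round trip as a present value, while an unset `distinct_count` is never written and reads back as empty.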

_**Question to reviewers**: is it worth doing runtime dispatching based on whether nested types are involved or not?_ Benchmark results are shown below, TLDR: - The hash table size is...
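As a rough sketch of the kind of runtime dispatch being asked about: a single runtime check on the input schema selects one of two compile-time specializations, so the flat path does not pay for nested-type handling. Everything below (`type_id`, `has_nested`, `build_impl`, `build`) is a hypothetical stand-in and not the code under review.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified type tags for the columns of a table.
enum class type_id { INT32, FLOAT64, STRING, LIST, STRUCT };

// Returns true if any column is a nested type (LIST or STRUCT).
bool has_nested(std::vector<type_id> const& schema)
{
  return std::any_of(schema.begin(), schema.end(), [](type_id t) {
    return t == type_id::LIST || t == type_id::STRUCT;
  });
}

// Compile-time specialized implementation. The branch not taken is
// discarded at compile time, so the flat path carries no nested logic.
template <bool HasNested>
std::string build_impl()
{
  if constexpr (HasNested) {
    return "nested-aware path";
  } else {
    return "flat fast path";
  }
}

// Runtime dispatch: one check up front picks the specialization.
std::string build(std::vector<type_id> const& schema)
{
  return has_nested(schema) ? build_impl<true>() : build_impl<false>();
}
```

The trade-off is that both specializations get instantiated, which increases compile time and binary size in exchange for a cheaper flat path at run time.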

@vyasr Thanks for the review. Have you reviewed both cpp and python, or do I need another cpp/python approval to merge the PR?

@vyasr @shwina I've updated the doc as suggested and marked this work as a breaking change. Thanks for your comments.

Notes:

Users cannot specify the function name with `CUDF_FUNC_RANGE`:

![Screenshot from 2024-02-13 11-51-04](https://github.com/rapidsai/cudf/assets/12716979/ee70398d-c952-42b7-8082-0826cec2e9e5)

Custom ranges can be created via `cudf::thread_range`, e.g. for `hash_join::inner_join`:

![Screenshot from 2024-02-13 11-49-56](https://github.com/rapidsai/cudf/assets/12716979/e7da897d-2322-4b17-b905-b2fddbfad6a4)

> These type check functions are now employed throughout the code base instead of raw checks like `a.type() == b.type()`

Maybe out of the scope of the current PR but...

We also need to make sure the dependent headers of `data_structure.cuh/hpp` (the example shown below) can work with non-CUDA compilers, i.e. separating declarations from implementations and guarding all implementations with...
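A minimal sketch of the declaration/implementation split is shown here. The comment above is cut off before it names the guard, so the use of the compiler-defined `__CUDACC__` macro is an assumption, as are the class members. In practice the two parts would live in separate `.hpp` and `.cuh` files; they are collapsed into one listing only to keep the example short.

```cpp
#include <cstddef>

// --- Declaration part (would live in the .hpp): plain C++ only, so a ---
// --- host-only compiler such as g++ or clang++ can parse it.         ---
class data_structure {  // hypothetical stand-in for the real class
 public:
  explicit data_structure(std::size_t capacity) : capacity_{capacity} {}

  std::size_t capacity() const noexcept { return capacity_; }

#if defined(__CUDACC__)  // assumed guard: only nvcc sees the device API
  template <typename Key>
  __device__ bool insert(Key const& key);
#endif

 private:
  std::size_t capacity_;
};

// --- Implementation part (would live in the .cuh): guarded so that ---
// --- non-CUDA compilers never see device code.                     ---
#if defined(__CUDACC__)
template <typename Key>
__device__ bool data_structure::insert(Key const&)
{
  return true;  // placeholder body; real device logic would go here
}
#endif
```

Compiled with a host-only compiler, the guarded sections vanish and only the host-visible interface remains, which is what allows the header to be included from ordinary `.cpp` translation units.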