Vyas Ramasubramani
It does indeed look like simply commenting out the `list_view` specialization of the code makes the struct code path faster. The overhead of constructing the Dremel data is negligible, I...
> > It does indeed look like simply commenting out the `list_view` specialization of the code makes the struct code path faster. The overhead of constructing the Dremel data is...
I spent some time today investigating different versions of the compilation and became very suspicious of how my benchmarks could possibly have shown that primitive types were unaffected when every...
To minimize the amount of friction between this PR, ongoing Parquet development, and the desire to refactor Dremel encoding code, I have separated out the Dremel changes into #11461. We...
The comparison results are in [this earlier comment](https://github.com/rapidsai/cudf/pull/11129#issuecomment-1198791553). I always put benchmark results into dropdowns that look like `<details><summary>Benchmarks</summary>Benchmarks go here</details>` because I find GH PR conversations filled with...
So I've tried giving the compiler more explicit hints that the list and struct comparator code paths will never call into each other (e.g. by separating out the primitive comparison...
I'm afraid that I wasn't quite able to get through this today. There are two main outstanding tasks here: 1. Split the `device_row_comparator` into two implementations, one that supports nesting...
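For readers following along, here is a rough host-only sketch of what splitting a comparator into a flat and a nesting-aware implementation could look like. It is not the code in this PR: `row_comparator_sketch`, `is_nested_sketch`, and `weak_ordering_sketch` are hypothetical names, and `std::vector` merely stands in for a nested column type. The idea it illustrates is that a compile-time flag plus `if constexpr` keeps the recursive path out of the flat instantiation entirely.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical three-way result type (not a cudf type).
enum class weak_ordering_sketch { LESS, EQUIVALENT, GREATER };

// Hypothetical trait: std::vector stands in for "nested" element types.
template <typename T>
struct is_nested_sketch : std::false_type {};
template <typename T>
struct is_nested_sketch<std::vector<T>> : std::true_type {};

// One template, two instantiations: <false> only ever contains the primitive
// path, <true> additionally contains the recursive lexicographic path.
template <bool has_nested_columns>
struct row_comparator_sketch {
  template <typename T>
  weak_ordering_sketch operator()(T const& lhs, T const& rhs) const
  {
    if constexpr (is_nested_sketch<T>::value) {
      // Using the flat comparator on nested data is a compile-time error.
      static_assert(has_nested_columns, "flat comparator used on a nested type");
      std::size_t const n = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
      for (std::size_t i = 0; i < n; ++i) {
        auto const result = (*this)(lhs[i], rhs[i]);
        if (result != weak_ordering_sketch::EQUIVALENT) { return result; }
      }
      // All shared elements compare equal: the shorter sequence sorts first.
      if (lhs.size() < rhs.size()) { return weak_ordering_sketch::LESS; }
      if (rhs.size() < lhs.size()) { return weak_ordering_sketch::GREATER; }
      return weak_ordering_sketch::EQUIVALENT;
    } else {
      // Primitive path: no recursion, no nested-type machinery.
      if (lhs < rhs) { return weak_ordering_sketch::LESS; }
      if (rhs < lhs) { return weak_ordering_sketch::GREATER; }
      return weak_ordering_sketch::EQUIVALENT;
    }
  }
};
```

A caller would pick `row_comparator_sketch<false>` when it knows the table has only primitive columns and `row_comparator_sketch<true>` otherwise. Whether this kind of split recovers the performance in the real device code is exactly what the benchmarks need to show.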
I plan to pick this back up this week.
OK, I've managed to resolve the two major concerns above. Here are the new benchmarks for structs: Benchmarks ``` ## [0] Tesla T4 | NumRows | Depth | Nulls |...
> As per my argument above, if this PR changes the performance of the existing APIs (like binary search and sorting) then I would suggest not to merge this until...