Gregory Kimball comments

146 comments by Gregory Kimball

[FEA] Get Series.list offsets / Construct Series of lists from offsets and values

Hello @shwina and @bdice, [bucketize](https://github.com/pytorch/torcharrow/blob/main/csrc/velox/functions/rec/bucketize.h) is a feature that we might unlock if we could construct a list column from offsets and values. Bucketize is performed on leaves and uses...
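For context on the request, an Arrow-style list column stores a flat values buffer plus an offsets buffer, where row `i` spans `values[offsets[i]:offsets[i + 1]]`. The following is a minimal pure-Python sketch of that relationship, not a cuDF API: the helper names `lists_from_offsets` and `offsets_from_lists` are hypothetical, and the sketch assumes offsets start at 0 and that there are no null rows.

```python
def lists_from_offsets(offsets, values):
    """Rebuild nested rows from an offsets buffer and a flat values buffer.

    Assumes offsets are non-decreasing, start at 0, and end at len(values).
    """
    if not offsets or offsets[0] != 0 or offsets[-1] != len(values):
        raise ValueError("offsets must start at 0 and end at len(values)")
    # Row i is the slice between consecutive offsets.
    return [values[start:stop] for start, stop in zip(offsets, offsets[1:])]


def offsets_from_lists(rows):
    """Flatten nested rows into (offsets, values)."""
    offsets = [0]
    values = []
    for row in rows:
        values.extend(row)
        offsets.append(len(values))  # running end position of each row
    return offsets, values
```

Calling `lists_from_offsets([0, 2, 2, 5], [1, 2, 3, 4, 5])` yields three rows, the second of which is empty because its two offsets are equal.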

[FEA] Get Series.list offsets / Construct Series of lists from offsets and values

To my surprise the `explode` trick from #10967 works here as well:

```
def bucketize(a, buckets):
    a_x = a.explode()
    b = a_x * 0
    for k in buckets:
        b +=...
```
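The general shape of this approach (flatten the lists to leaves, bucketize the leaves, then regroup) can be sketched in pure Python. This is not the truncated loop from the comment: it swaps in the standard-library `bisect_left`, and it assumes each value maps to the number of borders strictly less than it, which may differ from the torcharrow behavior at bucket boundaries. The name `bucketize_lists` is hypothetical.

```python
from bisect import bisect_left


def bucketize_lists(rows, borders):
    """Bucketize every leaf value of a list-of-lists, preserving row shape.

    Assumption: the bucket id is the count of borders strictly less than
    the value (bisect_left on the sorted borders).
    """
    borders = sorted(borders)

    # "Explode": flatten the rows while recording where each row ends.
    offsets = [0]
    leaves = []
    for row in rows:
        leaves.extend(row)
        offsets.append(len(leaves))

    # Bucketize the flat leaves.
    bucket_ids = [bisect_left(borders, value) for value in leaves]

    # Regroup the flat bucket ids using the recorded offsets.
    return [bucket_ids[start:stop] for start, stop in zip(offsets, offsets[1:])]
```

The regrouping step is exactly where a "construct lists from offsets and values" constructor would be needed on the GPU side.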

ppc64le - undefined symbol: _ZN4cudf2io6detail12cufile_input10read_asyncEmmPhN3rmm16cuda_stream_viewE

@rnukala1 is this issue still relevant?

[FEA] Support drop_duplicates on Series containing list objects

With the addition of list column support for `distinct` in libcudf (#10641), this issue just needs Python bindings.

[FEA] Support drop_duplicates on Series containing list objects

In 22.06 `drop_duplicates` uses a sort-based algorithm and relies on the lexicographic comparator. We expect this will be closed by #11129.

```
import cudf
df = cudf.DataFrame({"a": [[1, 3, 5,...
```
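To illustrate why a sort-based `drop_duplicates` depends on a lexicographic comparator for list rows, here is a minimal pure-Python sketch. It is not the libcudf implementation; it relies on Python's built-in lexicographic ordering of lists, ignores nulls, and uses the hypothetical name `drop_duplicate_lists`.

```python
def drop_duplicate_lists(rows):
    """Keep the first occurrence of each distinct list row.

    Sort-based: order row indices lexicographically by row content, then
    keep the first index in every run of equal neighbours.
    """
    # sorted() is stable, so equal rows stay in original index order.
    order = sorted(range(len(rows)), key=lambda i: rows[i])

    keep = []
    prev = None
    for i in order:
        if prev is None or rows[i] != rows[prev]:
            keep.append(i)  # first row of a new run of equal rows
        prev = i

    # Restore the original row order for the surviving rows.
    return [rows[i] for i in sorted(keep)]
```

Without an ordering defined on whole list rows, the sort step has no key to group equal rows together, which is the gap the comparator fills.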

[FEA] Enable Page-level filtering based on the ColumnIndex feature from parquet 1.11

> API where we can send compressed pages and metadata to CUDF for decoding. The metadata would include things like the file and row group that the pages came from...

[BUG] Some custom dask aggregations fail with dask_cudf dataframes

Would this be possible with `apply` instead of `agg`? Is there an extension of #11452 that could accept some custom aggregations?

Add full 24-bit dictionary support to Parquet writer

Thanks @etseidl for suggesting this change. Please excuse the delay; we will take another look for the 22.10 release.

[BUG] Truncated dataframe when reading parquet from s3fs

Please feel free to re-open if the issue is not solved. Thank you @rjzamora for your contribution.

[FEA] The `--profile` flag should disable CUPTI metrics

Thank you @PointKernel and @davidwendt for investigating the missing/inconsistent GPU metrics data with nvbench. If it's not the CUPTI dependency, what is the root cause of the problem? Lowering the...