Lawrence Mitchell

Results 226 comments of Lawrence Mitchell

As mentioned on the other PR, I think this is great, thanks for doing it! I just had one bikeshed colour policy question: ``` - Regex: '^

In current 23.02 the segmentation faults are gone (good!). But it continues to be the case that none of these examples completes successfully. It appears that `read_parquet` and `to_parquet` don't...

@GregoryKimball I think this probably needs some handling in libcudf, although I _guess_ we could special-case empty dataframes in python, it feels hacky.

An update, with 24.02:

- Case 1: segfaults
- Case 2: `KeyError: '__index_level_0__'`
- Case 3: Works

Could you try with this patch?

```diff
diff --git a/python/cudf/cudf/_lib/utils.pyx b/python/cudf/cudf/_lib/utils.pyx
index b6637e9df0..aac2fb1de0 100644
--- a/python/cudf/cudf/_lib/utils.pyx
+++ b/python/cudf/cudf/_lib/utils.pyx
@@ -59,7 +59,7 @@ cpdef generate_pandas_metadata(table, index):
     types = []
     index_levels =...
```

To fix the segfault in read with invalid metadata, can you please try:

```diff
diff --git a/python/cudf/cudf/_lib/parquet.pyx b/python/cudf/cudf/_lib/parquet.pyx
index d3f5b42337..0fdf2cc287 100644
--- a/python/cudf/cudf/_lib/parquet.pyx
+++ b/python/cudf/cudf/_lib/parquet.pyx
@@ -297,6 +297,8 @@ cpdef...
```

The first benchmark there doesn't appear to be valgrind-clean which may give a hint:

```
==45341== Warning: set address range perms: large range [0x300200000, 0x8f41ff000) (noaccess)
Run: [1/24] parquet_read_io_compression [Device=0...
```

> @wence- I rolled back changes that merged in #15143 in case it makes it possible to merge this sooner. (Does the memcheck + dask-cudf bug only show up with...

Test fails are due to a combo of #15261 and #15265.

I suppose I should add some tests. I can write ones that check that the result is correct, but is there any way to check that the non-stable sort would...
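One way to approach the question above is to check the returned ordering itself and not only the sorted values. The following is a minimal sketch in plain Python, not cudf's API: `is_sorted_order` and `is_stable_order` are hypothetical helper names, and the "unstable" ordering is constructed by hand to stand in for what a non-stable sort might return. Note that a non-stable sort can still produce a stable ordering by chance, so a test like this can only catch instability, not prove that a non-stable algorithm would fail; inputs with many tied keys make a coincidental pass less likely.

```python
def is_sorted_order(keys, order):
    """Return True if `order` is a permutation of indices that sorts `keys`."""
    if sorted(order) != list(range(len(keys))):
        return False
    return all(keys[a] <= keys[b] for a, b in zip(order, order[1:]))


def is_stable_order(keys, order):
    """Return True if `order` sorts `keys` and keeps tied keys in input order."""
    if not is_sorted_order(keys, order):
        return False
    # Adjacent entries with equal keys must appear in increasing index order.
    return all(a < b for a, b in zip(order, order[1:]) if keys[a] == keys[b])


# Keys with many ties, so that stability is observable.
keys = [2, 1, 2, 1, 2, 1]

# Python's built-in sort is stable, so this is the reference ordering.
stable = sorted(range(len(keys)), key=keys.__getitem__)

# A hand-built ordering that sorts the keys correctly but reverses ties,
# standing in for one possible output of a non-stable sort.
unstable = sorted(range(len(keys)), key=lambda i: (keys[i], -i))

print(is_stable_order(keys, stable))    # True
print(is_sorted_order(keys, unstable))  # True: the values are still sorted
print(is_stable_order(keys, unstable))  # False: tie order was not preserved
```

A correctness-only test corresponds to `is_sorted_order`; the stability check is the extra condition on tied keys.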