cudf icon indicating copy to clipboard operation
cudf copied to clipboard

Implement distributed sorted for ``cudf_polars``

Open seberg opened this issue 6 months ago • 4 comments

This implements distributed sorting for cudf-polars, it should work but is missing new sorting tests and I am also not quite sure that local tests exercised the rapidsmpf paths properly.

Right now, it is structured around introducing a ShuffleSorted IR but maybe it would be nicer to merge the steps further.

It does some smaller refactors to the shuffling to re-use some code there (but not actually move too much sorting related logic into the file).

Main missing things:

  • [ ] The biggest thing missing for sure is new tests which may flush out a bug or two.
  • As a follow up maybe: I use concat in both paths (rapidsmpf and not) for merging the exchanged chunks. If the sort isn't stable, then this should use plc...merge, this will require changing rapidsmpf, a bit more.
  • I have not modified zlice handling (top/bottom limits). For small values there is probably little point but for large result slices this needs to be threaded in.

seberg avatar May 21 '25 10:05 seberg

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar May 21 '25 10:05 copy-pr-bot[bot]

Moved to 25.08

vyasr avatar May 21 '25 17:05 vyasr

@seberg @rjzamora will this resolve https://github.com/rapidsai/cudf/issues/18527?

vyasr avatar May 22 '25 22:05 vyasr

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar May 30 '25 17:05 copy-pr-bot[bot]

@seberg - Just FYI: I addressed a few merge conflicts here to keep the work from getting stale.

Sorry for being slow to review this. FWIW, I think the PR is ready (or very close).

Do you think there is unnecessary risk in merging this into 25.08 if it passes through review quickly, or does early 24.10 sound better?

rjzamora avatar Jul 16 '25 16:07 rjzamora

Thanks for taking a look! I am pretty confident it not returning bad results and I don't think it's a giant user base yet either way. Suppose that is to say, I am fine either way, put it in now or be on the safe side and just wait for branching (and by the time it's shipped a few parts may be optimized a bit more then).

seberg avatar Jul 16 '25 17:07 seberg

/ok to test

Matt711 avatar Jul 17 '25 16:07 Matt711

/ok to test

@Matt711, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

copy-pr-bot[bot] avatar Jul 17 '25 16:07 copy-pr-bot[bot]

/ok to test 1e2b60083637e5f2bf028a3f089dc46d3a253eea

Matt711 avatar Jul 17 '25 16:07 Matt711

/ok to test c343adc

rjzamora avatar Jul 21 '25 13:07 rjzamora

/ok to test 06690966ee6690a16991824f31fe5ccde17d1941

vyasr avatar Jul 21 '25 16:07 vyasr

/ok to test f26910bade46d77df8ca7ae5511ce43090d7ac71

Matt711 avatar Aug 20 '25 19:08 Matt711

/ok to test 0fd6db3429fbb96e6a61a2c1a278c662da0002f3

Matt711 avatar Aug 21 '25 01:08 Matt711

/ok to test 165726b

rjzamora avatar Aug 21 '25 03:08 rjzamora

/ok to test 88ba934

rjzamora avatar Aug 21 '25 14:08 rjzamora

/ok to test e6c3ecdfcc61d1221509f502a5bd8eda985699e5

Matt711 avatar Aug 28 '25 13:08 Matt711

/ok to test 8a117c8

rjzamora avatar Aug 29 '25 16:08 rjzamora

/ok to test 3053c65

rjzamora avatar Aug 29 '25 19:08 rjzamora

/ok to test 5687414

rjzamora avatar Aug 29 '25 20:08 rjzamora

Small update: I used the latest version of this branch to run the PDS-H benchmark at SF-3000 on 8xH100 (5 iterations for each query). The run times were consistent with branch-25.10 (so no obvious problems).

rjzamora avatar Aug 29 '25 21:08 rjzamora

Is this PR waiting on a review at this point?

vyasr avatar Sep 04 '25 01:09 vyasr

Is this PR waiting on a review at this point?

It would be good to get another review from @Matt711 since he asked for changes.

rjzamora avatar Sep 04 '25 12:09 rjzamora

Is this PR waiting on a review at this point?

It would be good to get another review from @Matt711 since he asked for changes.

enqueued

Matt711 avatar Sep 04 '25 14:09 Matt711

Small update: I used the latest version of this branch to run the PDS-H benchmark at SF-3000 on 8xH100 (5 iterations for each query). The run times were consistent with branch-25.10 (so no obvious problems).

Do any of the pdsh queries currently (without this PR) fallback because Sort does not support multiple partitions. at SF3000?

Matt711 avatar Sep 04 '25 15:09 Matt711

/ok to test 23d5cda

rjzamora avatar Sep 08 '25 19:09 rjzamora

/ok to test 7e89c87

rjzamora avatar Sep 09 '25 13:09 rjzamora

Small update: I used the latest version of this branch to run the PDS-H benchmark at SF-3000 on 8xH100 (5 iterations for each query). The run times were consistent with branch-25.10 (so no obvious problems).

Do any of the pdsh queries currently (without this PR) fallback because Sort does not support multiple partitions. at SF3000?

@Matt711 - Sorry I missed this earlier. Query 13 does hit the fall-back logic at sf3k. I don't think any other queries will use the multi-partition sort. I can do one last test on an 8xH100 cluster before we merge this.

rjzamora avatar Sep 09 '25 18:09 rjzamora

Small update: I used the latest version of this branch to run the PDS-H benchmark at SF-3000 on 8xH100 (5 iterations for each query). The run times were consistent with branch-25.10 (so no obvious problems).

Do any of the pdsh queries currently (without this PR) fallback because Sort does not support multiple partitions. at SF3000?

@Matt711 - Sorry I missed this earlier. Query 13 does hit the fall-back logic at sf3k. I don't think any other queries will use the multi-partition sort. I can do one last test on an 8xH100 cluster before we merge this.

I did too 😅. Thanks!

Matt711 avatar Sep 09 '25 18:09 Matt711

/merge

rjzamora avatar Sep 09 '25 19:09 rjzamora

Thanks @seberg! (Sorry for taking so long to review)

rjzamora avatar Sep 09 '25 19:09 rjzamora