awkward icon indicating copy to clipboard operation
awkward copied to clipboard

feat: combinations kernels

Open ManasviGoyal opened this issue 1 year ago • 2 comments

ManasviGoyal avatar Jun 12 '24 14:06 ManasviGoyal

@ManasviGoyal whenever you've got something to try out lemme know! This should unlock most of the analysis query examples.

lgray avatar Jun 27 '24 14:06 lgray

@ManasviGoyal whenever you've got something to try out lemme know! This should unlock most of the analysis query examples.

Sure! I am still trying a few approaches.

ManasviGoyal avatar Jun 27 '24 15:06 ManasviGoyal

@lgray I have added the kernels for ak.combinations. The current implementation only works for n = 2. Can you please test it on the example queries? Let me know if there are any issues. Thanks!

Also I noticed some issues with argmin and argmax so It would be good to test the relevant queries once this fix is added.

ManasviGoyal avatar Jul 01 '24 18:07 ManasviGoyal

Great, I'll give it a try soon!

lgray avatar Jul 01 '24 18:07 lgray

These implementations look good and they have tests. I've verified that all of the tests run correctly on my GPU. Do the tests include cases that cross CUDA blocks?

This is only for ak.combinations with n=2, right? Is there an error message if n != 2 and the backend is "cuda"?

If yes, then this PR is ready to merge. Thanks!

Yes, I have added tests for ak.combinations and ak.argcombinations for cross CUDA blocks. I have added an error message "not implemented for given n" for n != 2 inside the CUDA C++ code.

ManasviGoyal avatar Jul 03 '24 13:07 ManasviGoyal

Then it's ready to merge. Thanks!

jpivarski avatar Jul 03 '24 13:07 jpivarski

Then it's ready to merge. Thanks!

Once the macOS issue is solved I'll merge it. Thanks!

ManasviGoyal avatar Jul 03 '24 14:07 ManasviGoyal

Then it's ready to merge. Thanks!

Once the macOS issue is solved I'll merge it. Thanks!

I think, you can merge it now - MacOS is not relevant to the GPU kernels - it does not have CUDA. :-)

ianna avatar Jul 03 '24 15:07 ianna

@ianna For me it says that the required macOS tests must pass before I can merge. So I am unable to merge.

ManasviGoyal avatar Jul 03 '24 15:07 ManasviGoyal

The jobs that haven't succeeded haven't even started. I just re-triggered it, and they're still not starting.

They're asking for MacOS 11, and I think that was retired on June 30, so that's why they're not starting (although GitHub ought to be failing with an error message if they really don't support an OS requested by the workflow. Maybe we're still in a transition period).

This is updated to main, right? Neither #3164 nor #3167 have been merged yet, and I haven't updated the set of required tests to whichever of those is successful. That's a blocker for merging this (and any other) PR.

Although the changes here are not supposed to affect CPU implementations or MacOS, they can, in principle, so it would be proper to let these PRs wait for the MacOS fix before merging them. (This has some new kernel tests and changes some unit test infrastructure.) If necessary, we can manually test it on a Mac, maybe @ianna's (Intel) and mine (Apple Silicon), and then bypass the protection rules.

But the MacOS fix will be needed eventually. Is it looking promising right now, @ianna? Do we need more help, e.g. from @henryiii or others?

jpivarski avatar Jul 03 '24 17:07 jpivarski

The jobs that haven't succeeded haven't even started. I just re-triggered it, and they're still not starting.

They're asking for MacOS 11, and I think that was retired on June 30, so that's why they're not starting (although GitHub ought to be failing with an error message if they really don't support an OS requested by the workflow. Maybe we're still in a transition period).

This is updated to main, right? Neither #3164 nor #3167 have been merged yet, and I haven't updated the set of required tests to whichever of those is successful. That's a blocker for merging this (and any other) PR.

Although the changes here are not supposed to affect CPU implementations or MacOS, they can, in principle, so it would be proper to let these PRs wait for the MacOS fix before merging them. (This has some new kernel tests and changes some unit test infrastructure.) If necessary, we can manually test it on a Mac, maybe @ianna's (Intel) and mine (Apple Silicon), and then bypass the protection rules.

But the MacOS fix will be needed eventually. Is it looking promising right now, @ianna? Do we need more help, e.g. from @henryiii or others?

Yes, I have pinged @henryiii on the macos build PR.

ianna avatar Jul 03 '24 18:07 ianna

@jpivarski As discussed in the meeting, I have added the CuPy implementation of n = 3, 4, 5 cases (non lexicographic order, replacement=False) as a study. Thanks!

ManasviGoyal avatar Jul 09 '24 13:07 ManasviGoyal

I just checked all of the CUDA tests for this branch in its current state. All is good!

jpivarski avatar Jul 17 '24 17:07 jpivarski

I haven't been able to test this properly at analysis scale because of the numba issue we ran into. Without confirming the final plots it appears to be making combinations as expected.

lgray avatar Jul 17 '24 23:07 lgray

@ianna if the tests pass, this one can be merged. Thanks!

ManasviGoyal avatar Jul 19 '24 12:07 ManasviGoyal