Implement splitk support device_grouped_conv_fwd_multiple_abd_xdl_cshuffle_v3

Open MiloLurati opened this issue 4 months ago • 1 comments

Proposed changes

These changes implement SplitK support for device_grouped_conv_fwd_multiple_abd_xdl_cshuffle_v3.

The implementation supports both one-stage and two-stage execution based on data type. Three execution paths are available:

  • Two-stage with SplitK: When EDataType is FP16 or BF16, the partial results are accumulated in an FP32 workspace, and a final elementwise operation then casts the result to the output type.
  • One-stage with SplitK: When EDataType is neither FP16 nor BF16, the partial results are accumulated directly into the output.
  • One-stage without SplitK: Original behavior.
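
The path selection described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `Fp16Tag`, `Bf16Tag`, `ExecPath`, `select_path`, `split_k_dot`, and the `k_batch` parameter are hypothetical names, and treating `k_batch <= 1` as "no SplitK" is an assumption. The toy reduction only shows the idea of summing per-slice partial results in an FP32 accumulator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the library's 16-bit float types (not CK's real types).
struct Fp16Tag {};
struct Bf16Tag {};

enum class ExecPath { OneStageNoSplitK, OneStageSplitK, TwoStageSplitK };

// True for output types that, per the PR description, accumulate in an FP32 workspace.
template <typename EDataType>
constexpr bool needs_fp32_workspace =
    std::is_same_v<EDataType, Fp16Tag> || std::is_same_v<EDataType, Bf16Tag>;

// Assumption: k_batch <= 1 means SplitK is disabled (original behavior).
template <typename EDataType>
constexpr ExecPath select_path(int k_batch)
{
    if(k_batch <= 1)
        return ExecPath::OneStageNoSplitK;
    return needs_fp32_workspace<EDataType> ? ExecPath::TwoStageSplitK
                                           : ExecPath::OneStageSplitK;
}

// Toy split-K reduction: the K dimension is cut into k_batch slices, and each
// slice adds its partial dot product to a shared FP32 accumulator.
inline float split_k_dot(const std::vector<float>& a,
                         const std::vector<float>& b,
                         int k_batch)
{
    const std::size_t k     = a.size();
    const std::size_t parts = static_cast<std::size_t>(std::max(k_batch, 1));
    const std::size_t chunk = (k + parts - 1) / parts; // ceiling division
    float workspace = 0.0f;                            // FP32 accumulator
    for(std::size_t s = 0; s < parts; ++s)
    {
        const std::size_t begin = s * chunk;
        const std::size_t end   = std::min(k, begin + chunk);
        float partial = 0.0f;
        for(std::size_t i = begin; i < end; ++i)
            partial += a[i] * b[i];
        workspace += partial;
    }
    // In the two-stage path, a separate elementwise step would cast this to EDataType.
    return workspace;
}
```

In the two-stage case, the value returned by `split_k_dot` would correspond to the workspace contents before the final cast; in the one-stage SplitK case, the same accumulation would target the output buffer directly.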

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • [x] I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • [x] I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
  • [x] I have added inline documentation which helps the maintainers understand the motivation
  • [x] I have removed the stale documentation which is no longer relevant after this pull request
  • [ ] (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • [x] I have run clang-format on all changed files
  • [x] Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

MiloLurati, Aug 12 '25 15:08

@MiloLurati Please resolve the conflicts and sync to develop branch. Thanks!

cgmillette, Oct 30 '25 17:10

@MiloLurati marking this PR as stale as it hasn't been responded to in over 2 weeks. Further inactivity will cause this PR to be closed.

cgmillette, Nov 13 '25 18:11

Closed due to inactivity

cgmillette, Dec 04 '25 18:12