composable_kernel

support both FP8 interpretations at the same time

Open jeffdaily opened this issue 9 months ago • 1 comments

Determine which FP8 interpretation to use, either at compile time or at run time.
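To illustrate what "interpretation" means here, the following is a minimal sketch and is not taken from this PR or from the library. It assumes the two interpretations are the OCP and FNUZ variants of the E4M3 format, and that gfx942 uses FNUZ while gfx1200 uses OCP. The names `Fp8Interpretation`, `decode_e4m3`, `interpretation_for_arch`, and the macro `EXAMPLE_FP8_DEFAULT_FNUZ` are hypothetical and exist only for this example.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical names throughout; this is not the composable_kernel API.
enum class Fp8Interpretation { OCP, FNUZ };

// Compile-time choice: a hypothetical build macro selects the default.
#ifdef EXAMPLE_FP8_DEFAULT_FNUZ
constexpr Fp8Interpretation kDefaultFp8 = Fp8Interpretation::FNUZ;
#else
constexpr Fp8Interpretation kDefaultFp8 = Fp8Interpretation::OCP;
#endif

// Run-time choice: pick the interpretation from the device architecture name.
// Assumption: gfx942 uses FNUZ; every other target here is treated as OCP.
inline Fp8Interpretation interpretation_for_arch(const std::string& arch) {
    return arch == "gfx942" ? Fp8Interpretation::FNUZ : Fp8Interpretation::OCP;
}

// Decode one E4M3 byte (1 sign, 4 exponent, 3 mantissa bits) to float.
// OCP:  exponent bias 7, NaN is S.1111.111, negative zero exists.
// FNUZ: exponent bias 8, NaN is 0x80 only, no negative zero.
inline float decode_e4m3(std::uint8_t bits,
                         Fp8Interpretation interp = kDefaultFp8) {
    const bool fnuz = (interp == Fp8Interpretation::FNUZ);
    const int sign = bits >> 7;
    const int exp = (bits >> 3) & 0xF;
    const int mant = bits & 0x7;
    const int bias = fnuz ? 8 : 7;

    if (fnuz && bits == 0x80) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    if (!fnuz && exp == 0xF && mant == 0x7) {
        return std::numeric_limits<float>::quiet_NaN();
    }

    // Subnormals (exp == 0) have no implicit leading one.
    const float magnitude =
        (exp == 0) ? std::ldexp(mant / 8.0f, 1 - bias)
                   : std::ldexp(1.0f + mant / 8.0f, exp - bias);
    return sign ? -magnitude : magnitude;
}
```

Under these assumptions the same byte decodes differently: `decode_e4m3(0x38, Fp8Interpretation::OCP)` gives 1.0, while `decode_e4m3(0x38, Fp8Interpretation::FNUZ)` gives 0.5. This is why a single binary that targets both architectures has to know which interpretation applies to a given buffer.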

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • [ ] I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • [ ] I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
  • [ ] I have added inline documentation which enables the maintainers with understanding the motivation
  • [ ] I have removed the stale documentation which is no longer relevant after this pull request
  • [ ] (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • [ ] I have run clang-format on all changed files
  • [ ] Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

jeffdaily, Mar 04 '25 22:03

We had to implement a workaround in PyTorch to support compiling for gfx1200 and gfx942 in the same project. We release a single PyTorch binary that must support all ROCm gfx targets. See the link below for the PyTorch workaround.

https://github.com/pytorch/pytorch/pull/148496

jeffdaily, Mar 04 '25 22:03

Hello @jeffdaily @illsilin, this PR has been stale for several months. Please let me know whether to close it or to move forward with assigning a code review.

For your PR to be considered for code review, please fill out the appropriate descriptions, checkboxes and relevant discussions.

cgmillette, Oct 08 '25 23:10

This is no longer required. We support both FP8 types in the library when building for multiple architectures.

illsilin, Oct 09 '25 01:10