
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

Results 276 composable_kernel issues

## Proposed changes Please describe the motivation behind the pull request, whether it enables a new feature or fixes a bug. If there are associated pull requests or issues, please...

## Proposed changes Summary: - Add Epilogue to support CShuffle + Reduction - Add device implementation of Gemm Reduce for WMMA - Add instances - Fix tests (error checking result...

## Proposed changes This is a first look at a potential way to unify the mfma and wmma backends in ck_tile. The PR is in draft, so there...

WIP

## Proposed changes Added merging of multiple forward convolution groups into a single GEMM batch. The majority of the required components were already available and the only major code changes...

## Proposed changes 1. Provide an `add_missing_copyrights.py` script to update files with missing copyright headers 2. Introduce missing copyright headers where needed ## Checklist Please put an `x` into the...

## Proposed changes Added an example of bf16*fp4 gemm, where fp4 and fp4_scale are stored in uint8 data format. In the pipeline, matrix B (fp4) will be dequantized to bf16 before performing...

This PR brings an implementation of HSTU attention to ck_tile. HSTU attention is very different from the `fmha` implemented in ck_tile; for details, please refer to the [HSTU paper](https://arxiv.org/html/2402.17152v2). The...

## Proposed changes Add TileSize kM0=64 to the fmha fwd kernel, for use with medium-sized xformer shapes ## Checklist Please put an `x` into the boxes that apply. You can also...

## Proposed changes Please describe the motivation behind the pull request, whether it enables a new feature or fixes a bug. If there are associated pull requests or issues, please...

## Proposed changes Add supported grouped and batched GEMM test cases ## Checklist - [x] I have added tests relevant to the introduced functionality, and the unit tests are passing...