composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

Results 276 composable_kernel issues

The namespace doesn't make sense: A host class is derived from a device class.

code quality

## Proposed changes Add an fmha fwd splitkv receipt for the aiter C++ API. Fix other mha codegen receipt issues to reduce the number of instances. ## Checklist Please put an `x` into...

## Proposed changes Please describe the motivation behind the pull request, whether it enables a new feature or fixes a bug. If there are associated pull requests or issues, please...

## Proposed changes Fuse the activation with MoE FFN1. Supported activations: 0: gelu, 1: silu, 2: swiglu. ## Checklist Please put an `x` into the boxes that apply. You can also...
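
The integer codes above can be read as a small dispatch table. The following is a minimal host-side sketch of that mapping only, not the fused kernel from the PR. The names `Activation`, `silu`, `gelu`, and `apply_activation` are hypothetical, and the formulas are the commonly used definitions (exact erf-based GELU, SwiGLU as `silu(gate) * up`); the PR text does not confirm which variants it implements.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical enum mirroring the integer codes listed in the PR description.
enum class Activation : int { Gelu = 0, Silu = 1, Swiglu = 2 };

// SiLU: x * sigmoid(x), written as x / (1 + exp(-x)).
inline float silu(float x) { return x / (1.0f + std::exp(-x)); }

// GELU, exact erf-based form (an assumption; a tanh approximation is also common).
inline float gelu(float x) { return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f))); }

// `gate` and `up` stand for the two halves of the first FFN projection.
// `up` is only read for Swiglu.
inline float apply_activation(Activation act, float gate, float up = 1.0f)
{
    switch(act)
    {
    case Activation::Gelu: return gelu(gate);
    case Activation::Silu: return silu(gate);
    case Activation::Swiglu: return silu(gate) * up;
    }
    assert(false && "unknown activation code");
    return 0.0f;
}
```

A caller holding the raw integer would convert it first, for example `apply_activation(static_cast<Activation>(code), gate, up)`.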

## Proposed changes Introduces new MX GEMM pipeline for microscaling (MX) data types. At this time, MX FP8 has been verified. Support for more data types is coming soon. ##...

## Proposed changes Add paged FA for batch prefill. ## Checklist Please put an `x` into the boxes that apply. You can also fill these out after creating the PR. If...

## Proposed changes Add support for building the ck tile examples package. ## Checklist Please put an `x` into the boxes that apply. You can also fill these out after creating...

## Proposed changes Add 2:4 structured sparsity support for ck tile fp16 gemm ## Checklist Please put an `x` into the boxes that apply. You can also fill these out...
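
For readers unfamiliar with the term, a 2:4 structured-sparse layout allows at most two nonzero values in every group of four consecutive elements. The sketch below only illustrates that pattern on the host by keeping the two largest-magnitude values per group. The function name `prune_2_4` and the magnitude-based selection rule are assumptions for illustration; the PR text does not say how the sparse operand is produced or stored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: enforce a 2:4 pattern by zeroing the two
// smallest-magnitude values in every group of four consecutive elements.
inline std::vector<float> prune_2_4(std::vector<float> w)
{
    assert(w.size() % 4 == 0 && "length must be a multiple of 4");
    for(std::size_t g = 0; g + 4 <= w.size(); g += 4)
    {
        // Track the indices of the largest and second-largest magnitudes.
        std::size_t first = g, second = g + 1;
        if(std::fabs(w[second]) > std::fabs(w[first]))
            std::swap(first, second);
        for(std::size_t i = g + 2; i < g + 4; ++i)
        {
            if(std::fabs(w[i]) > std::fabs(w[first]))
            {
                second = first;
                first  = i;
            }
            else if(std::fabs(w[i]) > std::fabs(w[second]))
            {
                second = i;
            }
        }
        // Zero everything in the group except the two kept positions.
        for(std::size_t i = g; i < g + 4; ++i)
            if(i != first && i != second)
                w[i] = 0.0f;
    }
    return w;
}
```

Every group of four in the returned vector holds at most two nonzeros, which is the property a 2:4 sparse GEMM relies on.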

## Proposed changes Remove scratch usage from universal gemm by moving the kbatch-related `if` condition outside of the kernel and passing the memory operation enum as a template parameter. ## Checklist...
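
The general idea of hoisting a runtime condition out of a kernel and into a template parameter can be sketched on the host as follows. This is an illustration under stated assumptions, not the PR's code: `MemoryOp`, `store_partial`, and `launch` are hypothetical names, and a plain `+=` stands in for a device atomic add.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's memory operation enum.
enum class MemoryOp { Set, AtomicAdd };

// The store mode is a template parameter, so the condition below is a
// compile-time constant and the body carries no runtime branch on kbatch.
template <MemoryOp Op>
void store_partial(std::vector<float>& c, const std::vector<float>& partial)
{
    assert(c.size() == partial.size());
    for(std::size_t i = 0; i < c.size(); ++i)
    {
        if(Op == MemoryOp::Set)
            c[i] = partial[i];
        else
            c[i] += partial[i]; // stands in for an atomic add on the device
    }
}

// Host-side dispatch: the kbatch check happens once, outside the "kernel".
inline void launch(int kbatch, std::vector<float>& c, const std::vector<float>& partial)
{
    if(kbatch > 1)
    {
        // Each split along K accumulates its partial result into c.
        for(int k = 0; k < kbatch; ++k)
            store_partial<MemoryOp::AtomicAdd>(c, partial);
    }
    else
    {
        store_partial<MemoryOp::Set>(c, partial);
    }
}
```

With `kbatch == 1` the output is overwritten; with `kbatch > 1` the caller is assumed to have zero-initialized `c` before the partial results are accumulated.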