Hosang
Add additional custom paged attention kernels for AMD Navi 3x/4x GPU support, based on PR https://github.com/vllm-project/vllm/pull/12348. Due to architectural differences from the MI series, specific instructions and detailed logic have...
- Resolved a cache-miss issue during Triton flash attention calls by fixing `MAX_SEQLENS_Q/K` to `0`. `MAX_SEQLENS_Q/K` differs at each step, resulting in different key values and compilation for the...
## Essential Elements of an Effective PR Description Checklist
- [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [x] ...