sglang
[ROCm] Add additional block quant GEMM tuning configs for AMD GPUs.
Modifications
Add additional block quant GEMM tuning configs for AMD GPUs.
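For readers unfamiliar with these files, here is a minimal sketch of how a tuning config of this kind might be shaped and consumed. It assumes a JSON map from batch size M to Triton launch parameters and a nearest-M lookup at runtime. The numeric values and the names `EXAMPLE_CONFIG_JSON`, `load_configs`, and `pick_config` are hypothetical and are not taken from the configs or code in this PR.

```python
import json

# Hypothetical tuning config: keys are batch sizes M (strings, as JSON requires),
# values are kernel launch parameters. All numbers are illustrative only.
EXAMPLE_CONFIG_JSON = """
{
  "1":   {"BLOCK_SIZE_M": 16, "BLOCK_SIZE_N": 16,  "BLOCK_SIZE_K": 128,
          "GROUP_SIZE_M": 1,  "num_warps": 4, "num_stages": 2},
  "64":  {"BLOCK_SIZE_M": 64, "BLOCK_SIZE_N": 64,  "BLOCK_SIZE_K": 128,
          "GROUP_SIZE_M": 1,  "num_warps": 4, "num_stages": 2},
  "256": {"BLOCK_SIZE_M": 64, "BLOCK_SIZE_N": 128, "BLOCK_SIZE_K": 128,
          "GROUP_SIZE_M": 8,  "num_warps": 8, "num_stages": 2}
}
"""


def load_configs(text):
    """Parse the JSON text and convert the string keys to integer batch sizes."""
    return {int(m): params for m, params in json.loads(text).items()}


def pick_config(configs, m):
    """Return the parameters tuned for the batch size closest to m."""
    nearest = min(configs, key=lambda tuned_m: abs(tuned_m - m))
    return configs[nearest]


configs = load_configs(EXAMPLE_CONFIG_JSON)
print(pick_config(configs, 50)["BLOCK_SIZE_N"])
```

With a lookup like this, adding entries for more values of M narrows the gap between the actual batch size and the nearest tuned one.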
Checklist
- [X] Format your code according to the Code Formatting with Pre-Commit.
Hi @whchung, do we have a profiling comparison? I am really interested in how "BLOCK_SIZE_N" was chosen between 16 and 64.
Last year we had a paper that studied this parameter selection in depth. The study shows that the chosen values are typically 16, 64, or 128, which is closely related to memory transaction bandwidth.