
[ROCm] Add additional block quant GEMM tuning configs for AMD GPUs.

Open whchung opened this issue 9 months ago • 1 comments

Modifications

Add additional block quant GEMM tuning configs for AMD GPUs.

Checklist

whchung • Feb 16 '25 22:02

Hi @whchung, do we have a profiling comparison? I am really interested in how "BLOCK_SIZE_N" was chosen between 16 and 64.

Last year a paper studied this parameter choice in depth. It shows that the values are typically 16, 64, or 128, and that the choice is closely tied to memory transaction bandwidth.
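
To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. It is not a profiling result and is not taken from this PR. The helper `tile_stats`, the problem size `N = 4096`, and the 1-byte (FP8-style) element size are all assumptions chosen for illustration. It only counts how many tiles cover the N dimension and how many contiguous bytes one tile row spans for each candidate "BLOCK_SIZE_N".

```python
import math


def tile_stats(n, block_size_n, bytes_per_elem=1):
    """Return (tiles along N, contiguous bytes per tile row).

    Hypothetical helper for illustration only; bytes_per_elem=1
    assumes an 8-bit quantized element type.
    """
    tiles = math.ceil(n / block_size_n)       # tiles needed to cover N
    row_bytes = block_size_n * bytes_per_elem  # width of one tile row in bytes
    return tiles, row_bytes


N = 4096  # assumed problem size, not from the PR
for bn in (16, 64, 128):
    tiles, row_bytes = tile_stats(N, bn)
    print(f"BLOCK_SIZE_N={bn:>3}: {tiles:>3} tiles along N, "
          f"{row_bytes:>3} bytes per tile row")
```

A wider tile row means more contiguous bytes per load and fewer tiles to schedule, at the cost of more registers and shared memory per tile. Which side wins on a given GPU is exactly what a profiling comparison would show.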