Zhewen Li

Results: 2 issues by Zhewen Li

The recent change in PR #15734 adds a full_scales tensor to the call site in rocm_flash_attn.py. However, _attention.forward in attention/ops/triton_flash_attention.py still accepts only 12 positional arguments. This mismatch causes: ```...

Label: needs-rebase
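
The failure mode in the first issue is a plain positional-argument mismatch between a call site and the function it calls. The sketch below reduces that pattern to a few lines; the class name mirrors `_attention`, but the parameter list is a shortened, hypothetical stand-in, not the actual 12-argument signature in attention/ops/triton_flash_attention.py or the call in rocm_flash_attn.py.

```python
# Illustrative reduction of the mismatch described above. The real forward()
# in vllm/attention/ops/triton_flash_attention.py takes 12 positional
# arguments; this shorter, hypothetical signature stands in for it.
import torch


class _attention(torch.autograd.Function):
    @staticmethod
    def forward(ctx, q, k, v, o, metadata):
        # A real Triton kernel launch would happen here; the stub returns o.
        return o


q = k = v = o = torch.zeros(1)
metadata = object()
full_scales = (1.0,) * 5  # the extra argument newly passed at the call site

try:
    # One more positional argument than forward() declares:
    _attention.apply(q, k, v, o, metadata, full_scales)
except TypeError as err:
    # e.g. "forward() takes 6 positional arguments but 7 were given"
    print(err)
```

Either the forward signature has to grow a matching parameter (with a default so older call sites keep working) or the call site has to drop the extra argument; passing it positionally as-is fails before the kernel is ever launched.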

## Purpose
AMD CI is using mi325, but the MoE config is not added:
```
WARNING [fused_moe.py:886] Using default MoE config. Performance might be sub-optimal! Config file not found at...
```

Labels: ready, llama
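
The warning in the second issue comes from a tuned-config lookup that falls back to generic defaults when no per-device file is found. The sketch below shows that fallback pattern under assumed names: the filename scheme, the `configs` directory, and the default block sizes are placeholders, not the exact values used by fused_moe.py.

```python
# Rough sketch of the config-lookup-with-fallback pattern behind the warning
# quoted above. File naming, paths, and default values here are illustrative,
# not the exact ones in vllm/model_executor/layers/fused_moe/fused_moe.py.
import json
import logging
import os

logger = logging.getLogger("fused_moe")


def get_moe_config(num_experts: int, shard_size: int, device_name: str) -> dict:
    # Tuned configs are stored as one JSON file per (experts, size, device) combo.
    filename = f"E={num_experts},N={shard_size},device_name={device_name}.json"
    path = os.path.join("configs", filename)
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    # No tuned config for this device (e.g. a new CI machine such as mi325):
    # fall back to a generic kernel config and warn, as in the log above.
    logger.warning(
        "Using default MoE config. Performance might be sub-optimal! "
        "Config file not found at %s", path)
    return {"BLOCK_SIZE_M": 64, "BLOCK_SIZE_N": 64, "BLOCK_SIZE_K": 32,
            "GROUP_SIZE_M": 8}


# Example lookup for a hypothetical MI325 runner: with no JSON file present,
# this logs the warning and returns the generic defaults.
print(get_moe_config(8, 14336, "AMD_Instinct_MI325X"))
```

Fixing the issue therefore means adding (or tuning and committing) a config file whose name matches the new device, so the lookup succeeds and the fallback path is never taken on that CI machine.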