vllm
[Bug]: prefill/prefix FP8 triton kernel for opt-125m - an illegal memory access was encountered
Your current environment
vLLM as of the merge of https://github.com/vllm-project/vllm/pull/7208
🐛 Describe the bug
An illegal memory access occurs when running facebook/opt-125m.
Specifically, one of these errors is raised:
RuntimeError: Triton Error [CUDA]: an illegal memory access was encountered
RuntimeError: CUDA error: an illegal memory access was encountered
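
For triage, here is a minimal sketch of how the two messages above could be recognized in a test harness. It also sets `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so the failure is reported at the faulting call rather than at a later one. The helper names `enable_blocking_launches` and `is_illegal_memory_access` are hypothetical and are not part of vLLM or Triton. The sketch is not a reproduction of the bug.

```python
import os

# The two error signatures quoted in this report.
ILLEGAL_ACCESS_SIGNATURES = (
    "Triton Error [CUDA]: an illegal memory access was encountered",
    "CUDA error: an illegal memory access was encountered",
)


def enable_blocking_launches() -> None:
    """Hypothetical helper: request synchronous CUDA kernel launches.

    This must run before CUDA is initialized in the process, otherwise
    the setting has no effect.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def is_illegal_memory_access(exc: BaseException) -> bool:
    """Hypothetical helper: True if exc matches one of the reported errors."""
    if not isinstance(exc, RuntimeError):
        return False
    message = str(exc)
    return any(sig in message for sig in ILLEGAL_ACCESS_SIGNATURES)
```

Call `enable_blocking_launches()` at the very top of the failing script, then wrap the failing call in `try`/`except RuntimeError` and pass the exception to `is_illegal_memory_access` to tell this failure apart from unrelated runtime errors.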