ValueError: The dtype of q torch.bfloat16 does not match the q_data_type torch.float16 specified in plan function.

Open Godlovecui opened this issue 1 year ago • 1 comments

Hello, I installed flashinfer with AOT. Where can I change q_data_type to torch.bfloat16 in the plan function?

Thank you~

Godlovecui avatar Nov 25 '24 08:11 Godlovecui

I think vLLM currently uses the v0.1.5-style API, so you can specify q_data_type in the begin_forward function.
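
To illustrate why the error appears and what passing q_data_type changes, here is a minimal sketch of the plan-then-run contract. `ToyWrapper`, `run`, and `try_run` are hypothetical stand-ins written for this example, not flashinfer's actual classes or signatures, and dtypes are modeled as plain strings so the snippet runs without a GPU. Treating begin_forward as an alias of plan is also an assumption made for the sketch.

```python
class ToyWrapper:
    """Hypothetical stand-in for an attention wrapper; NOT flashinfer's code."""

    def __init__(self):
        self._cached_q_data_type = None

    def plan(self, q_data_type="float16"):
        # Remember which query dtype the later run step should expect.
        self._cached_q_data_type = q_data_type

    # Assumption for this sketch: the older name simply forwards to plan.
    begin_forward = plan

    def run(self, q_dtype):
        # Mirror the check behind the reported error: the dtype of q must
        # match the dtype given when planning.
        if q_dtype != self._cached_q_data_type:
            raise ValueError(
                f"The dtype of q {q_dtype} does not match the q_data_type "
                f"{self._cached_q_data_type} specified in plan function."
            )
        return "ok"


def try_run(planned_dtype, actual_dtype):
    """Return True if running with actual_dtype passes the dtype check."""
    wrapper = ToyWrapper()
    wrapper.begin_forward(q_data_type=planned_dtype)
    try:
        wrapper.run(actual_dtype)
        return True
    except ValueError:
        return False


# Planning with float16 and running with bfloat16 fails the check.
mismatch_ok = try_run("float16", "bfloat16")
# Planning with bfloat16 matches the bfloat16 query, so the check passes.
match_ok = try_run("bfloat16", "bfloat16")
```

In the real library the same idea would apply at the call site that invokes begin_forward (or plan): pass a q_data_type that matches the dtype of the query tensor, for example torch.bfloat16 for a bfloat16 model.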

yzh119 avatar Nov 25 '24 09:11 yzh119

Closing, since it looks like @yzh119 has answered this, and it's an old question. Please re-open if it's still an issue.

sricketts avatar Sep 30 '25 01:09 sricketts