Yi Zhang
In the following example, I insert two keys and then do a search. I expect to get the right value, but it raises an error. ``` #include "art.h" #include "assert.h" #include "stdio.h" void...
## Motivation This PR adds support for the Qwen2-VL model, which is also supported by vllm [(here)](https://github.com/vllm-project/vllm/pull/7905) and lmdeploy [(here)](https://github.com/InternLM/lmdeploy/pull/2449) ## Modifications 1. Add a conversation template and a chat template for Qwen2-VL, which...
**What is your question?** Hi, I am trying to use `KernelTmaWarpSpecializedCooperativeFP8BlockScaledAccum` to implement DeepSeek-V3 block-wise FP8 as well as per-token-per-128-channel, but I find that it does not work. However, when I just...