Support for Falcon-7B / 40B models
It would be great if you could add support for Falcon models as well! Does vLLM support ONNX models today?
Currently, vLLM does not support ONNX models. Supporting Falcon is on our roadmap. Thanks for your suggestion.
@WoosukKwon When do you anticipate adding support for Falcon to vLLM?
@MotzWanted I'm working on it now. I think we can quickly (within a few days) add a less-optimized version of Falcon (with MQA replaced by MHA), and then develop kernels so the model actually uses MQA.
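For anyone curious what the interim fallback looks like: this is not vLLM's actual implementation, just a minimal PyTorch sketch of the idea @WoosukKwon describes. Multi-query attention (MQA) shares a single key/value head across all query heads; you can run it through a standard multi-head attention (MHA) kernel by broadcasting that shared KV head, at the cost of the extra memory traffic that dedicated MQA kernels avoid. The function name and shapes here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mqa_via_mha(query, key, value):
    """Emulate multi-query attention with a standard multi-head kernel.

    query:      [batch, num_heads, seq_len, head_dim]
    key/value:  [batch, 1, seq_len, head_dim] -- the single shared KV head
                used by MQA models such as Falcon.
    """
    num_heads = query.shape[1]
    # Broadcast the shared KV head across all query heads. This is the
    # "MQA replaced by MHA" fallback: correct output, but it reads the
    # same KV data once per query head instead of once in total.
    key = key.expand(-1, num_heads, -1, -1)
    value = value.expand(-1, num_heads, -1, -1)
    return F.scaled_dot_product_attention(query, key, value, is_causal=True)

# Toy shapes (Falcon-7B uses 71 query heads with head_dim 64):
q = torch.randn(1, 71, 16, 64)
kv = torch.randn(1, 1, 16, 64)
out = mqa_via_mha(q, kv, kv)   # -> [1, 71, 16, 64]
```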
I'm waiting for Falcon family support, too. Thanks a lot for your work.
great, thanks
thanks, looking forward to it