Support for fastchat-t5-3b-v1.0
It would be great if you could support fastchat-t5-3b-v1.0, which is a derivative of the Flan-T5-XL model: https://huggingface.co/lmsys/fastchat-t5-3b-v1.0
Hi @Matthieu-Tinycoaching, thanks for bringing this up! As mentioned in #187, T5 support is definitely on our roadmap. The current blocker is its encoder-decoder architecture, which vLLM's current implementation does not support. Since supporting it requires non-trivial modifications to our system, we are currently working out a good design for it in vLLM.
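For anyone curious why the encoder-decoder architecture is a blocker, here is a toy sketch (not vLLM code; all names and logic are illustrative stand-ins) contrasting the two generation flows. Decoder-only models maintain a single self-attention KV cache that grows token by token, which is what vLLM's paged cache is built around; T5-style models additionally run an encoder once and then need cross-attention to those fixed encoder states at every decoding step.

```python
# Toy sketch (NOT vLLM internals): decoder-only vs. encoder-decoder generation.
# The "model" here is a dummy max() so the control flow is the whole point.

def decoder_only_generate(prompt_tokens, steps):
    # Decoder-only (GPT-style): one self-attention KV cache over
    # prompt + generated tokens, growing each step.
    kv_cache = list(prompt_tokens)  # stands in for cached keys/values
    out = []
    for _ in range(steps):
        next_tok = max(kv_cache)    # dummy "forward pass"
        kv_cache.append(next_tok)   # cache grows every step
        out.append(next_tok)
    return out

def encoder_decoder_generate(input_tokens, steps):
    # Encoder-decoder (T5-style): the encoder runs once over the input...
    encoder_states = [t * 2 for t in input_tokens]  # dummy encoder
    # ...then each decoding step needs BOTH a growing self-attention
    # cache AND cross-attention over the fixed encoder states, a memory
    # layout the decoder-only design above never has to handle.
    self_kv = []
    out = []
    for _ in range(steps):
        next_tok = max(encoder_states + self_kv)  # dummy "forward pass"
        self_kv.append(next_tok)
        out.append(next_tok)
    return out
```

This is only meant to show the structural difference; the actual engineering work is adapting vLLM's paged KV-cache and scheduler to manage the two kinds of attention state.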
Closing as a duplicate of #187