vLLM support
Are there any updates on this one?
+1
+1
+1
+1
+1
So, could you provide advice on how I can make custom modifications to vLLM myself (for Llama 2 70B)?
FWIW, I know this is about vLLM, but you can run Medusa on TGI using `--speculate 3`.
Hello, how can I pass the Medusa model and base model args when I use Medusa on TGI?
Just pass the Medusa model repo (as you would with any other model) and then add `--speculate 2`.
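To make that concrete, here is a minimal launch sketch, assuming the original FasterDecoding Medusa release as the heads repo (any Medusa repo whose config names its base model should behave the same way; TGI resolves the base model from that config, so you don't pass it separately):

```bash
# Hedged sketch: serve a Medusa repo with TGI, speculating 3 tokens.
# FasterDecoding/medusa-vicuna-7b-v1.3 is the original Medusa heads
# release; its config points at the base model, which TGI loads itself.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id FasterDecoding/medusa-vicuna-7b-v1.3 \
  --speculate 3
```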
You can try this template: https://runpod.io/gsc?template=2xpg09eenv&ref=jmfkcdio
Thanks a lot!!!!
How can I use Medusa with vLLM or SGLang?
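For the vLLM side, a hedged sketch under some assumptions: recent vLLM releases include a Medusa speculative-decoding worker, but the flag names have moved between versions (older releases expose `--speculative-model`/`--num-speculative-tokens`, newer ones take a JSON `--speculative-config`), and the FasterDecoding heads repo below is assumed to be in a format vLLM's Medusa loader accepts:

```bash
# Hedged sketch: vLLM speculative decoding with Medusa heads.
# Flags match older vLLM releases; newer ones replaced them with
# --speculative-config '{"model": ..., "num_speculative_tokens": ...}'.
vllm serve lmsys/vicuna-7b-v1.3 \
  --speculative-model FasterDecoding/medusa-vicuna-7b-v1.3 \
  --num-speculative-tokens 3
```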