vLLM support?

Open • lhl opened this issue 5 months ago • 2 comments

The docs mention that you used vLLM for inference, but it looks like Orion support hasn't been upstreamed yet: https://github.com/vllm-project/vllm/tree/main/vllm/model_executor/models

Can you share the model file, or do you have an ETA for upstreaming the code? Inference with HF transformers is slow enough to make Orion pretty much unusable, even for running evals.

lhl, Jan 23 '24 16:01