Run small models with vLLM CPU mode for local development testing
🚀 Feature Description and Motivation
We already have a mocked app for most feature integration testing; however, it is still not convenient in some cases. We should check whether it's possible to use small models such as opt-125m with CPU-only vLLM for testing.
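For reference, a minimal sketch of what this could look like with vLLM's offline Python API, assuming a CPU-enabled vLLM build (e.g. installed from source with `VLLM_TARGET_DEVICE=cpu`); the prompt and sampling values are purely illustrative:

```python
from vllm import LLM, SamplingParams

# opt-125m is small enough to load and run in host RAM without a GPU.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["Hello, my name is"], params)
for out in outputs:
    print(out.outputs[0].text)
```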
Use Case
No response
Proposed Solution
No response
In the simulator (#430, #456), I used llama2-7b for CPU testing. Will this satisfy your requirement, or might we also want to support opt-125m?
We can close this issue: https://github.com/vllm-project/aibrix/tree/main/development/vllm