Run small models with vLLM CPU mode for local development testing
🚀 Feature Description and Motivation
We already have a mocked app for most feature integration testing; however, it is still not convenient in some cases. We should check whether it's possible to use small models such as opt-125m with CPU-only vLLM for testing.
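For reference, a minimal sketch of what this could look like with vLLM's offline Python API, assuming a CPU-enabled vLLM build (e.g. installed from source with `VLLM_TARGET_DEVICE=cpu`); the prompt and sampling values are purely illustrative:

```python
from vllm import LLM, SamplingParams

# opt-125m is small enough to load and run in host RAM without a GPU.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["Hello, my name is"], params)
for out in outputs:
    print(out.outputs[0].text)
```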
Use Case
No response
Proposed Solution
No response
In the simulator (#430, #456), I used llama2-7b for CPU testing. Will this satisfy your requirement, or might we also want to support opt-125m?
We can close this issue: https://github.com/vllm-project/aibrix/tree/main/development/vllm