buttercup
Allow configuration for local inferencing.
What would it take to configure Buttercup to make LLM calls against a local inference engine (e.g., Ollama, Llama.cpp, vLLM)? For folks who want to experiment with this project locally without using a cloud API provider, this would be a huge boon.
It should be a relatively low lift, considering the project is already using LiteLLM as a proxy for LLM requests.
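As a very rough sketch of what the client side might look like, here is how a LiteLLM completion call can be pointed at a local Ollama server instead of a cloud provider. The model name and endpoint below are placeholders, not anything Buttercup currently uses:

```python
# Rough sketch only: routing a LiteLLM completion to a local Ollama server.
# Model name and api_base are illustrative placeholders.
from litellm import completion

response = completion(
    model="ollama/llama3",              # "ollama/" prefix selects LiteLLM's Ollama backend
    api_base="http://localhost:11434",  # default Ollama endpoint; adjust for your local server
    messages=[{"role": "user", "content": "Summarize this crash report."}],
)
print(response.choices[0].message.content)
```

OpenAI-compatible servers (e.g., vLLM or a llama.cpp server) could presumably be handled the same way by pointing `api_base` at the local server's `/v1` endpoint, so the main work would likely be exposing these settings in Buttercup's configuration rather than changing how requests are made.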
We'll be working on this soon