
Optimizing inference proxy for LLMs

21 optillm issues

Hello, I'm working on minion now (https://github.com/femto/minion), with a quick demo here: https://www.youtube.com/watch?v=-LW7TCMUfLs&t=33s. Basically, what I want is for it to handle arbitrary types of queries: math, coding, QA, long novel writing, ...

**Description:** Using the MOA approach in the Ollama API via an OpenAI-compatible endpoint results in a `list index out of range` error. The request fails to return a valid response....

documentation
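For context, optillm selects an approach by prefixing the model name (e.g. `moa-<model>`), so a request like the one in this report would look roughly as below. This is a minimal sketch, assuming an optillm proxy listening on `localhost:8000` and an Ollama model named `llama3` behind it; both are illustrative, not taken from the report.

```python
import json

# Hypothetical local proxy address; adjust to wherever optillm is running.
OPTILLM_URL = "http://localhost:8000/v1/chat/completions"

def build_moa_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat payload.

    The "moa-" prefix tells optillm to apply the mixture-of-agents approach;
    the remainder of the model name is passed through to the backing endpoint.
    """
    return {
        "model": f"moa-{model}",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_moa_request("llama3", "What is 2 + 2?")
print(json.dumps(payload))
```

The `list index out of range` error in the report would surface when the proxy processes a payload like this one, so the payload shape is the first thing to check when reproducing it.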

### Symptoms

I used a llama-server with `OPENAI_API_KEY='no_key'`, but it doesn't work: optillm.py was accessing the OpenAI server, not the llama-server.

```
2024-10-13 21:03:14,128 - INFO - HTTP Request: POST...
```

bug

https://lightning.ai/studios

enhancement
good first issue

Initially brought up in #8: having a GUI would make it easier to visualize and compare different approaches.

enhancement
help wanted

Hi there, This pull request shares a security update on optillm. We also have an entry for optillm in our directory, MseeP.ai, where we provide regular security and trust updates...

Add documentation showing how to use optillm with a local inference server to get logits. This is a commonly requested feature in Ollama (https://github.com/ollama/ollama/issues/2415) that is already supported in optillm...

documentation
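The standard way to request per-token log probabilities through an OpenAI-compatible endpoint is the `logprobs` / `top_logprobs` fields of the chat completions API. A minimal sketch of such a payload follows; the model name is illustrative, and whether a given local backend honours these fields depends on the server, not on optillm.

```python
import json

def build_logprobs_request(model: str, prompt: str, top: int = 5) -> dict:
    """OpenAI-compatible chat payload asking for per-token log probabilities."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "logprobs": True,
        "top_logprobs": top,  # the OpenAI API spec allows 0-20 here
    }

payload = build_logprobs_request("llama3", "Hello")
print(json.dumps(payload, indent=2))
```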

Support the /completions endpoint for inbuilt inference server. _Originally posted by @SeriousJ55 in https://github.com/codelion/optillm/discussions/168#discussioncomment-12403092_
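For reference, the legacy `/v1/completions` endpoint differs from chat completions in taking a `prompt` string rather than a `messages` list, which is the shape the inbuilt server would need to accept. A sketch, with an assumed local address and model name:

```python
import json

# Hypothetical address of the inbuilt inference server's legacy endpoint.
COMPLETIONS_URL = "http://localhost:8000/v1/completions"

def build_completions_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build a legacy /v1/completions payload (prompt string, no chat messages)."""
    return {
        "model": model,
        "prompt": prompt,          # single string; chat completions use "messages"
        "max_tokens": max_tokens,
    }

payload = build_completions_request("llama3", "Once upon a time")
print(json.dumps(payload))
```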

Hi 👋🏻 Thanks for your work on OptiLLM! I've worked on integrating it into [Harbor](https://github.com/av/harbor) and come across a couple of nice-to-haves that might make the project friendlier under specific conditions....

enhancement