bug: issue getting started with OpenLLM
Describe the bug
I found OpenLLM recently while researching abstractions for local model execution, and the repo stood out. I may have initially misunderstood the options for using the library: after reading the README, being able to run an open-source model via Docker was enticing, but after reading the repository description I am left wondering whether the library can be used without some form of BentoML.
Questions:
1. Did I misunderstand the instructions, or do I have a misunderstanding of the tool?
2. Does OpenLLM require some form of BentoML Cloud?
When following the Getting Started section of the documentation, the execution does not produce the output shown there. Details are in the "To reproduce" section of this issue.
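For context, this is the flow I expected based on my reading of the README. It is a sketch, not output from my machine: the model name and the default port 3000 are my assumptions.

```bash
# Serve a model with an OpenAI-compatible API (model name is an example)
openllm serve llama3.2:1b

# In another shell, query the server (OpenLLM listens on port 3000 by default)
curl -s http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:1b", "messages": [{"role": "user", "content": "Hello"}]}'
```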
To reproduce
Steps followed:
- Clean ~/.openllm configuration folder
- Create new folder for project
- Initialize folder with uv with python 3.12.8
- Add openllm to project
- Execute openllm hello
- Output shows the default and nightly repos updating, but execution does not continue as expected
Detailed steps
1. Clean ~/.openllm configuration folder

```bash
rm -rf ~/.openllm
```
2. Create new folder for project

```bash
# ~/local-gpt
mkdir openllm-api
cd openllm-api
```
3. Initialize folder with uv with Python 3.12.8

```bash
# ~/local-gpt/openllm-api
uv init -p 3.12
```

Output:

```
Initialized project openllm-api
```
4. Add openllm to project

```bash
# ~/local-gpt/openllm-api
uv add openllm
```

Output:

```
Using CPython 3.12.8
Creating virtual environment at: .venv
Resolved 98 packages in 1.36s
Prepared 94 packages in 16.38s
Installed 94 packages in 244ms
```

Installed packages:
- a2wsgi==1.10.8
- aiohappyeyeballs==2.6.1
- aiohttp==3.11.18
- aiosignal==1.3.2
- aiosqlite==0.21.0
- annotated-types==0.7.0
- anyio==4.9.0
- appdirs==1.4.4
- asgiref==3.8.1
- attrs==25.3.0
- bentoml==1.4.8
- cattrs==23.1.2
- certifi==2025.4.26
- charset-normalizer==3.4.2
- click==8.1.8
- click-option-group==0.5.7
- cloudpickle==3.1.1
- deprecated==1.2.18
- distro==1.9.0
- dulwich==0.22.8
- filelock==3.18.0
- frozenlist==1.6.0
- fs==2.4.16
- fsspec==2025.3.2
- h11==0.16.0
- hf-xet==1.1.0
- httpcore==1.0.9
- httpx==0.28.1
- httpx-ws==0.7.2
- huggingface-hub==0.30.2
- idna==3.10
- importlib-metadata==8.6.1
- jinja2==3.1.6
- jiter==0.9.0
- kantoku==0.18.3
- markdown-it-py==3.0.0
- markupsafe==3.0.2
- mdurl==0.1.2
- multidict==6.4.3
- numpy==2.2.5
- nvidia-ml-py==12.570.86
- openai==1.73.0
- openllm==0.6.30
- opentelemetry-api==1.32.1
- opentelemetry-instrumentation==0.53b1
- opentelemetry-instrumentation-aiohttp-client==0.53b1
- opentelemetry-instrumentation-asgi==0.53b1
- opentelemetry-sdk==1.32.1
- opentelemetry-semantic-conventions==0.53b1
- opentelemetry-util-http==0.53b1
- packaging==25.0
- pathspec==0.12.1
- pip-requirements-parser==32.0.1
- prometheus-client==0.21.1
- prompt-toolkit==3.0.51
- propcache==0.3.1
- psutil==7.0.0
- pyaml==25.1.0
- pydantic==2.11.4
- pydantic-core==2.33.2
- pygments==2.19.1
- pyparsing==3.2.3
- python-dateutil==2.9.0.post0
- python-dotenv==1.1.0
- python-json-logger==3.3.0
- python-multipart==0.0.20
- pyyaml==6.0.2
- pyzmq==26.4.0
- questionary==2.1.0
- requests==2.32.3
- rich==14.0.0
- schema==0.7.7
- setuptools==80.3.0
- shellingham==1.5.4
- simple-di==0.1.5
- six==1.17.0
- sniffio==1.3.1
- starlette==0.46.2
- tabulate==0.9.0
- tomli-w==1.2.0
- tornado==6.4.2
- tqdm==4.67.1
- typer==0.15.3
- typing-extensions==4.13.2
- typing-inspection==0.4.0
- urllib3==2.4.0
- uv==0.7.2
- uvicorn==0.34.2
- watchfiles==1.0.5
- wcwidth==0.2.13
- wrapt==1.17.2
- wsproto==1.2.0
- yarl==1.20.0
- zipp==3.21.0
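Before moving on, a quick sanity check that the package resolved into the uv-managed environment. This is a hypothetical extra step, not part of the run above:

```bash
# Show the installed openllm version inside the project venv
uv pip show openllm
```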
5. Execute openllm hello

```bash
# ~/local-gpt/openllm-api
uv run openllm hello
```

Output:

```
updating repo default
updating repo nightly
```
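The command stops after the two repo-update lines. For anyone triaging, one way to see what was actually fetched is to inspect the config folder. This is a hypothetical diagnostic, and the repo subcommand is my assumption based on the "updating repo" messages above:

```bash
# Inspect what `openllm hello` fetched into the config folder
ls -R ~/.openllm

# Re-run the repo sync on its own (assuming a `repo update` subcommand
# exists, inferred from the "updating repo default/nightly" messages)
uv run openllm repo update
```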
Logs
Output provided above in the "To reproduce" section.
Environment
Using a local environment, not cloud. All model execution currently goes through Ollama, but I am interested in llama.cpp as well. Does OpenLLM require BentoML?
Platform: ollama
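For reference, this is the kind of OpenAI-compatible call my current setup makes against Ollama. A sketch assuming Ollama's default port 11434 and its OpenAI-compatible endpoint; the model name is an example:

```bash
# Query Ollama's OpenAI-compatible endpoint (default port 11434)
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```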
System information (Optional)
Environment

- Platform: Windows 11 / WSL2 / Docker Compose (all execution done via WSL2)
- Memory: 128 GB RAM
- GPU: RTX 3080
- Shell: bash
- Package manager: uv
- Python version: CPython 3.12.8