
bug: issue getting started with OpenLLM

iamobservable opened this issue 6 months ago · 2 comments

Describe the bug

I found OpenLLM recently while researching abstractions for local model execution, and the OpenLLM repository stood out. I may have initially misunderstood the options for using the library. After reading the README, being able to run an open-source model via Docker was enticing. However, after reading the repository description, I am left wondering whether it is possible to use the library without some form of BentoML.

Questions:

1. Did I misunderstand the instructions, or the tool itself?
2. Does OpenLLM require some form of BentoML Cloud?

What I found when following the Getting Started section of the documentation is that the execution does not produce the output shown there. Details about the execution can be found in the "To reproduce" section of this issue.
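For context on question 2: the kind of usage I had in mind is reaching a locally served model entirely offline through an OpenAI-compatible endpoint, without any BentoML Cloud account. A minimal sketch, assuming `openllm serve <model>` exposes an OpenAI-compatible server on localhost:3000 (the port and model name below are assumptions on my part, not something I have verified):

from openai import OpenAI

# Assumption: an OpenLLM server is already running locally on port 3000
# and serving an OpenAI-compatible API. Model id is a placeholder.
client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

response = client.chat.completions.create(
    model="<model-id-served-by-openllm>",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)

This is only meant to illustrate the expected usage; it obviously requires a model to be served first, which is exactly the step I cannot get past below.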

To reproduce

Steps followed:

  1. Clean ~/.openllm configuration folder
  2. Create new folder for project
  3. Initialize folder with uv, pinned to Python 3.12.8
  4. Add openllm to project
  5. Execute openllm hello
  6. Output describes updating default and nightly, but does not seem to continue as expected

Steps


1. Clean ~/.openllm configuration folder

rm -rf ~/.openllm
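To be sure the old configuration is really gone before the next step, a quick check with plain pathlib (nothing OpenLLM-specific) can be run:

from pathlib import Path

# Sanity check: the config folder should no longer exist after the rm -rf above.
print((Path.home() / ".openllm").exists())  # expected: False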

2. Create new folder for project

# ~/local-gpt
mkdir openllm-api
cd openllm-api

3. Initialize folder with uv, pinned to Python 3.12.8

# ~/local-gpt/openllm-api
uv init -p 3.12

Initialized project openllm-api
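As an optional check (not part of the steps above), the interpreter version the uv-managed environment resolves to can be confirmed with a one-line script run through `uv run python`:

import sys

# Print the interpreter version used by the uv-managed virtual environment.
print(sys.version)  # expected to report 3.12.8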

4. Add openllm to project

# ~/local-gpt/openllm-api
uv add openllm

Using CPython 3.12.8
Creating virtual environment at: .venv
Resolved 98 packages in 1.36s
Prepared 94 packages in 16.38s
Installed 94 packages in 244ms

  • a2wsgi==1.10.8
  • aiohappyeyeballs==2.6.1
  • aiohttp==3.11.18
  • aiosignal==1.3.2
  • aiosqlite==0.21.0
  • annotated-types==0.7.0
  • anyio==4.9.0
  • appdirs==1.4.4
  • asgiref==3.8.1
  • attrs==25.3.0
  • bentoml==1.4.8
  • cattrs==23.1.2
  • certifi==2025.4.26
  • charset-normalizer==3.4.2
  • click==8.1.8
  • click-option-group==0.5.7
  • cloudpickle==3.1.1
  • deprecated==1.2.18
  • distro==1.9.0
  • dulwich==0.22.8
  • filelock==3.18.0
  • frozenlist==1.6.0
  • fs==2.4.16
  • fsspec==2025.3.2
  • h11==0.16.0
  • hf-xet==1.1.0
  • httpcore==1.0.9
  • httpx==0.28.1
  • httpx-ws==0.7.2
  • huggingface-hub==0.30.2
  • idna==3.10
  • importlib-metadata==8.6.1
  • jinja2==3.1.6
  • jiter==0.9.0
  • kantoku==0.18.3
  • markdown-it-py==3.0.0
  • markupsafe==3.0.2
  • mdurl==0.1.2
  • multidict==6.4.3
  • numpy==2.2.5
  • nvidia-ml-py==12.570.86
  • openai==1.73.0
  • openllm==0.6.30
  • opentelemetry-api==1.32.1
  • opentelemetry-instrumentation==0.53b1
  • opentelemetry-instrumentation-aiohttp-client==0.53b1
  • opentelemetry-instrumentation-asgi==0.53b1
  • opentelemetry-sdk==1.32.1
  • opentelemetry-semantic-conventions==0.53b1
  • opentelemetry-util-http==0.53b1
  • packaging==25.0
  • pathspec==0.12.1
  • pip-requirements-parser==32.0.1
  • prometheus-client==0.21.1
  • prompt-toolkit==3.0.51
  • propcache==0.3.1
  • psutil==7.0.0
  • pyaml==25.1.0
  • pydantic==2.11.4
  • pydantic-core==2.33.2
  • pygments==2.19.1
  • pyparsing==3.2.3
  • python-dateutil==2.9.0.post0
  • python-dotenv==1.1.0
  • python-json-logger==3.3.0
  • python-multipart==0.0.20
  • pyyaml==6.0.2
  • pyzmq==26.4.0
  • questionary==2.1.0
  • requests==2.32.3
  • rich==14.0.0
  • schema==0.7.7
  • setuptools==80.3.0
  • shellingham==1.5.4
  • simple-di==0.1.5
  • six==1.17.0
  • sniffio==1.3.1
  • starlette==0.46.2
  • tabulate==0.9.0
  • tomli-w==1.2.0
  • tornado==6.4.2
  • tqdm==4.67.1
  • typer==0.15.3
  • typing-extensions==4.13.2
  • typing-inspection==0.4.0
  • urllib3==2.4.0
  • uv==0.7.2
  • uvicorn==0.34.2
  • watchfiles==1.0.5
  • wcwidth==0.2.13
  • wrapt==1.17.2
  • wsproto==1.2.0
  • yarl==1.20.0
  • zipp==3.21.0
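As a quick sanity check after the install (again, not one of the original steps), the resolved versions can be confirmed from package metadata; the expected values are taken from the resolution output above:

from importlib.metadata import version

# Read the installed versions straight from package metadata (no import of openllm needed).
print(version("openllm"))  # expected: 0.6.30
print(version("bentoml"))  # expected: 1.4.8

Notably, bentoml is pulled in as a dependency of the openllm package itself, which is part of why I am asking question 2 above.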

5. Execute openllm hello

# ~/local-gpt/openllm-api
uv run openllm hello

updating repo default
updating repo nightly
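Since the command stops after the two "updating repo" lines, one way to see whether anything was actually fetched is to list what landed under ~/.openllm (the directory layout here is an assumption on my part and may differ between versions):

from pathlib import Path

# List whatever `openllm hello` left under the config folder, if anything.
cfg = Path.home() / ".openllm"
for path in sorted(cfg.rglob("*"))[:20]:
    print(path.relative_to(cfg))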

Logs

Output provided above in the "To reproduce" section.

Environment

Using a local environment, not cloud. All model execution is currently done through ollama, but I am interested in llama.cpp as well. Does OpenLLM require BentoML?

Platform: ollama

System information (Optional)

Environment


platform: Windows 11 / WSL2 / Docker Compose (all execution done via WSL2)
memory: 128GB RAM
gpu: RTX 3080
Shell: bash
Package Manager: uv
Python Version: CPython 3.12.8

iamobservable · May 03 '25 20:05