OpenLLM
bug: openllm start opt or openllm start dolly-v2 failed
Describe the bug
openllm start opt and openllm start dolly-v2 both appear to start up OK, but when I make a query the requests fail with the responses shown in the Logs section below.
To reproduce
No response
Logs
2023-06-21T16:45:40+0800 [INFO] [runner:llm-dolly-v2-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.574ms (trace=100fa96d33433a772259d444a0006ca9,span=4caf6df55eb0c67e,sampled=1,service.name=llm-dolly-v2-runner)
2023-06-21T16:45:40+0800 [INFO] [api_server:llm-dolly-v2-service:9] 127.0.0.1:63613 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 5.220ms (trace=100fa96d33433a772259d444a0006ca9,span=7ec016176efc036d,sampled=1,service.name=llm-dolly-v2-service)
Environment
Environment variable
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
System information
bentoml: 1.0.22
python: 3.8.16
platform: macOS-13.4-arm64-arm-64bit
uid_gid: 501:20
conda: 23.3.1
in_conda_env: True
conda_packages
name: openllm
channels:
- defaults
dependencies:
- ca-certificates=2023.05.30=hca03da5_0
- libcxx=14.0.6=h848a8c0_0
- libffi=3.4.4=hca03da5_0
- ncurses=6.4=h313beb8_0
- openssl=3.0.8=h1a28f6b_0
- pip=23.1.2=py38hca03da5_0
- python=3.8.16=hb885b13_4
- readline=8.2=h1a28f6b_0
- setuptools=67.8.0=py38hca03da5_0
- sqlite=3.41.2=h80987f9_0
- tk=8.6.12=hb8d0fd4_0
- wheel=0.38.4=py38hca03da5_0
- xz=5.4.2=h80987f9_0
- zlib=1.2.13=h5a0b063_0
- pip:
- accelerate==0.20.3
- aiohttp==3.8.4
- aiosignal==1.3.1
- anyio==3.7.0
- appdirs==1.4.4
- asgiref==3.7.2
- async-timeout==4.0.2
- attrs==23.1.0
- bentoml==1.0.22
- build==0.10.0
- cattrs==23.1.2
- certifi==2023.5.7
- charset-normalizer==3.1.0
- circus==0.18.0
- click==8.1.3
- click-option-group==0.5.6
- cloudpickle==2.2.1
- coloredlogs==15.0.1
- contextlib2==21.6.0
- cpm-kernels==1.0.11
- datasets==2.13.0
- deepmerge==1.1.0
- deprecated==1.2.14
- dill==0.3.6
- exceptiongroup==1.1.1
- filelock==3.12.2
- filetype==1.2.0
- frozenlist==1.3.3
- fs==2.4.16
- fsspec==2023.6.0
- grpcio==1.54.2
- grpcio-health-checking==1.48.2
- h11==0.14.0
- httpcore==0.17.2
- httpx==0.24.1
- huggingface-hub==0.15.1
- humanfriendly==10.0
- idna==3.4
- importlib-metadata==6.0.1
- inflection==0.5.1
- jinja2==3.1.2
- markdown-it-py==3.0.0
- markupsafe==2.1.3
- mdurl==0.1.2
- mpmath==1.3.0
- multidict==6.0.4
- multiprocess==0.70.14
- networkx==3.1
- numpy==1.24.3
- openllm==0.1.8
- opentelemetry-api==1.17.0
- opentelemetry-instrumentation==0.38b0
- opentelemetry-instrumentation-aiohttp-client==0.38b0
- opentelemetry-instrumentation-asgi==0.38b0
- opentelemetry-instrumentation-grpc==0.38b0
- opentelemetry-sdk==1.17.0
- opentelemetry-semantic-conventions==0.38b0
- opentelemetry-util-http==0.38b0
- optimum==1.8.8
- orjson==3.9.1
- packaging==23.1
- pandas==2.0.2
- pathspec==0.11.1
- pillow==9.5.0
- pip-requirements-parser==32.0.1
- pip-tools==6.13.0
- prometheus-client==0.17.0
- protobuf==3.20.3
- psutil==5.9.5
- pyarrow==12.0.1
- pydantic==1.10.9
- pygments==2.15.1
- pynvml==11.5.0
- pyparsing==3.1.0
- pyproject-hooks==1.0.0
- python-dateutil==2.8.2
- python-json-logger==2.0.7
- python-multipart==0.0.6
- pytz==2023.3
- pyyaml==6.0
- pyzmq==25.1.0
- regex==2023.6.3
- requests==2.31.0
- rich==13.4.2
- safetensors==0.3.1
- schema==0.7.5
- sentencepiece==0.1.99
- simple-di==0.1.5
- six==1.16.0
- sniffio==1.3.0
- starlette==0.28.0
- sympy==1.12
- tabulate==0.9.0
- tokenizers==0.13.3
- tomli==2.0.1
- torch==2.0.1
- torchvision==0.15.2
- tornado==6.3.2
- tqdm==4.65.0
- transformers==4.30.2
- typing-extensions==4.6.3
- tzdata==2023.3
- urllib3==2.0.3
- uvicorn==0.22.0
- watchfiles==0.19.0
- wcwidth==0.2.6
- wrapt==1.15.0
- xxhash==3.2.0
- yarl==1.9.2
- zipp==3.15.0
prefix: /Users/tim/anaconda3/envs/openllm
pip_packages
accelerate==0.20.3
aiohttp==3.8.4
aiosignal==1.3.1
anyio==3.7.0
appdirs==1.4.4
asgiref==3.7.2
async-timeout==4.0.2
attrs==23.1.0
bentoml==1.0.22
build==0.10.0
cattrs==23.1.2
certifi==2023.5.7
charset-normalizer==3.1.0
circus==0.18.0
click==8.1.3
click-option-group==0.5.6
cloudpickle==2.2.1
coloredlogs==15.0.1
contextlib2==21.6.0
cpm-kernels==1.0.11
datasets==2.13.0
deepmerge==1.1.0
Deprecated==1.2.14
dill==0.3.6
exceptiongroup==1.1.1
filelock==3.12.2
filetype==1.2.0
frozenlist==1.3.3
fs==2.4.16
fsspec==2023.6.0
grpcio==1.54.2
grpcio-health-checking==1.48.2
h11==0.14.0
httpcore==0.17.2
httpx==0.24.1
huggingface-hub==0.15.1
humanfriendly==10.0
idna==3.4
importlib-metadata==6.0.1
inflection==0.5.1
Jinja2==3.1.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
networkx==3.1
numpy==1.24.3
openllm==0.1.8
opentelemetry-api==1.17.0
opentelemetry-instrumentation==0.38b0
opentelemetry-instrumentation-aiohttp-client==0.38b0
opentelemetry-instrumentation-asgi==0.38b0
opentelemetry-instrumentation-grpc==0.38b0
opentelemetry-sdk==1.17.0
opentelemetry-semantic-conventions==0.38b0
opentelemetry-util-http==0.38b0
optimum==1.8.8
orjson==3.9.1
packaging==23.1
pandas==2.0.2
pathspec==0.11.1
Pillow==9.5.0
pip-requirements-parser==32.0.1
pip-tools==6.13.0
prometheus-client==0.17.0
protobuf==3.20.3
psutil==5.9.5
pyarrow==12.0.1
pydantic==1.10.9
Pygments==2.15.1
pynvml==11.5.0
pyparsing==3.1.0
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3
PyYAML==6.0
pyzmq==25.1.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
safetensors==0.3.1
schema==0.7.5
sentencepiece==0.1.99
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.28.0
sympy==1.12
tabulate==0.9.0
tokenizers==0.13.3
tomli==2.0.1
torch==2.0.1
torchvision==0.15.2
tornado==6.3.2
tqdm==4.65.0
transformers==4.30.2
typing_extensions==4.6.3
tzdata==2023.3
urllib3==2.0.3
uvicorn==0.22.0
watchfiles==0.19.0
wcwidth==0.2.6
wrapt==1.15.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
transformers version: 4.30.2
- Platform: macOS-13.4-arm64-arm-64bit
- Python version: 3.8.16
- Huggingface_hub version: 0.15.1
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?:
System information (Optional)
Apple M1 Max
openllm start dolly-v2 starts up OK, as shown below:
2023-06-21T16:59:44+0800 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
2023-06-21T16:59:45+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-06-21T16:59:45+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-06-21T16:59:49+0800 [WARNING] [runner:llm-dolly-v2-runner:1] The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
However, http://localhost:3000/readyz shows: Runners are not ready.
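For reference, the readiness check can be reproduced with a plain HTTP probe (a minimal sketch; port 3000 is the default BentoServer port from the startup logs above):
# Hedged sketch: probe the BentoServer readiness endpoint.
# A ready server answers 200; here it answers 503 with "Runners are not ready".
curl -i http://localhost:3000/readyz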
Are you seeing this with dolly-v2, or with both?
I have tested OPT on my end, on both Linux and macOS, and it starts up just fine.
Hey, I have fixed this issue on main and will release a patch version soon.
OK, thanks.
Hey @aarnphm, I ran into the exact same behavior.
I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm, or for supporting it in a rootless container?
I also tried it on the host system (RHEL 8.4) outside of the container, in a venv (Python 3.9); the readyz endpoint likewise indicates Runners are not ready.
Start command:
openllm start opt --model-id facebook/opt-125m
Can you dump the whole stack trace into a new issue?
Hey, I have fixed this issue on main and will release a patch version soon.
Is there a commit or a branch where these changes can be seen? I did not find anything.
Containerizing a Bento with podman should already be supported.
See https://docs.bentoml.com/en/latest/guides/containerization.html#containerization-with-different-container-engines
bentoml containerize llm-bento --backend podman --opt ...
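For example, a rough end-to-end sketch (assumptions: llm-bento is a placeholder tag, and openllm build accepts the same --model-id flag as openllm start; use the actual tag that openllm build prints):
# Hedged sketch: build a Bento for OPT, containerize it with podman, then run it.
openllm build opt --model-id facebook/opt-125m
bentoml containerize llm-bento --backend podman
# Run the resulting image rootlessly; the image tag below is illustrative.
podman run --rm -p 3000:3000 llm-bento:latest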
Though there is an internal bug that I recently discovered with respect to running within the container. I will post updates about this soon.
Hey @aarnphm, I ran into the exact same behavior. I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm, or for supporting it in a rootless container?
I believe this is related to the container deployment. Can you create a new issue? Thanks.