OpenLLM
bug: openllm start opt or openllm start dolly-v2 failed
Describe the bug
openllm start opt and openllm start dolly-v2 both appear to start up OK, but when I make a query the requests fail with the responses shown in the Logs section below.
To reproduce
No response
Logs
2023-06-21T16:45:40+0800 [INFO] [runner:llm-dolly-v2-runner:1] _ (scheme=http,method=GET,path=http://127.0.0.1:8000/readyz,type=,length=) (status=404,type=text/plain; charset=utf-8,length=9) 0.574ms (trace=100fa96d33433a772259d444a0006ca9,span=4caf6df55eb0c67e,sampled=1,service.name=llm-dolly-v2-runner)
2023-06-21T16:45:40+0800 [INFO] [api_server:llm-dolly-v2-service:9] 127.0.0.1:63613 (scheme=http,method=GET,path=/readyz,type=,length=) (status=503,type=text/plain; charset=utf-8,length=22) 5.220ms (trace=100fa96d33433a772259d444a0006ca9,span=7ec016176efc036d,sampled=1,service.name=llm-dolly-v2-service)
Environment
Environment variable
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
System information
bentoml: 1.0.22
python: 3.8.16
platform: macOS-13.4-arm64-arm-64bit
uid_gid: 501:20
conda: 23.3.1
in_conda_env: True
conda_packages
name: openllm
channels:
- defaults
dependencies:
- ca-certificates=2023.05.30=hca03da5_0
- libcxx=14.0.6=h848a8c0_0
- libffi=3.4.4=hca03da5_0
- ncurses=6.4=h313beb8_0
- openssl=3.0.8=h1a28f6b_0
- pip=23.1.2=py38hca03da5_0
- python=3.8.16=hb885b13_4
- readline=8.2=h1a28f6b_0
- setuptools=67.8.0=py38hca03da5_0
- sqlite=3.41.2=h80987f9_0
- tk=8.6.12=hb8d0fd4_0
- wheel=0.38.4=py38hca03da5_0
- xz=5.4.2=h80987f9_0
- zlib=1.2.13=h5a0b063_0
- pip:
- accelerate==0.20.3
- aiohttp==3.8.4
- aiosignal==1.3.1
- anyio==3.7.0
- appdirs==1.4.4
- asgiref==3.7.2
- async-timeout==4.0.2
- attrs==23.1.0
- bentoml==1.0.22
- build==0.10.0
- cattrs==23.1.2
- certifi==2023.5.7
- charset-normalizer==3.1.0
- circus==0.18.0
- click==8.1.3
- click-option-group==0.5.6
- cloudpickle==2.2.1
- coloredlogs==15.0.1
- contextlib2==21.6.0
- cpm-kernels==1.0.11
- datasets==2.13.0
- deepmerge==1.1.0
- deprecated==1.2.14
- dill==0.3.6
- exceptiongroup==1.1.1
- filelock==3.12.2
- filetype==1.2.0
- frozenlist==1.3.3
- fs==2.4.16
- fsspec==2023.6.0
- grpcio==1.54.2
- grpcio-health-checking==1.48.2
- h11==0.14.0
- httpcore==0.17.2
- httpx==0.24.1
- huggingface-hub==0.15.1
- humanfriendly==10.0
- idna==3.4
- importlib-metadata==6.0.1
- inflection==0.5.1
- jinja2==3.1.2
- markdown-it-py==3.0.0
- markupsafe==2.1.3
- mdurl==0.1.2
- mpmath==1.3.0
- multidict==6.0.4
- multiprocess==0.70.14
- networkx==3.1
- numpy==1.24.3
- openllm==0.1.8
- opentelemetry-api==1.17.0
- opentelemetry-instrumentation==0.38b0
- opentelemetry-instrumentation-aiohttp-client==0.38b0
- opentelemetry-instrumentation-asgi==0.38b0
- opentelemetry-instrumentation-grpc==0.38b0
- opentelemetry-sdk==1.17.0
- opentelemetry-semantic-conventions==0.38b0
- opentelemetry-util-http==0.38b0
- optimum==1.8.8
- orjson==3.9.1
- packaging==23.1
- pandas==2.0.2
- pathspec==0.11.1
- pillow==9.5.0
- pip-requirements-parser==32.0.1
- pip-tools==6.13.0
- prometheus-client==0.17.0
- protobuf==3.20.3
- psutil==5.9.5
- pyarrow==12.0.1
- pydantic==1.10.9
- pygments==2.15.1
- pynvml==11.5.0
- pyparsing==3.1.0
- pyproject-hooks==1.0.0
- python-dateutil==2.8.2
- python-json-logger==2.0.7
- python-multipart==0.0.6
- pytz==2023.3
- pyyaml==6.0
- pyzmq==25.1.0
- regex==2023.6.3
- requests==2.31.0
- rich==13.4.2
- safetensors==0.3.1
- schema==0.7.5
- sentencepiece==0.1.99
- simple-di==0.1.5
- six==1.16.0
- sniffio==1.3.0
- starlette==0.28.0
- sympy==1.12
- tabulate==0.9.0
- tokenizers==0.13.3
- tomli==2.0.1
- torch==2.0.1
- torchvision==0.15.2
- tornado==6.3.2
- tqdm==4.65.0
- transformers==4.30.2
- typing-extensions==4.6.3
- tzdata==2023.3
- urllib3==2.0.3
- uvicorn==0.22.0
- watchfiles==0.19.0
- wcwidth==0.2.6
- wrapt==1.15.0
- xxhash==3.2.0
- yarl==1.9.2
- zipp==3.15.0
prefix: /Users/tim/anaconda3/envs/openllm
pip_packages
accelerate==0.20.3
aiohttp==3.8.4
aiosignal==1.3.1
anyio==3.7.0
appdirs==1.4.4
asgiref==3.7.2
async-timeout==4.0.2
attrs==23.1.0
bentoml==1.0.22
build==0.10.0
cattrs==23.1.2
certifi==2023.5.7
charset-normalizer==3.1.0
circus==0.18.0
click==8.1.3
click-option-group==0.5.6
cloudpickle==2.2.1
coloredlogs==15.0.1
contextlib2==21.6.0
cpm-kernels==1.0.11
datasets==2.13.0
deepmerge==1.1.0
Deprecated==1.2.14
dill==0.3.6
exceptiongroup==1.1.1
filelock==3.12.2
filetype==1.2.0
frozenlist==1.3.3
fs==2.4.16
fsspec==2023.6.0
grpcio==1.54.2
grpcio-health-checking==1.48.2
h11==0.14.0
httpcore==0.17.2
httpx==0.24.1
huggingface-hub==0.15.1
humanfriendly==10.0
idna==3.4
importlib-metadata==6.0.1
inflection==0.5.1
Jinja2==3.1.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
networkx==3.1
numpy==1.24.3
openllm==0.1.8
opentelemetry-api==1.17.0
opentelemetry-instrumentation==0.38b0
opentelemetry-instrumentation-aiohttp-client==0.38b0
opentelemetry-instrumentation-asgi==0.38b0
opentelemetry-instrumentation-grpc==0.38b0
opentelemetry-sdk==1.17.0
opentelemetry-semantic-conventions==0.38b0
opentelemetry-util-http==0.38b0
optimum==1.8.8
orjson==3.9.1
packaging==23.1
pandas==2.0.2
pathspec==0.11.1
Pillow==9.5.0
pip-requirements-parser==32.0.1
pip-tools==6.13.0
prometheus-client==0.17.0
protobuf==3.20.3
psutil==5.9.5
pyarrow==12.0.1
pydantic==1.10.9
Pygments==2.15.1
pynvml==11.5.0
pyparsing==3.1.0
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3
PyYAML==6.0
pyzmq==25.1.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
safetensors==0.3.1
schema==0.7.5
sentencepiece==0.1.99
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.28.0
sympy==1.12
tabulate==0.9.0
tokenizers==0.13.3
tomli==2.0.1
torch==2.0.1
torchvision==0.15.2
tornado==6.3.2
tqdm==4.65.0
transformers==4.30.2
typing_extensions==4.6.3
tzdata==2023.3
urllib3==2.0.3
uvicorn==0.22.0
watchfiles==0.19.0
wcwidth==0.2.6
wrapt==1.15.0
xxhash==3.2.0
yarl==1.9.2
zipp==3.15.0
transformers version: 4.30.2
- Platform: macOS-13.4-arm64-arm-64bit
- Python version: 3.8.16
- Huggingface_hub version: 0.15.1
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?:
System information (Optional)
Apple M1 Max
openllm start dolly-v2 starts up OK, as shown below:
2023-06-21T16:59:44+0800 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
2023-06-21T16:59:45+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-06-21T16:59:45+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-06-21T16:59:49+0800 [WARNING] [runner:llm-dolly-v2-runner:1] The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
However, http://localhost:3000/readyz shows: Runners are not ready.
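For reference, the readiness check can be reproduced with a plain HTTP probe (a minimal sketch; port 3000 is the default BentoServer port from the startup logs above):
# Hedged sketch: probe the BentoServer readiness endpoint.
# A ready server answers 200; here it answers 503 with "Runners are not ready".
curl -i http://localhost:3000/readyz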
Are you seeing this with dolly-v2, or with both?
I have tested OPT on my end, on both Linux and macOS, and it starts up just fine.
Hey, I have fixed this issue on main and will release a patch version soon.
OK, thanks.
Hey @aarnphm, I ran into the exact same behavior.
I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm, or for supporting it in a rootless container?
I also tried it on the host system (RHEL 8.4) outside of the container, in a venv (Python 3.9); the readyz endpoint likewise indicates Runners are not ready.
Start command:
openllm start opt --model-id facebook/opt-125m
Can you dump the whole stack trace into a new issue?
Hey, I have fixed this issue on main and will release a patch version soon.
Is there a commit or a branch where these changes can be seen? I did not find anything.
Containerizing a Bento with podman should already be supported.
See https://docs.bentoml.com/en/latest/guides/containerization.html#containerization-with-different-container-engines
bentoml containerize llm-bento --backend podman --opt ...
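For example, a rough end-to-end sketch (assumptions: llm-bento is a placeholder tag, and openllm build accepts the same --model-id flag as openllm start; use the actual tag that openllm build prints):
# Hedged sketch: build a Bento for OPT, containerize it with podman, then run it.
openllm build opt --model-id facebook/opt-125m
bentoml containerize llm-bento --backend podman
# Run the resulting image rootlessly; the image tag below is illustrative.
podman run --rm -p 3000:3000 llm-bento:latest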
Though there is an internal bug that I recently discovered with respect to running within the container. I will post updates about this soon.
Hey @aarnphm, I ran into the exact same behavior. I tried to deploy openllm within a podman container, with registry.redhat.io/ubi8/python-39:latest as the base image. Are there plans for containerizing openllm, or for supporting it in a rootless container?
I believe this is related to the container deployment. Can you create a new issue? Thanks.