
bug: OpenLLM query in WSL failed with a timeout

vvvlll93 opened this issue 2 years ago • 6 comments

Describe the bug

Hi,

When trying to query meta-llama/Llama-2-7b-chat-hf with a simple prompt (e.g. 'Hello'), the query fails with a timeout.

Just to be clear, I launched the server in one WSL session using the openllm start ... command, and in another WSL session I tried to query it using the openllm query ... command.

I tried adding the --timeout 300 flag to both the start command (i.e. openllm start llama --model-id meta... --timeout 300) and the query command (i.e. openllm query --timeout 300), but it doesn't seem to be taken into account: the query still fails after 30 seconds (which I believe is the default setting).

I don't really get why this fails, because the logs are not helping (even with the --debug flag). Maybe the model is too big or something.

Do you have any idea why this happens? Maybe I'm doing something wrong with my WSL session.
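
For reference, the two sessions look like this (the prompt is the 'Hello' example from above):

```shell
# WSL session 1: start the server with an extended timeout
openllm start llama --model-id meta-llama/Llama-2-7b-chat-hf --timeout 300

# WSL session 2: query it (still fails with TimeoutError after ~30 s)
openllm query --timeout 300 'Hello'
```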

To reproduce

No response

Logs

No response

Environment

accelerate 0.22.0 aiohttp 3.8.5 aiosignal 1.3.1 anyio 4.0.0 appdirs 1.4.4 asgiref 3.7.2 async-timeout 4.0.3 attrs 23.1.0 beautifulsoup4 4.12.2 bentoml 1.1.6 bitsandbytes 0.41.1 boto3 1.28.43 botocore 1.31.43 bpemb 0.3.4 build 1.0.3 cattrs 23.1.2 certifi 2023.7.22 charset-normalizer 3.2.0 circus 0.18.0 click 8.1.7 click-option-group 0.5.6 cloudpickle 2.2.1 colorama 0.4.6 coloredlogs 15.0.1 conllu 4.5.3 contextlib2 21.6.0 contourpy 1.1.0 cuda-python 12.2.0 cycler 0.11.0 Cython 3.0.2 datasets 2.14.5 deepmerge 1.1.0 Deprecated 1.2.14 dill 0.3.7 diskcache 5.6.3 emoji 2.8.0 exceptiongroup 1.1.3 fairscale 0.4.13 fastcore 1.5.29 filelock 3.12.3 filetype 1.2.0 flair 0.12.2 fonttools 4.42.1 frozenlist 1.4.0 fs 2.4.16 fsspec 2023.6.0 ftfy 6.1.1 future 0.18.3 gdown 4.4.0 gensim 4.3.2 ghapi 1.0.4 h11 0.14.0 httpcore 0.17.3 httpx 0.24.1 huggingface-hub 0.16.4 humanfriendly 10.0 hyperopt 0.2.7 idna 3.4 importlib-metadata 6.0.1 importlib-resources 6.0.1 inflection 0.5.1 Janome 0.5.0 Jinja2 3.1.2 jmespath 1.0.1 joblib 1.3.2 kiwisolver 1.4.5 langdetect 1.0.9 llama-cpp-python 0.1.83 lxml 4.9.3 markdown-it-py 3.0.0 MarkupSafe 2.1.3 matplotlib 3.7.2 mdurl 0.1.2 more-itertools 10.1.0 mpld3 0.3 mpmath 1.3.0 multidict 6.0.4 multiprocess 0.70.15 mypy-extensions 1.0.0 nereval 0.2.5 nervaluate 0.1.8 networkx 3.1 numpy 1.24.4 openllm 0.3.3 openllm-client 0.3.3 openllm-core 0.3.3 opentelemetry-api 1.18.0 opentelemetry-instrumentation 0.39b0 opentelemetry-instrumentation-aiohttp-client 0.39b0 opentelemetry-instrumentation-asgi 0.39b0 opentelemetry-sdk 1.18.0 opentelemetry-semantic-conventions 0.39b0 opentelemetry-util-http 0.39b0 optimum 1.13.0 orjson 3.9.6 packaging 23.1 pandas 2.0.3 pathspec 0.11.2 Pillow 10.0.0 pip 22.3.1 pip-requirements-parser 32.0.1 pip-tools 7.3.0 pptree 3.1 prometheus-client 0.17.1 protobuf 4.24.2 psutil 5.9.5 py4j 0.10.9.7 pyarrow 13.0.0 pydantic 1.10.12 Pygments 2.16.1 pynvml 11.5.0 pyparsing 3.0.9 pyproject_hooks 1.0.0 pyreadline3 3.4.1 PySocks 1.7.1 python-dateutil 2.8.2 python-json-logger 2.0.7 python-multipart 0.0.6 pytorch_revgrad 0.2.0 pytz 2023.3.post1 PyYAML 6.0.1 pyzmq 25.1.1 regex 2023.8.8 requests 2.31.0 rich 13.5.2 s3transfer 0.6.2 safetensors 0.3.3 schema 0.7.5 scikit-learn 1.3.0 scipy 1.10.1 segtok 1.5.11 sentencepiece 0.1.99 setuptools 65.5.1 simphile 1.0.2 simple-di 0.1.5 six 1.16.0 smart-open 6.4.0 sniffio 1.3.0 soupsieve 2.5 sqlitedict 2.1.0 starlette 0.31.1 sympy 1.12 tabulate 0.9.0 threadpoolctl 3.2.0 tokenizers 0.13.3 tomli 2.0.1 torch 2.0.1 Unidecode 1.3.6 urllib3 1.26.16 uvicorn 0.23.2 watchfiles 0.20.0 wcwidth 0.2.6 wheel 0.38.4 Wikipedia-API 0.6.0 wrapt 1.15.0 xxhash 3.3.0 yarl 1.9.2 zipp 3.16.2

System information (Optional)

No response

vvvlll93 avatar Sep 09 '23 09:09 vvvlll93

I second this; I'm running on WSL2 Ubuntu as well, under close enough conditions that I believe this to be the same problem. It only occurs for me when running heavier models: models such as OPT don't give me this error, but models such as Falcon (and Llama) do. (edit) Even passing the timeout through the Python API results in the same error.
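
For reference, "passing the timeout through the Python API" looks roughly like this (a minimal sketch; the HTTPClient timeout keyword and the default server address are assumptions based on openllm-client 0.3):

```python
import openllm

# Sketch only: assumes openllm-client 0.3 exposes HTTPClient(address, timeout=...)
client = openllm.client.HTTPClient("http://localhost:3000", timeout=300)
print(client.query("Hello"))  # still raises TimeoutError after ~30 s
```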

analog-wizard avatar Sep 12 '23 02:09 analog-wizard

I hit the same problem when running the vicuna-33b model, but no problem with vicuna-13b.

oppokui avatar Sep 14 '23 06:09 oppokui

I'm having the same problem when running flan-t5-xl, but not with flan-t5-large, so I think it has to do with the size of the model. My setup is similar: the model runs in WSL2 and the query comes from a Windows 10 Python script.

sepiatone avatar Sep 15 '23 14:09 sepiatone

Same for me: openllm query 'Can you name the 7 Harry Potter Books' fails with TimeoutError: timed out after 30 seconds.

Model: openllm start opt --model-id facebook/opt-30b --timeout 400 (the --timeout parameter does not do anything).

As a workaround, I changed the self._sock.settimeout(...) call to self._sock.settimeout(3000) in the file /home//.local/lib/python3.10/site-packages/httpcore/backends/sync.py, and then it worked.
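
A less invasive way to get the same effect, without editing files inside site-packages, is to raise the timeout on the HTTP client you query with. A minimal sketch, assuming the default server address and the /v1/generate JSON endpoint that OpenLLM ~0.3 exposes (check your server's /docs page for the exact path and payload):

```python
import httpx

# Sketch only: default OpenLLM server address and /v1/generate payload are assumptions.
client = httpx.Client(timeout=httpx.Timeout(300.0))  # 300 s instead of the ~30 s default
response = client.post(
    "http://localhost:3000/v1/generate",
    json={"prompt": "Can you name the 7 Harry Potter Books"},
)
print(response.json())
```

This leaves the installed packages untouched, so an openllm or httpcore upgrade won't silently revert the fix.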

Reimelt avatar Oct 09 '23 18:10 Reimelt

Ugh, can you try again with the latest change? It sounds like you need to update the socket_timeout var in your shell.
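
i.e., something like this before running the query (a sketch, taking the variable name verbatim from the line above; the exact spelling the client reads may differ):

```shell
# assumption: the client reads a shell variable literally named socket_timeout
export socket_timeout=300
openllm query 'Hello'
```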

aarnphm avatar Nov 09 '23 23:11 aarnphm

Running into the same issue, but in a Docker container in WSL2.

Opswatch avatar Jan 28 '24 03:01 Opswatch

Closing for openllm 0.6.

bojiang avatar Jul 12 '24 01:07 bojiang