[Bug] Python server example runs but hangs on prefill function within api call

Open surya-ven opened this issue 1 year ago • 3 comments

🐛 Bug

I got the Python server (located under mlc-llm/python) to work by first building both TVM and the mlc-llm CLI from source, and then running the command: python -m mlc_chat.rest --artifact-path ../dist --model vicuna-v1-7b --quantization q3f16_0 --device-name metal. The server then starts successfully:

INFO:     Will watch for changes in these directories: ['<some path>/mlc-llm/python']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [26676] using StatReload
INFO:     Started server process [26688]
INFO:     Waiting for application startup.
[20:47:46] <some path>/tvm-unity-nightly/src/runtime/metal/metal_device_api.mm:165: Intializing Metal device 0, name=Apple M2 Pro
INFO:     Application startup complete.

When I call the endpoint "/chat/completions" with a correctly formatted POST request, the function hangs on the session["chat_mod"].prefill(input=request.prompt) call.

I tested the same code within the lifespan function (called on server startup) and everything works correctly there. I also checked that session["chat_mod"] was not empty within the API function for "/chat/completions".
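One way to see where a call like this is stuck is Python's standard faulthandler module, which can dump every thread's Python stack if a block of code runs too long. The sketch below is only a diagnostic aid, not a fix and not part of rest.py; the helper name dump_stacks_if_stuck and the stand-in function slow_call are hypothetical, and in the real handler the body of the with block would be the prefill call.

```python
import faulthandler
import sys
import time
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(seconds: float = 30.0):
    """Dump all threads' Python stacks to stderr if the body outlives `seconds`."""
    # sys.__stderr__ is the interpreter's original stderr, which has a real
    # file descriptor even when sys.stderr has been replaced by a logger.
    faulthandler.dump_traceback_later(seconds, exit=False, file=sys.__stderr__)
    try:
        yield
    finally:
        # Cancel the pending dump once the body returns or raises.
        faulthandler.cancel_dump_traceback_later()


def slow_call(delay: float) -> str:
    # Hypothetical stand-in for session["chat_mod"].prefill(input=request.prompt).
    time.sleep(delay)
    return "done"


if __name__ == "__main__":
    with dump_stacks_if_stuck(seconds=1.0):
        print(slow_call(0.1))
```

The dump shows Python frames only, so if the hang is inside native code it would point at the Python line that entered the native call, which is still enough to compare the startup path with the request path.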

To Reproduce

Steps to reproduce the behavior:

  1. Run the python server
  2. Call the endpoint "/chat/completions" with a correctly formatted POST request
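The steps above can be sketched as a small client that uses only the standard library and sets a timeout, so that a hanging server shows up as an explicit message instead of a client that blocks forever. The address comes from the Uvicorn log; the "prompt" field is an assumption based on the request.prompt access mentioned above, and the function names build_request and post_chat are hypothetical.

```python
import json
import socket
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address printed in the Uvicorn startup log


def build_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    # Assumption: the endpoint accepts a JSON body with a "prompt" field.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_chat(prompt: str, timeout: float = 60.0) -> str:
    """POST the prompt and return the response body, or a short failure description."""
    try:
        with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except socket.timeout:
        # The server accepted the connection but never answered in time.
        return f"no response within {timeout} s"
    except urllib.error.URLError as err:
        # Connection refused, HTTP error status, and similar failures.
        return f"request failed: {err.reason}"
    except OSError as err:
        # For example, the server closing the connection without a response.
        return f"connection error: {err!r}"
```

With the server running, print(post_chat("Hello")) would either print the response body or one of the three failure messages, which helps tell a hang (timeout) apart from a dropped connection.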

Expected behavior

Chat response returned from api call to the endpoint "/chat/completions".

Environment

  • Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): Metal
  • Operating system (e.g. Ubuntu/Windows/MacOS/...): MacOS
  • Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): M2 MacBook Pro
  • How you installed MLC-LLM (conda, source): source
  • How you installed TVM-Unity (pip, source): source
  • Python version (e.g. 3.10): 3.11.3

surya-ven avatar Jun 04 '23 10:06 surya-ven

CC: @sudeepag @Kathryn-cat

junrushao avatar Jun 06 '23 12:06 junrushao

Hi @surya-ven, a couple of questions:

  1. Does running python sample_client.py work for you? Could you paste the output here?
  2. Are you able to run the CLI successfully? ./build/mlc_chat_cli

sudeepag avatar Jun 09 '23 03:06 sudeepag

  1. It doesn't work; this is the output:
Supported models: {'data': [{'id': 'rwkv-raven-7b', 'object': 'model'}, {'id': 'RedPajama-INCITE-Chat-3B-v1', 'object': 'model'}, {'id': 'vicuna-v1-7b', 'object': 'model'}]}

Traceback (most recent call last):
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 1375, in getresponse
    response.begin()
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/adapters.py", line 487, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 1375, in getresponse
    response.begin()
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "<path>/mlc-llm/python/mlc_chat/sample_client.py", line 15, in <module>
    r = requests.post("http://127.0.0.1:8000/chat/completions", json=payload)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suryaven/miniconda3/envs/tvm-unity-build/lib/python3.11/site-packages/requests/adapters.py", line 502, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
  2. I'm able to run the CLI correctly. I've even managed to run the model from Python, without a server, using code adapted from rest.py. It's only when I run it as a server, as intended in rest.py, that I encounter problems.

surya-ven avatar Jun 10 '23 02:06 surya-ven

@surya-ven I'm not quite sure why this is happening. Could you try pulling the latest changes and maybe following the steps here? Does the server output any logs when you submit the chat completion POST request?

sudeepag avatar Jun 13 '23 00:06 sudeepag

Closing due to the fix in #469. Please open a new issue if you are still running into this.

sudeepag avatar Jun 28 '23 17:06 sudeepag

Thanks for looking into this. Sorry, I haven't had time to check yet; I'll try soon and will open a new issue or try to find a fix if the problem persists.

surya-ven avatar Jun 29 '23 03:06 surya-ven