
Runtime error with the latest version of `uvloop`

Open Kimdongui opened this issue 2 months ago • 4 comments

Problem

  • After uvloop was updated on 2025-10-17 (version 0.21.0 -> 0.22.0), the `mlserver start` command fails with the error below:
2025-10-28 01:50:05,945 [mlserver.parallel] DEBUG - Starting response processing loop...
2025-10-28 01:50:05,947 [mlserver.rest] INFO - HTTP server running on http://0.0.0.0:8080
INFO:     Started server process [981]
INFO:     Waiting for application startup.
2025-10-28 01:50:05,963 [mlserver.metrics] INFO - Metrics server running on http://0.0.0.0:8082
2025-10-28 01:50:05,963 [mlserver.metrics] INFO - Prometheus scraping endpoint can be accessed on http://0.0.0.0:8082/metrics
INFO:     Started server process [981]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
2025-10-28 01:50:05,966 [mlserver.grpc] INFO - gRPC server running on http://0.0.0.0:8081
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO:     Uvicorn running on http://0.0.0.0:8082 (Press CTRL+C to quit)
Process Worker-1:
Traceback (most recent call last):
  File "/opt/conda/envs/mlflow-env/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 70, in run
    self._ignore_signals()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 81, in _ignore_signals
    loop = asyncio.get_event_loop()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/uvloop/__init__.py", line 206, in get_event_loop
    raise RuntimeError(
RuntimeError: There is no current event loop in thread 'MainThread'.
2025-10-28 01:50:07,488 [mlserver.parallel] WARNING - Worker with PID 1032 on default inference pool stopped unexpectedly with exit code 256. Triggering worker restart...
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1761616207.491012     981 fork_posix.cc:71] Other threads are currently calling into gRPC, skipping fork() handlers
2025-10-28 01:50:07,497 [mlserver.parallel] INFO - Starting new worker with PID 1105 on default inference pool...
2025-10-28 01:50:07,498 [mlserver.parallel] INFO - New worker with PID 1105 on default inference pool is now ready.
Process Worker-2:
Traceback (most recent call last):
  File "/opt/conda/envs/mlflow-env/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 70, in run
    self._ignore_signals()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 81, in _ignore_signals
    loop = asyncio.get_event_loop()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/uvloop/__init__.py", line 206, in get_event_loop
    raise RuntimeError(
RuntimeError: There is no current event loop in thread 'MainThread'.
2025-10-28 01:50:08,977 [mlserver.parallel] WARNING - Worker with PID 1105 on default inference pool stopped unexpectedly with exit code 256. Triggering worker restart...
I0000 00:00:1761616208.980069     981 fork_posix.cc:71] Other threads are currently calling into gRPC, skipping fork() handlers
2025-10-28 01:50:08,985 [mlserver.parallel] INFO - Starting new worker with PID 1188 on default inference pool...
2025-10-28 01:50:08,986 [mlserver.parallel] INFO - New worker with PID 1188 on default inference pool is now ready.
Process Worker-3:
Traceback (most recent call last):
  File "/opt/conda/envs/mlflow-env/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 70, in run
    self._ignore_signals()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/mlserver/parallel/worker.py", line 81, in _ignore_signals
    loop = asyncio.get_event_loop()
  File "/opt/conda/envs/mlflow-env/lib/python3.9/site-packages/uvloop/__init__.py", line 206, in get_event_loop
    raise RuntimeError(
RuntimeError: There is no current event loop in thread 'MainThread'.
2025-10-28 01:50:10,395 [mlserver.parallel] WARNING - Worker with PID 1188 on default inference pool stopped unexpectedly with exit code 256. Triggering worker restart...
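
For context, the failing call is `asyncio.get_event_loop()` inside the freshly spawned worker process. Below is a minimal sketch of the apparent failure mode, assuming (as the traceback suggests) that uvloop 0.22 stopped implicitly creating a loop when none is set:

```python
import asyncio
import uvloop

# MLServer's parent process uses uvloop's event-loop policy, which forked
# worker processes inherit. (Sketch assumption: set the policy directly.)
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

# With uvloop < 0.22 this call implicitly created a loop when none existed.
# With uvloop >= 0.22 it raises:
#   RuntimeError: There is no current event loop in thread 'MainThread'.
loop = asyncio.get_event_loop()
```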

How to solve

I think there are two options to resolve this problem:

  • Constrain the uvloop version to <0.22.0 (e.g. pin uvloop<=0.21.0)
  • Fix MLServer to follow the new uvloop usage (see the sketch below)
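
A minimal sketch of what option 2 could look like, using a hypothetical helper (the actual fix in MLServer may differ):

```python
import asyncio

def _get_or_create_event_loop() -> asyncio.AbstractEventLoop:
    """Return the current event loop, creating one if this worker has none."""
    try:
        # Succeeds only when called from inside a running event loop
        return asyncio.get_running_loop()
    except RuntimeError:
        # Fresh worker process: no loop exists yet, so create and register one
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop
```

This avoids relying on `asyncio.get_event_loop()` to create a loop implicitly, which uvloop 0.22 (like newer CPython versions) no longer does.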

— Kimdongui, Oct 28 '25

Seems like it's due to https://github.com/MagicStack/uvloop/issues/702. I got around this by pinning uvloop to 0.21.0 in the model's virtual env (it was using 0.22.1). From my testing (using minikube), the version of uvloop used by the MLServer image did not seem to matter. FWIW, I am running MLServer with multi-model serving, with `environment_tarball` specified for the model.

— YuloLeake, Oct 29 '25

@YuloLeake Thank you for sharing your experience. To make sure I understood correctly, could you confirm the environment details below?

  • You have several models, each with its own tarball environment file.
  • Every tarball has a `uvloop<=0.21.0` dependency.

I used this serving runtime to run my MLflow model (saved exactly with the `mlflow.<some_framework>.log_model` function). I then hit this problem after installing the dependencies from the tarball without any specific uvloop constraint.

> The version of uvloop used by the MLServer image did not seem to matter from my testing

I'm curious how uvloop==0.22.1 did not affect your runtime. If I have misunderstood anything, please comment. Thank you.

— Kimdongui, Nov 11 '25

@Kimdongui Looks like I was mistaken. After rerunning mlserver locally, it does appear that having uvloop>=0.22.0 in the main process venv is what breaks it (regardless of what the model's venv has it set to); that makes sense, since the parallel workers are forked from the main MLServer process. Pinning uvloop to 0.21.0 for the main process venv fixes the issue (as expected).

— YuloLeake, Nov 19 '25

For reference, this is related to [#2285]. This will be fixed in MLServer 1.7.2; until the patch release is out, we also recommend pinning uvloop to 0.21.0.

— lc525, Nov 21 '25