Refactor LLMProxy to run Uvicorn in isolated process
Summary
This PR refactors the LLMProxy.start() logic to launch the Uvicorn proxy server in a fully isolated process created with multiprocessing's spawn start method.
The previous implementation ran the server in a background thread, which caused persistent connection and transport errors when Ray forked workers or when LiteLLM reused existing event loops.
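A minimal sketch of the new startup path, assuming a simplified class shape; the app import string, port, and helper names are illustrative, not the exact implementation:

```python
import multiprocessing

import uvicorn


def _run_proxy(host: str, port: int) -> None:
    # Runs inside the spawned child: fresh interpreter, fresh event loop,
    # no sockets, sessions, or Ray state inherited from the parent.
    # "llm_proxy.app:app" is a placeholder import string.
    uvicorn.run("llm_proxy.app:app", host=host, port=port, log_level="info")


class LLMProxy:
    def __init__(self, host: str = "127.0.0.1", port: int = 4000) -> None:
        self.host = host
        self.port = port
        self._proc = None

    def start(self) -> None:
        # The spawn start method guarantees the child does not inherit the
        # parent's event loop or open file descriptors (unlike fork).
        ctx = multiprocessing.get_context("spawn")
        self._proc = ctx.Process(target=_run_proxy, args=(self.host, self.port), daemon=True)
        self._proc.start()

    def stop(self) -> None:
        if self._proc is not None and self._proc.is_alive():
            self._proc.terminate()
            self._proc.join()
```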
Motivation
In the previous design, the proxy shared its asyncio event loop and open network sockets with the parent process. When Ray forked new workers or existing aiohttp.ClientSession objects were reused, the same TCP socket descriptors were inherited in an invalid state. This led to repeated I/O failures such as:
```
aiohttp.client_exceptions.ClientConnectionResetError: Cannot write to closing transport
openai.APIConnectionError: Connection error.
litellm.llms.openai.common_utils.OpenAIError: Connection error.
```
These errors occurred because aiohttp tried to write to a "closing transport": a socket that had already been closed in the parent process or invalidated by fork().
Changes
- ✅ Run the LLMProxy Uvicorn server in a dedicated subprocess (spawn start method) so it gets a clean event loop and its own set of sockets.
- ✅ Replace the thread-based readiness logic with a socket-level health check (see the sketch below).
- ✅ Cleanly isolate OpenTelemetry and aiohttp sessions between the parent and child processes.
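A minimal sketch of the socket-level readiness probe; the function name and timeout values are illustrative:

```python
import socket
import time


def wait_until_ready(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True  # the proxy is accepting connections
        except OSError:
            time.sleep(0.2)  # not listening yet; retry shortly
    return False
```

Unlike the old thread-based check, this probe only observes the child process from the outside over TCP, so the parent never touches the server's event loop or sockets.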