vllm
Custom base path on FastAPI server
Hello everyone,
We have a situation here in which we are deploying our LLMs behind an ALB on AWS using a path prefix; in this case the model is deployed on a path that looks like `my.domain.com/my-model`. By default, ALBs on AWS don't support path rewriting, so the request is forwarded to the API server with the `/my-model` prefix still on it, breaking the API because that URL doesn't exist there: the API only listens on the `/generate` path. I implemented a custom API server here, but I think we could add an optional parameter (or env var) to set the base path of the server (keeping the default behavior if it is not set). I can open a PR doing this, but I wanted to confirm that this makes sense for other people before I actually do it.
Any update here?
FastAPI supports this through the `--root-path` parameter (passed to uvicorn), but it'd have to be exposed through vLLM's API server.
https://fastapi.tiangolo.com/advanced/behind-a-proxy/
Ran into the same issue. Here's how I run a similar FastAPI service behind an ALB:
```yaml
command: [ "uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000", "--root-path", "/my-model" ]
```
This parameter would need to be propagated into the FastAPI server.
I'm considering just biting the bullet and standing up a second load balancer just to serve vLLM, because AWS sucks that much.
@hughesadam87 I've been using Kubernetes, so I just changed my route to be `my-model.mydomain.com` instead of `api.mydomain.com/my-model`. No need to create another ALB, but it is a pain in the ass because I didn't want a full domain for each model. Another option is to do what I did: create a custom Python server that is identical to the one used by vLLM and change the routes to have your prefix; that worked too.
Ideally we could just have an env var on vLLM that adds the prefix to the routes when set.