
No objects ever released by the GC, potential memory leak?

Open akullpp opened this issue 2 years ago • 19 comments

First Check

  • [X] I added a very descriptive title to this issue.
  • [X] I used the GitHub search to find a similar issue and didn't find it.
  • [X] I searched the FastAPI documentation, with the integrated search.
  • [X] I already searched in Google "How to X in FastAPI" and didn't find any information.
  • [X] I already read and followed all the tutorial in the docs and didn't find an answer.
  • [X] I already checked if it is not related to FastAPI but to Pydantic.
  • [X] I already checked if it is not related to FastAPI but to Swagger UI.
  • [X] I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

  • [X] I commit to help with one of those options 👆

Example Code

# From the official documentation
# Run with uvicorn main:app --reload
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}

Description

Use the minimal example provided in the documentation and call the API 1M times. You will see that memory usage keeps climbing and never goes down; the GC can't free any objects. It becomes very noticeable once you have a real use case, like a file upload, that DoS'es your service.
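
For reference, a rough sketch of the kind of load I mean (this is illustrative only, not the exact tooling I used; it assumes the requests package is installed and the app above is running on 127.0.0.1:8000):

# Illustrative load script: hammer the endpoint and watch the server
# process's memory with e.g. `top` while it runs.
import requests

for i in range(1_000_000):
    requests.get("http://127.0.0.1:8000/")
    if i % 10_000 == 0:
        print(f"{i} requests sent")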

[memory profile chart]

Here are some examples from a real service in k8s, via Lens metrics:

[screenshots of memory metrics from Lens, taken 2022-03-03 and 2022-03-04]

Operating System

Linux, macOS

Operating System Details

No response

FastAPI Version

0.74.1

Python Version

Python 3.10.1

Additional Context

No response

akullpp avatar Mar 04 '22 15:03 akullpp

Are you using --reload as part of your entrypoint, like the comment at the top of the code block indicates? What limits, if any, are you running with in Kube? Those graphs don't look like leaks to me; they look like constant memory usage. My interpretation of what might be happening is that some objects are loaded into memory when the application instance initializes and are never released, likely because they're being used by the application itself. These could be connections of various sorts, some data that is getting served, etc.; it's hard to say without seeing the actual production service. Memory leaks usually look like a fairly constant increase in memory usage until a threshold is breached and then the service is OOM'd.

n8sty avatar Mar 04 '22 16:03 n8sty

What uvicorn version are you using?

Do you have a health check that sends a TCP ping?

If answers above are: "not the latest" and "yes", then bump uvicorn to the latest one.

Kludex avatar Mar 06 '22 11:03 Kludex

Is the application running in a Docker container? Inside a container, Python sees the memory and CPUs of the host, not the resource limits of the container, which may cause the GC not to actually run.

Similar problems have occurred in my application before. I solved them with reference to this issue: https://github.com/tiangolo/fastapi/issues/596#issuecomment-635184641
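
As an illustration only (a sketch, not necessarily what the linked comment suggests), one way to check whether the GC simply isn't running is to force a collection explicitly after each request and see if memory stays down:

# Sketch only: force a full garbage collection after every request.
# This trades CPU time for lower resident memory and is just a diagnostic aid.
import gc

from fastapi import FastAPI, Request

app = FastAPI()


@app.middleware("http")
async def force_gc(request: Request, call_next):
    response = await call_next(request)
    gc.collect()  # explicit collection, regardless of what the container reports
    return response


@app.get("/")
async def root():
    return {"message": "Hello World"}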

jagerzhang avatar Mar 11 '22 05:03 jagerzhang

I have solved this issue with the following settings:

  • python=3.8.9
  • fastapi=0.63.0
  • uvicorn=0.17.6
  • uvloop=0.16.0

yusufcakmakk avatar Mar 16 '22 17:03 yusufcakmakk

I have no such problem on Windows x64:

  • python ~= 3.9.0
  • fastapi ~= 0.75.0
  • uvicorn ~= 0.17.0

yinziyan1206 avatar Mar 17 '22 02:03 yinziyan1206

Running in Docker on python:3.8-slim-buster, so currently using Python 3.8.13.

I didn't have memory leak issues with FastAPI 0.65.2 and uvicorn 0.14.0 in my project before. Updating to FastAPI 0.75.0 and uvicorn 0.17.6 then caused my container to keep running into memory problems. FastAPI 0.65.2 and uvicorn 0.17.6 together do not appear to have a memory leak for me.

I then did a binary search of different fastapi versions (using uvicorn 0.17.6) to see where the memory leaks first appear. For me, that is version 0.69.0.

darkclouder avatar Mar 23 '22 08:03 darkclouder

0.69.0 was the introduction of AnyIO on FastAPI.

Release notes: https://fastapi.tiangolo.com/release-notes/#0690

Kludex avatar Mar 23 '22 09:03 Kludex

I tested using uvicorn 0.17.6 and both FastAPI 0.68.2 and 0.75.0. On 0.68.2, memory usage settled on 358 MB after 1M requests, and on 0.75.0, it was 359 MB. Is there something surprising about these results?
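
For anyone who wants to repeat the measurement, one way to read the server's resident memory is something like the sketch below (illustrative only, assuming psutil; not necessarily the tooling I used):

# Print the resident set size of a given PID in megabytes.
# Pass the uvicorn worker's PID as the first argument.
import sys

import psutil


def rss_mb(pid: int) -> float:
    return psutil.Process(pid).memory_info().rss / (1024 * 1024)


if __name__ == "__main__":
    print(f"{rss_mb(int(sys.argv[1])):.1f} MB")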

agronholm avatar Mar 23 '22 10:03 agronholm

I can't say exactly: my container is limited to 512 MiB and the base consumption of my app was already ~220 MiB, so an additional 350 MiB that then settles would be well within what I can observe. It's just that prior to 0.69.0 I don't see any sharp memory increase at all:

[chart: memory usage across different FastAPI versions]

darkclouder avatar Mar 23 '22 10:03 darkclouder

Can anybody else reproduce these results?

agronholm avatar Mar 23 '22 10:03 agronholm

How do we go about this? The issue is marked as a question, but the memory leak is certainly a problem that blocks me from updating FastAPI. Should I open a new ticket as a "problem"?

darkclouder avatar May 24 '22 08:05 darkclouder

To start with... people need to reply to @agronholm's question.

Kludex avatar May 24 '22 08:05 Kludex

I definitely have this same memory behaviour in some of my more complex services, i.e. memory utilization just keeps climbing and seemingly nothing is ever released, but I haven't been able to reduce it to a simple service that displays the same memory behaviour.

Atheuz avatar May 25 '22 18:05 Atheuz

Not sure if it's directly related, but I detected a leak when saving objects to the request state. The following code will retain the large list in memory even after the request has been handled:

from fastapi import FastAPI, Request

app = FastAPI()


@app.get("/")
async def hello(request: Request):
    request.state.test = [x for x in range(999999)]
    return {"Hello": "World"}

A working workaround is to set the request.state.test variable to None when done (see the sketch below). For the sake of the leak, it doesn't matter whether the endpoint is async or not.
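
The workaround, sketched (same endpoint as above, just clearing the attribute before returning):

from fastapi import FastAPI, Request

app = FastAPI()


@app.get("/")
async def hello(request: Request):
    request.state.test = [x for x in range(999999)]
    result = {"Hello": "World"}
    request.state.test = None  # release the large list so it can be collected
    return result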

The complete example with a test script can be found here: https://github.com/galkahana/fastapi-state-leak

I'll note in addition that I tried to run this code with older versions of FastAPI and got the same results (even going as far back as 0.65.2, as was suggested in an earlier comment). Hence... not sure it's directly related.

galkahana avatar Jul 07 '22 07:07 galkahana

In the case where I'm seeing it, I'm attaching a Kafka producer to the request.app variable (i.e. request.app.kafka_producer) and then using that in endpoints. If request.state is causing this issue, then I expect that what I'm doing is also causing the issue on my end.

My question then is: how do I create a Kafka producer on startup that's accessible to endpoints without causing this leak? I want to avoid creating a new producer on every single request, because that is really inefficient, as starting a Kafka producer takes some time.

Atheuz avatar Jul 08 '22 20:07 Atheuz

@Atheuz I wouldn't be so sure that your use case creates a memory leak. In the case I'm showing, a new object is created for every request, which is what grows the memory usage with each request. As long as you stick to the same object you should be fine; roughly like the sketch below.
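
A rough sketch of what I mean: create the producer once at startup, stash it on app.state, and reuse that single instance in every endpoint. The producer class and its methods here are hypothetical stand-ins, not any specific Kafka library's API.

from fastapi import FastAPI, Request

app = FastAPI()


class DummyProducer:
    """Hypothetical stand-in for whatever Kafka client is actually used."""

    def send(self, topic: str, payload: bytes) -> None:
        print(f"would send {len(payload)} bytes to {topic}")

    def close(self) -> None:
        print("producer closed")


@app.on_event("startup")
async def create_producer():
    # one shared, long-lived instance for the whole process
    app.state.kafka_producer = DummyProducer()


@app.on_event("shutdown")
async def close_producer():
    app.state.kafka_producer.close()


@app.get("/publish")
async def publish(request: Request):
    producer = request.app.state.kafka_producer  # same object on every request
    producer.send("some-topic", b"payload")
    return {"status": "queued"}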

galkahana avatar Jul 09 '22 05:07 galkahana

Well, that is indeed strange behaviour. I found that it is not just FastAPI; this actually manifests itself in Starlette directly. I rewrote your main.py to test this:

from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route
from starlette.requests import Request


async def homepage(request: Request):
    request.state.test = [x for x in range(999999)]
    return JSONResponse({'hello': 'world'})


app = Starlette(routes=[
    Route('/', homepage),
])

And this goes crazy on memory as well. When the big object is assigned to an ordinary local variable instead, memory usage remains normal (see the sketch below). I would recommend raising this in the Starlette repo; fundamentally, the fix must be implemented in that code base anyway.
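
For comparison, a sketch of the non-leaking variant (same app, but the big list is bound to a plain local variable instead of request.state):

from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route


async def homepage(request: Request):
    test = [x for x in range(999999)]  # local variable; freed when the handler returns
    return JSONResponse({'hello': 'world', 'size': len(test)})


app = Starlette(routes=[
    Route('/', homepage),
])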

JarroVGIT avatar Jul 09 '22 17:07 JarroVGIT

Cheers @JarroVGIT. Started a discussion there. Used your example, hope you don't mind.

galkahana avatar Jul 10 '22 06:07 galkahana

The memory leak in uvicorn is probably not the cause of my issue, though. First of all, my issue only happens with FastAPI >=0.69.0, and I also have apps where it happens even though I don't use app.state or request.state at all. I think I will put some more effort into isolating a minimal running version with that memory leak for my case. I'll get back to you if I manage to do that.

darkclouder avatar Jul 29 '22 05:07 darkclouder

Can anybody else reproduce these results?

@agronholm @Kludex Memory leaks in the case below. Something weird: if I change range(3) to range(2) in the server code, memory stops leaking.

Env Setting

  • ubuntu 16.04
  • python 3.7.13
  • fastapi 0.75.2
  • uvicorn 0.17.6

Server

from fastapi import FastAPI, APIRouter

from starlette.middleware.base import (
    BaseHTTPMiddleware,
    RequestResponseEndpoint,
)
from starlette.requests import Request


class Middleware(BaseHTTPMiddleware):

    async def dispatch(self, req: Request, call_next: RequestResponseEndpoint):
        return await call_next(req)


router = APIRouter()


@router.get("/_ping")
async def ping():
    return "pong"


app = FastAPI()
app.include_router(router)
for _ in range(3):
    app.add_middleware(Middleware)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=14000)

Client

import requests
import multiprocessing as mp

def do():
    while 1:
        rsp = requests.get("http://127.0.0.1:14000/_ping")
        print(rsp.status_code)

for _ in range(20):
    p = mp.Process(target=do, daemon=True)
    p.start()

import time
time.sleep(1000000)

zhm9484 avatar Sep 01 '22 11:09 zhm9484

Would you mind bumping uvicorn and fastapi to the latest version, and confirm the leak still exists there? If yes, I'll take a look.

Kludex avatar Sep 01 '22 11:09 Kludex

Would you mind bumping uvicorn and fastapi to the latest version, and confirm the leak still exists there? If yes, I'll take a look.

@Kludex Thanks for the reply! I just tried it with fastapi 0.81.0 and uvicorn 0.18.3, and the leak still exists.

zhm9484 avatar Sep 01 '22 11:09 zhm9484

I cannot reproduce the leak. Can you share your results and tooling?

Kludex avatar Sep 02 '22 06:09 Kludex

It would also help to test if you can reproduce the problem on Starlette alone.

agronholm avatar Sep 02 '22 09:09 agronholm

I cannot reproduce the leak. Can you share your results and tooling?

This is the Dockerfile that can reproduce the leak.

FROM continuumio/anaconda3:2019.07

SHELL ["/bin/bash", "--login", "-c"]

RUN apt update && \
    apt install -y procps \
                   vim

RUN pip install fastapi==0.81.0 \
                uvicorn==0.18.3

WORKDIR /home/root/leak

COPY client.py client.py
COPY server.py server.py

Run the following commands, and the python server.py process should be leaking.

docker build -t leak-debug:latest -f Dockerfile .
docker run -it leak-debug:latest bash

# in container
nohup python server.py &
nohup python client.py &
top

The memory goes to 1 GB in about 3 minutes.

[screenshot: top output showing the server process's memory usage]

zhm9484 avatar Sep 03 '22 02:09 zhm9484

It would also help to test if you can reproduce the problem on Starlette alone.

@agronholm Thanks. The code below also leaks.

# server.py
from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.routing import Route
from starlette.middleware.base import (
    BaseHTTPMiddleware,
    RequestResponseEndpoint,
)
from starlette.requests import Request
from starlette.responses import PlainTextResponse


class TestMiddleware(BaseHTTPMiddleware):

    async def dispatch(self, req: Request, call_next: RequestResponseEndpoint):
        return await call_next(req)


async def ping(request):
    return PlainTextResponse("pong")


app = Starlette(
    routes=[Route("/_ping", endpoint=ping)],
    middleware=[Middleware(TestMiddleware)] * 3,
)


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=14000)

zhm9484 avatar Sep 03 '22 02:09 zhm9484

That Dockerfile won't build for me:

 => ERROR [2/6] RUN apt-get update &&     apt-get install -y procps                    vim                         1.4s
------                                                                                                                  
 > [2/6] RUN apt-get update &&     apt-get install -y procps                    vim:                                    
#6 0.429 Get:1 http://deb.debian.org/debian buster InRelease [122 kB]                                                   
#6 0.441 Get:2 http://deb.debian.org/debian buster-updates InRelease [56.6 kB]                                          
#6 0.507 Get:3 http://security.debian.org/debian-security buster/updates InRelease [34.8 kB]                            
#6 0.847 Reading package lists...                                                                                       
#6 1.369 E: Repository 'http://deb.debian.org/debian buster InRelease' changed its 'Suite' value from 'stable' to 'oldstable'
#6 1.369 E: Repository 'http://deb.debian.org/debian buster-updates InRelease' changed its 'Suite' value from 'stable-updates' to 'oldstable-updates'
#6 1.369 E: Repository 'http://security.debian.org/debian-security buster/updates InRelease' changed its 'Suite' value from 'stable' to 'oldstable'
------
executor failed running [/bin/bash --login -c apt-get update &&     apt-get install -y procps                    vim]: exit code: 100

I tried using the official python:3.10 image and could not reproduce the leak with that.

agronholm avatar Sep 03 '22 08:09 agronholm

@agronholm Please try this Dockerfile. Seems like it is related to the Python version.

FROM python:3.7.12

RUN pip install fastapi==0.81.0 \
                uvicorn==0.18.3 \
                requests

WORKDIR /home/root/leak

COPY client.py client.py
COPY server.py server.py

zhm9484 avatar Sep 03 '22 12:09 zhm9484

I can reproduce it on Python 3.7.13, but it's not reproducible from 3.8+.

Notes:

  • This issue can be reproduced with pure Starlette - this means this issue can be closed here.
  • This issue cannot be reproduced on Python 3.8+.
  • This issue cannot be reproduced with Starlette 0.14.2 (version pre-anyio) in Python 3.7.
  • This issue can only be reproduced by Starlette 0.15.0+ on Python 3.7.

I'll not spend more time on this issue. My recommendation is to bump your Python version.

In any case, this issue doesn't belong to FastAPI.

Kludex avatar Sep 03 '22 12:09 Kludex

@Kludex Thanks. The leak happens even on Python 3.8 if I change the number of middlewares from 3 to 5. I have opened a discussion here: https://github.com/encode/starlette/discussions/1843

zhm9484 avatar Sep 04 '22 10:09 zhm9484