djongo icon indicating copy to clipboard operation
djongo copied to clipboard

Djongo keeps closing and creating new MongoClient DB connections with each request

Open martin-marko opened this issue 4 years ago • 5 comments

I'm using djongo to connect my Django REST framework API with my MongoDB cluster, and when logging the requests at DEBUG level, I see that djongo starts every request by closing the existing MongoClient connection, creating a new connection, doing the query, then closing the connection again.

2021-09-18 13:41:34,714 - DEBUG - djongo.base - Existing MongoClient connection closed
2021-09-18 13:41:34,715 - DEBUG - djongo.base - New Database connection
2021-09-18 13:41:34,716 - DEBUG - djongo.sql2mongo.query - sql_command: ...
2021-09-18 13:41:35,340 - DEBUG - djongo.sql2mongo.query - Aggregation query: ...
2021-09-18 13:41:35,343 - DEBUG - djongo.sql2mongo.query - Result: ...
2021-09-18 13:41:35,454 - DEBUG - djongo.base - MongoClient connection closed

Why does it close the connection with every request and how can I stop this behavior? It should initiate the DB connection when the web server starts and keep using that connection for all the requests, instead of creating hundreds of connections each second.

Relevant codes from djongo/base.py:

if self.client_connection is not None:
    self.client_connection.close()
    logger.debug('Existing MongoClient connection closed')

self.client_connection = Database.connect(db=name, **connection_params)
logger.debug('New Database connection')
def _close(self):
    """
    Closes the client connection to the database.
    """
    if self.connection:
        with self.wrap_database_errors:
            self.connection.client.close()
            logger.debug('MongoClient connection closed')

For the installation and configuration, I've followed the djongo GitHub instructions, so it goes like this:

DATABASES = {
    "default": {
        "ENGINE": "djongo",
        "NAME": "django",
        "CLIENT": {
            "host": f"mongodb+srv://{MONGODB_PATH}",
        }
    }
}

(MONGODB_PATH = f"{MONGODB_DJANGO_USER}:{MONGODB_DJANGO_PASS}@{MONGODB_CLUSTER}" from environment variables)

The API endpoint I'm testing with goes like this:

@api_view(["GET"])
@authentication_classes([TokenAuthentication])
@permission_classes([IsAuthenticated])
def test(request):
    if request.method == "GET":
        data = {"user": str(request.user)}
        return JsonResponse(data)

Versions:

Django==3.2.7
djangorestframework==3.12.4
djongo==1.3.6
pymongo==3.12.0 (MongoDB 4.4.8 Enterprise)

martin-marko avatar Sep 18 '21 12:09 martin-marko

# To prevent leaving unclosed connections behind,
# client_conn must be closed before a new connection
# is created.
if self.client_connection is not None:
    self.client_connection.close()
    logger.debug('Existing MongoClient connection closed')

As you mentioned, this is the culprit. This seems to be the intended behavior as it is commented, however I don't agree here.

The pymongo client maintains its own pool of connections, which are intended to be reused after initialization because the process of establishing connection is expensive, especially when the database is on a different server.

The implementation here is creating a connection, querying the connection, then closing it, on every single request, which defeats the whole purpose of the built in connection pool in pymongo client. Not to mention it degrades performance significantly. If you do a google search of how to handle database connections on webservers, basically every answer has some mention of keeping connections alive and sharing them between requests, which again is the exact opposite of what's going on here.

I'm pretty confident this is not optimal behaviour, but I'm not super familiar with the codebase so maybe there is a reason for this implementation. The comment makes me think otherwise: it says we don't want to leave unclosed connections behind, but that's exactly what you should be doing...

In my case, I am using mongodb as an auth backend, so every request I make that requires authentication is going through this process, which degrades my app performance noticeably. If this isn't fixed I'll have to move away from it.

This seems like a relatively easy fix. Removing this chunk of code that closes the connection, and setting client MaxPoolSize=1 since there are no threads. Happy to open PR if needed.

Adrian658 avatar Oct 24 '21 05:10 Adrian658

I agree, this makes no sense at all. I still haven't found a solution for it, and it still makes every single request take 10x longer than it should. Please let me know if you have a working edit I could do some tests with!

martin-marko avatar Oct 24 '21 06:10 martin-marko

I ran into the same issue. After inspecting the standard backends in Django and how they are implemented I also think the following check is not needed

# To prevent leaving unclosed connections behind,
# client_conn must be closed before a new connection
# is created.
if self.client_connection is not None:
    self.client_connection.close()
    logger.debug('Existing MongoClient connection closed')

I did remove it and so far things are working great. Happy to submit a PR.

svepe avatar May 04 '22 14:05 svepe

I've done the changes that @Adrian658 and @svepe have suggested and this definitely decreased the amount of connect/disconnect I am observing on the MongoDB container. Setting CONN_MAX_AGE: None did not have any impact. This change though solved one of my issues and actually as suggested in #568 made it much snappier!

@nesdis Is there a way to incorporate this either through a flag or directly?

23pointsNorth avatar May 06 '22 07:05 23pointsNorth

Did this ever gain traction, get a PR? Seeing way too much time in our socket calls relative to the actual db operations, I think this would help

zcavazos avatar Mar 18 '24 21:03 zcavazos