pytest-django icon indicating copy to clipboard operation
pytest-django copied to clipboard

Brutal connection cleanup in teardown of django_db_setup

Open peterlauri opened this issue 7 years ago • 13 comments

We are having a few projects that suffer from hanging connections to the DB, can be from external processes or threads that holds an connection to the database, and causes tests to fail from time to time.

Below is the fixture we have in all our conftests.py to make the cleanup more brutal, to assure that there are no hanging connections around that can cause the tests to fail...

Are we alone having this problems? It is not a big deal having this specific extension of django_db_setup, but if there are plenty of people doing similar hacks, it should be considered :)

@pytest.yield_fixture(scope='session')
def django_db_setup(django_db_setup, django_db_blocker):
    """
    Fixture that will clean up remaining connections, that might be hanging
    from threads or external processes. Extending pytest_django.django_db_setup
    """

    yield

    with django_db_blocker.unblock():
        from django.db import connections

        conn = connections['default']
        cursor = conn.cursor()
        cursor.execute("""SELECT * FROM pg_stat_activity;""")
        print('current connections')
        for r in cursor.fetchall():
            print(r)

        terminate_sql = """
            SELECT pg_terminate_backend(pg_stat_activity.pid)
            FROM pg_stat_activity
            WHERE pg_stat_activity.datname = '%s'
                AND pid <> pg_backend_pid();
        """ % conn.settings_dict['NAME']
        print('Terminate SQL: ', terminate_sql)
        cursor.execute(terminate_sql)

peterlauri avatar Nov 21 '16 15:11 peterlauri

It's the first time for me to hear about this requirement.

blueyed avatar Nov 21 '16 17:11 blueyed

It works well as we have it now. I know more projects that have the same issue, will find some references for it. For example celery workers that have connection to the db. @ionelmc

peterlauri avatar Nov 21 '16 17:11 peterlauri

This is an example of exceptions we get if we don't brute force terminate the connections that might be active...

==================================== ERRORS ====================================
_______ ERROR at teardown of test_subscriber_subdomain_valid_prefix[a-b] _______
[gw0] linux2 -- Python 2.7.9 /home/travis/build/dolphinkiss/tailor-experience/.tox/pytest-travis-non-functional/bin/python2.7
def teardown_database():
        with _django_cursor_wrapper:
>           teardown_databases(db_cfg)
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/pytest_django/fixtures.py:58: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/django/test/runner.py:509: in teardown_databases
    connection.creation.destroy_test_db(old_name, self.verbosity, self.keepdb)
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/django/db/backends/base/creation.py:264: in destroy_test_db
    self._destroy_test_db(test_database_name, verbosity)
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/django/db/backends/base/creation.py:283: in _destroy_test_db
    % self.connection.ops.quote_name(test_database_name))
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/django/db/backends/utils.py:64: in execute
    return self.cursor.execute(sql, params)
.tox/pytest-travis-non-functional/lib/python2.7/site-packages/django/db/utils.py:95: in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <django.db.backends.utils.CursorWrapper object at 0x7f2e451cf090>
sql = 'DROP DATABASE "test_project_gw0"', params = None
    def execute(self, sql, params=None):
        self.db.validate_no_broken_transaction()
        with self.db.wrap_database_errors:
            if params is None:
>               return self.cursor.execute(sql)
E               OperationalError: database "test_project_gw0" is being accessed by other users
E               DETAIL:  There is 1 other session using the database.

peterlauri avatar Nov 22 '16 14:11 peterlauri

@peterlauri, thanks for leaving this reference here. I've been running into, and trying to diagnose this problem in my work codebase for the last couple of days.

The issue is intermittent. It fails with the same operational error you saw above in about 40-60% of the test runs. Other times it works just fine. From digging in and printing entries in pg_stat_activity for my test database, I can see that the offending sql query appears to be

SELECT "django_migrations"."app", "django_migrations"."name" FROM "django_migrations"

The failure persists, whether I use --no-migrations or not. However, if I use --reuse-db, then the error never happens. This is "great", I guess, in that it points to database setup as the culprit. I suspect that the connection the session-scoped django_db_setup fixture uses is different from the one used by the function-scoped db fixture and test code and is somehow not being released before the test run fails, but I'm not yet certain of that theory.

Have you made any progress with this?

(N.B. I am using Django 2.0.3 and Python 3)

cewing avatar Apr 13 '18 21:04 cewing

For information:

I have isolated the problem to an issue with transaction management. It's not clear at the moment whether the problem lies in Django or in pytest_django, so I'm digging further. Here's what I know now:

When the pytest_django db fixture executes for the first time after the completion of the django_db_setup fixture (in other words, in preparation for the first test), and the test_case._pre_setup() method is executed we end up entering an atomic block using this code:

# django.test.testcases.TestCase

    @classmethod
    def _enter_atomics(cls):
        """Open atomic blocks for multiple databases."""
        atomics = {}
        for db_name in cls._databases_names():
            atomics[db_name] = transaction.atomic(using=db_name)
            atomics[db_name].__enter__()
        return atomics

When transaction.Atomic.__enter__ is invoked it gets the connection to the database and checks the current transaction status:

    def __enter__(self):
        connection = get_connection(self.using)

        if not connection.in_atomic_block:
            # Reset state when entering an outermost atomic block.
            connection.commit_on_exit = True
            connection.needs_rollback = False
            if not connection.get_autocommit():

It is here that things go awry. During successful test passes, the connection at that point is in an atomic block, and so we do not enter that block of code. When test passes fail, the connection is not in an atomic block, and so we end up invoking connection.get_autocommit(). One of the effects of this is to call connection.ensure_connection(), which has the effect in this case of establishing a brand new connection to the database.

When we skip doing this, the connection to the database that is used for each test executed is the same connection as the one used for setting up the database. When we don't skip it, then a new connection is used for running tests, and the connection from database setup is left open

When test execution is completed, Django closes a database connection, but because there are two connections open, the other is left open. When DROP DATABASE is called from the _nodb_connection, this abandoned session blocks the operation from succeeding, and we get the OperationalError referenced here.

I'm not clear if this is a problem that can be cleanly solved. I could use a bit of help in trying to figure that out.

In any case, the open connection is no longer doing any useful work, so it is likely safe to use the hammer approach suggested in this issue.

cewing avatar Apr 14 '18 01:04 cewing

@blueyed, @pelme, tagging you both here to draw attention to this issue. Hoping to get some feedback if y'all have any ideas on clever solutions.

cewing avatar Apr 14 '18 01:04 cewing

Just for clarification: is this with 3.1.2? Have you tried master? (soon to be released as 3.2.0)

blueyed avatar Apr 14 '18 10:04 blueyed

@blueyed, yes it is 3.1.2. I haven’t tried master, but will on Monday. I’ll post the outcome here, and dig a bit further to root cause this if it persists.

cewing avatar Apr 14 '18 22:04 cewing

@blueyed: I can verify that the same problems occur on master as in 3.1.2. Working on a root cause.

cewing avatar Apr 16 '18 19:04 cewing

@blueyed

I've found the root cause of our problem. It's related to the fact that the settings we use for our tests and our local development include two configured databases, one is our primary and the other a replica. Looking at the documentation for pytest-django, it's clear that multiple databases are not supported, so I don't believe that the right fix for this issue lies here. But I'm going to write out what happened anyway, in case it helps someone else in the future.

When the django_db_setup fixture calls setup_databases to bootstrap the test database, it invokes Django's own create_test_db, which uses django.test.utils.get_unique_databases_and_mirrors to figure out which databases need to be created. If there are primary/replica database entries in DATABASES, then the connection used to initially create the database and run migrations may be the primary or it may be the replica.

In our case, this was not initially detected because the configuration for each in our test settings are identical. But the connection to each is a unique object. So if creating the database picked the replica, then the final action performed on that connection left the connection open, and prevented the database from being torn down at the end.

The appropriate solution to our problem is in fact documented here. By adding information to the database configuration indicating that the replica is to be treated as a mirror of the primary during tests, the problem disappears.

Thanks for the patience.

cewing avatar Apr 16 '18 22:04 cewing

I am running into the same issue with only using a single database ("default") and am using this fix:

    def teardown_database():
        with django_db_blocker.unblock():
            # close unused connections
            for connection, old_name, destroy in db_cfg:
                if not destroy:
                    connection.close()
            # delete test databases
            for connection, old_name, destroy in db_cfg:
                if not destroy:
                    continue
                connection.creation.destroy_test_db(
                    old_name, request.config.option.verbose, False
                )
                connection.close()

The problem was that the connections were not being closed properly in the main function.

Full context:

@pytest.fixture(scope="session")
def django_db_setup(request, django_test_environment, django_db_blocker):
    """Top level fixture to ensure test databases are available"""
    from pytest_django.compat import setup_databases

    with django_db_blocker.unblock():
        db_cfg = setup_databases(
            verbosity=request.config.option.verbose,
            interactive=False,
        )

    def teardown_database():
        with django_db_blocker.unblock():
            # close unused connections
            for connection, old_name, destroy in db_cfg:
                if not destroy:
                    connection.close()
            # delete test databases
            for connection, old_name, destroy in db_cfg:
                if not destroy:
                    continue
                connection.creation.destroy_test_db(
                    old_name, request.config.option.verbose, False
                )
                connection.close()

    request.addfinalizer(teardown_database)

snowman2 avatar Jan 02 '19 17:01 snowman2

@snowman2, do you have an updated fixture for pytest-django 4.5.2? I tried your fixture, but this line causes an error since there's no compat module in the latest version: from pytest_django.compat import setup_databases I tried from django.test.utils import setup_databases instead but receive the 'database is being accessed by other users' error when it tries to drop the database: DETAIL: There is 1 other session using the database.

NathanSmeltzer avatar Apr 06 '23 11:04 NathanSmeltzer

That is a few years old. I don't think that I am using that anymore, but I don't remember what project I was using that for to verify. Current projects don't use that block of code.

snowman2 avatar Apr 06 '23 12:04 snowman2