pytest-django When running pytest on a database with --reuse-db where python manage.py migrate was already run, cause failure on old runpython

trafficstars

I explain myself,

I have a project with a lot of migrations, with 3 important migrations for our case (in that order):

a migration that add the field foo to the model Bar
a migration that has a RunPython, doing some computation based on the field foo
a migration that remove the field foo

now if i first do

python manage.py migrate

I finish with a database without field foo => normal and expected

if I then run pytest --reuse-db I will got SQL failure because pytest re-runs all migrations, but without actually replaying the changes on the database , but only running the RunPython, so the RunPython of 2) is run on database were the field has already been removed

is it a known behaviour , if so why ?

I would have thought that it would see migrations has been run , and will not execute anything, runpython included

Aug 19 '19 21:08 allan-simon

note that it does not seem to be deterministic (I've removed --reuse-db and it passed once in CI, and now it does not work anymore, and our CI are on aws spot instances, which mean they are recycled pretty often, so we don't have any dangling state between two CI run)

Aug 20 '19 06:08 allan-simon

if I then run pytest --reuse-db I will got SQL failure because pytest re-runs all migrations

Use --no-migrations then with --reuse-db (but I would assume that's the case already)?

Use -vvv to see when migrations are run.

I also suggest trying to debug this locally, where you can better add breakpoints etc.. ;) (but for CI you could also add printing of course - hopefully -vvv gives more insight though) (https://github.com/pytest-dev/pytest-django/pull/725 adds a django_debug_sql ini option that might be useful - I've just rebased it on master)

Aug 20 '19 06:08 blueyed

I will try -vvv, for --no-migrations, no I didn't run it because I was not aware of the options (I don't know how i missed it ^^' )

Aug 20 '19 06:08 allan-simon

I would expect --reuse-db to not cause any migrations to be run. If that's the case (maybe even only for the RunPython ones) it looks like a bug / something to improve I would assume.

What Django version are you using, and what kind of DB?

Aug 20 '19 07:08 blueyed

postgresql 10 and django 1.11 , python3.6

Aug 20 '19 07:08 allan-simon

Also note that it will be run for the first usage of --reuse-db (since there is none to reuse).

This is the current behavior:

diff --git i/tests/test_db_setup.py w/tests/test_db_setup.py
index 0b3d516..ba96b14 100644
--- i/tests/test_db_setup.py
+++ w/tests/test_db_setup.py
@@ -406,3 +406,17 @@ class Migration(migrations.Migration):
         )
         assert result.ret == 0
         result.stdout.fnmatch_lines(["*mark_migrations_run*"])
+
+        # Runs migration with first "--reuse-db" usage.
+        result = testdir.runpytest_subprocess(
+            "--reuse-db", "--tb=short", "-v", "-s"
+        )
+        assert result.ret == 0
+        assert "mark_migrations_run" in result.stdout.str()
+
+        # Does not run them on second "--reuse-db" usage.
+        result = testdir.runpytest_subprocess(
+            "--reuse-db", "--tb=short", "-v", "-s"
+        )
+        assert result.ret == 0
+        assert "mark_migrations_run" not in result.stdout.str()

(with this test: https://github.com/blueyed/pytest-django/blob/37b9216655baa21ae4ccf0b28069c07eb4473cb3/tests/test_db_setup.py#L356-L408)

Aug 20 '19 07:08 blueyed

ok right now i'm focusing on the "without --reuse-db" as here the behaviour should be always consistent (i.e DROP database, and recreate everything) , and even here I'm seeing the bug i'm describing

what is weird is that, the RunPython seems to be run in the correct order of their migrations, and then pytest starts to run the actual tests, but then all the following tests are reporting "ERROR" with the same stacktrace that is in the migration as if the migration was run N time , but the print I added in RunPython code directly only appears once at migration time not at test time

Aug 20 '19 08:08 allan-simon

currently (still without --reuse-db) I seem to be able to trigger the bug if

i clean everything (I use a docker for that, to be sure to start from scratch)
i run the tests once (without --reuse-db) , => I control+c the test , but at least the first tests where executing correctly
I re-run the tests, now i see something weird , the database seems correctly drop/recreated according to this output :

acts/tests.py::DuplicateTest::test_duplicate_products Creating test database for alias 'read-only' ('test_vagrant')...
Got an error creating the test database: database "test_vagrant" already exists

Destroying old test database for alias 'read-only' ('test_vagrant')...

however at the migration step, I see far less migration being applied than I actually have (they are still run in order though) then migrations steps stop and the tests start to run, and the first tests are in Errror, and actually they contains the stack trace of the last migration that run.

i.e it tries to run migrations, one of these migrations fails with an exception, (due to the diff postgres schema / django model at the point in time) it goes directly to running the test and only there it shows the exceptions (and duplicated)

what is even weirder is that some tests still succeed (but a very low minority)

Aug 20 '19 09:08 allan-simon

I start to wonder if it's due to the fact I have two database, "default" and "read-only" (in test they both actually correspond to the same database) and I wonder if there's some kind of race condition that RunPython are not run on "read-only" on which the drop database is done (but pytest-django do the magic test_%s ) but run on "default" that is still in database "vagrant" that has not been dropped ?

Aug 20 '19 09:08 allan-simon

It seems to be that during the migrations sometimes "default" is chosen, sometimes "readonly" is chosen

if I add in lib/python3.6/site-packages/django/db/models/query.py (line 156)

  class QuerySet(object):                                                                                                                                                                                                                    
      """                                                                                                                                                                                                                                    
      Represents a lazy database lookup for a set of objects.                                                                                                                                                                                
      """                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                             
      def __init__(self, model=None, query=None, using=None, hints=None):                                                                                                                                                                    
          self.model = model                                                                                                                                                                                                                 
          self._db = using                                                                                                                                                                                                                   
          print(using)

I got for the following migration :

def copy_from_userworkflow_to_invoice(apps, schema_editor):                                                                                                                                                                                  
    db_alias = schema_editor.connection.alias                                                                                                                                                                                                
    UserWorkflow = apps.get_model('useracts', 'UserWorkflow')                                                                                                                                                                                
    for uw in UserWorkflow.objects.using(db_alias).filter(invoice__isnull=False):                                                                                                                                                            
        invoice = uw.invoice                                                                                                                                                                                                                 
        invoice.userworkflow = uw                                                                                                                                                                                                            
        invoice.save()                                                                                                                                                                                                                       
                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                             
def copy_from_invoice_to_userworkflow(apps, schema_editor):                                                                                                                                                                                  
    db_alias = schema_editor.connection.alias                                                                                                                                                                                                
    Invoice = apps.get_model('billing', 'Invoice')                                                                                                                                                                                           
    for invoice in Invoice.objects.using(db_alias).filter(userworkflow__isnull=False):                                                                                                                                                       
        uw = invoice.userworkflow                                                                                                                                                                                                            
        uw.invoice = invoice                                                                                                                                                                                                                 
        uw.save()                                                                                                                                                                                                                            
                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                             
class Migration(migrations.Migration):                                                                                                                                                                                                       
                                                                                                                                                                                                                                             
    dependencies = [                                                                                                                                                                                                                         
        ('useracts', '0009_auto_20150616_1256'),                                                                                                                                                                                             
        ('billing', '0001_initial'),                                                                                                                                                                                                         
    ]                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                             
    operations = [                                                                                                                                                                                                                           
        migrations.AddField(                                                                                                                                                                                                                 
            model_name='invoice',                                                                                                                                                                                                            
            name='userworkflow',                                                                                                                                                                                                             
            field=models.ForeignKey(related_name='userworkflows', on_delete=django.db.models.deletion.SET_NULL, to='useracts.UserWorkflow', null=True),                                                                                      
            preserve_default=True,                                                                                                                                                                                                           
        ),                                                                                                                                                                                                                                   
        migrations.RunPython(                                                                                                                                                                                                                
            copy_from_userworkflow_to_invoice,                                                                                                                                                                                               
            copy_from_invoice_to_userworkflow                                                                                                                                                                                                
        ),                                                                                                                                                                                                                                   
    ]

the following output

None
None
default
None
None
None

if I run a second time i got

None
None
read-only
None
None
None

Aug 20 '19 10:08 allan-simon

ok ok so I think i finally found the solution https://pytest-django.readthedocs.io/en/latest/database.html#tests-requiring-multiple-databases which points to https://docs.djangoproject.com/en/dev/topics/testing/advanced/#topics-testing-primaryreplica

so using ['TEST']['MIRROR'] = 'default' seems to fix the problem

however I think ( I don't know how easy to it is to implement) that if pytest-django does explicitly not support several databases it should:

warns you if you have a DATABASES with several keys
if several databases are defined pick 'default' in a deterministic fashion (and or rely on a config to use a specific other one)

what do you think ?

anyway thanks for your help, it really helped me to be on the right track :)

Aug 20 '19 12:08 allan-simon

I've not read up on your comments yet, but do you use transactional_db in your tests?

Aug 20 '19 15:08 blueyed

git grep transactional_db returns 0 result in my codebase

Aug 20 '19 16:08 allan-simon

git grep transactional_db returns 0 result in my codebase

There are other ways to use it, e.g. with a live_server.

However, your investigation appears to be spot on.

Not sure currently though what we can do about it, and if Django would behave the same (i.e. then it would not really be a bug in pytest-django, but just something we could help with, but maybe we're using Django's internals not correctly here).

See also https://github.com/pytest-dev/pytest-django/issues/76 and https://github.com/pytest-dev/pytest-django/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+multiple+databases.

It would be great to have this improved/fixed indeed.

Aug 20 '19 16:08 blueyed

pytest-django pytest-django copied to clipboard

When running pytest on a database with --reuse-db where python manage.py migrate was already run, cause failure on old runpython

pytest-django
pytest-django copied to clipboard