django-two-factor-auth icon indicating copy to clipboard operation
django-two-factor-auth copied to clipboard

Migration error when upgrading to 1.15.4

Open knyghty opened this issue 1 year ago • 24 comments

Expected Behavior

Migrations work.

Current Behavior

Migrations fail.

Possible Solution

?

Steps to Reproduce (for bugs)

Running migrate after upgrading to 1.15.4 from 1.15.3. Presumably needs the phonenumber plugin.

Context

Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/django/db/migrations/graph.py", line 133, in remove_replaced_nodes
    replacement_node = self.node_map[replacement]
                       ~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: ('two_factor', '0001_squashed_0008_delete_phonedevice')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/./manage.py", line 31, in <module>
    execute_from_command_line(sys.argv)
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/base.py", line 412, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/base.py", line 458, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/base.py", line 106, in wrapper
    res = handle_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/django/core/management/commands/migrate.py", line 117, in handle
    executor = MigrationExecutor(connection, self.migration_progress_callback)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/django/db/migrations/executor.py", line 18, in __init__
    self.loader = MigrationLoader(self.connection)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/django/db/migrations/loader.py", line 58, in __init__
    self.build_graph()
  File "/opt/venv/lib/python3.11/site-packages/django/db/migrations/loader.py", line 268, in build_graph
    self.graph.remove_replaced_nodes(key, migration.replaces)
  File "/opt/venv/lib/python3.11/site-packages/django/db/migrations/graph.py", line 135, in remove_replaced_nodes
    raise NodeNotFoundError(
django.db.migrations.exceptions.NodeNotFoundError: Unable to find replacement node ('two_factor', '0001_squashed_0008_delete_phonedevice'). It was either never added to the migration graph, or has been removed.

Your Environment

  • Browser and version:
  • Python version: 3.11.x
  • Django version: 4.2.3
  • django-otp version: 1.2.2
  • django-two-factor-auth version: 1.15.4

knyghty avatar Aug 15 '23 13:08 knyghty

I have the exact same issue. I have no issues with the exact same setup, but django-two-factor-auth version 1.15.3

My environment:

Python version: 3.11.x Django version: 4.1.10 django-otp version: 1.2.2 django-two-factor-auth version: 1.15.4

hanckmann avatar Aug 24 '23 14:08 hanckmann

May I ask you to test with 1.15.5, please?

claudep avatar Sep 22 '23 07:09 claudep

The update did not help:

docker-compose -f local.yml run --rm django python manage.py makemigrations
[+] Building 0.0s (0/0)                                                                                                                                                                                                                        
[+] Creating 2/0
 ✔ Container cc_redis     Running                                                                                                                                                                                                 0.0s 
 ✔ Container cc_postgres  Running                                                                                                                                                                                                 0.0s 
[+] Building 0.0s (0/0)                                                                                                                                                                                                                        
PostgreSQL is available
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/migrations/graph.py", line 133, in remove_replaced_nodes
    replacement_node = self.node_map[replacement]
                       ~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: ('two_factor', '0001_squashed_0008_delete_phonedevice')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/manage.py", line 46, in <module>
    main()
  File "/app/manage.py", line 42, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 440, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 402, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 448, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 96, in wrapped
    res = handle_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/core/management/commands/makemigrations.py", line 122, in handle
    loader = MigrationLoader(None, ignore_no_migrations=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/migrations/loader.py", line 58, in __init__
    self.build_graph()
  File "/usr/local/lib/python3.11/site-packages/django/db/migrations/loader.py", line 268, in build_graph
    self.graph.remove_replaced_nodes(key, migration.replaces)
  File "/usr/local/lib/python3.11/site-packages/django/db/migrations/graph.py", line 135, in remove_replaced_nodes
    raise NodeNotFoundError(
django.db.migrations.exceptions.NodeNotFoundError: Unable to find replacement node ('two_factor', '0001_squashed_0008_delete_phonedevice'). It was either never added to the migration graph, or has been removed.

And in my requirements file:

django-two-factor-auth==1.15.5  # https://django-two-factor-auth.readthedocs.io/en/stable/index.html

hanckmann avatar Sep 22 '23 07:09 hanckmann

It hasn't helped for me either.

knyghty avatar Sep 22 '23 10:09 knyghty

@knyghty @hanckmann Could you share the python manage.py showmigrations with a working version and after upgrading to the 1.15.5 ?

mlec1 avatar Sep 22 '23 21:09 mlec1

I can share the working one from 1.15.3:

$ ./manage.py showmigrations otp_static otp_totp phonenumber two_factor
otp_static
 [X] 0001_initial
 [X] 0002_throttling
otp_totp
 [X] 0001_initial
 [X] 0002_auto_20190420_0723
phonenumber
 [X] 0001_squashed_0001_initial (2 squashed migrations)
two_factor
 [X] 0001_squashed_0008_delete_phonedevice (8 squashed migrations)

But I can't share the one from 1.15.5 because running the command gives the same error.

knyghty avatar Sep 23 '23 14:09 knyghty

I tried to reproduce the error, but unfortunately, I can't. I use the following

Dockerfile:

FROM python:3.11

COPY . .
RUN pip3 install -r requirements_dev.txt

requirements_dev.txt

# The app itself

# -e .
django-two-factor-auth==1.15.0

# Additional runtime dependencies

twilio
phonenumberslite

# Example app

django-debug-toolbar
django-bootstrap-form
django-user-sessions

# Example app (WebAuthn)

webauthn~=1.6.0

# Testing

coverage
flake8
tox
isort
freezegun

# Translation

transifex-client

# Documentation

Sphinx
sphinx_rtd_theme

# Build

wheel
bump2version
twine

Then I use the example application, run migrate without issue. I create a 2FA device. Then I try to update the django-2fa to different version. All patch version 1.15.X one by one and migrate each time, jumping from 1.15.0 to 1.15.3, migrate, then to 1.15.4 and migrate without issues. I tried different possibilities, but I never face your situation.

If you are able to create an example where the error is happening, that would be great.

mlec1 avatar Sep 24 '23 14:09 mlec1

The error is still present in version 1.15.5.

Is there any work on this? It will be a stopper moving forward!

hanckmann avatar Nov 20 '23 09:11 hanckmann

@hanckmann @knyghty Hello, I am willing to help here, but I can't reproduce the issue you are facing. I use it also on my projects, and the migration went fine.

Would it be possible to get a setup to reproduce the issue somehow ? A copy of your database with random data in it or something like that.

mlec1 avatar Nov 21 '23 21:11 mlec1

Hii, well... I am not sure how to reproduce it. For now I am maintaining a fork in which I removed the problematic migration file. When time allows (not before January) I will try to send a reproducible setup.

hanckmann avatar Nov 26 '23 13:11 hanckmann

Same issue here.

JeroenvO avatar Feb 02 '24 08:02 JeroenvO

@JeroenvO ; for the time being, I am maintaining this fork: https://github.com/hanckmann/django-two-factor-auth-no-squash

Feel free to use it as well. If I manage to resolve the issue and work my way back to this branch, I can give you a heads-up.

hanckmann avatar Feb 02 '24 09:02 hanckmann

Still an issue in 1.16.0

rob101 avatar Mar 29 '24 12:03 rob101

@rob101 could you tell us what versions of Python, Django, and django-otp you are using please

moggers87 avatar Mar 31 '24 02:03 moggers87

@rob101 could you tell us what versions of Python, Django, and django-otp you are using please

And provide a minimal reproducible example... I tried with the example app from this repo or from other personal repo and I couldn't reproduce the issue

mlec1 avatar Mar 31 '24 09:03 mlec1

We are having the same issue while migrating an application from django 3 to django 4.

I will share some of the steps hoping this can help coming to a faster solution, since this issue is blocking for us.

Steps to replicate

  1. Have a django service running django 3.2.25, django-two-factor-auth 1.13.2, python 3.9, database postgres v 12 (with psycopg)
  2. Have the following INSTALLED_APPS:
    INSTALLED_APPS = [
        # ... django stuff, omissis
    
        # two_factor requirements
        'django_otp',
        'django_otp.plugins.otp_static',
        'django_otp.plugins.otp_totp',
    
        # django-two-factor-auth , v 1.13.2
        'two_factor',
    
        # # We used to have this too, then dropped in Jan 2023 because we stopped providing support for phone numbers
        # 'two_factor.plugins.phonenumber',
    ]
    
    It may be noteworthy: we already have applied the migration that creates two_factor_phonedevice, which is causing the trouble, but the instances count() for model PhoneDevice is 0.
  3. Run all the migrations with python manage.py migrate
  4. Create at least one user (using settings.AUTH_USER_MODEL) with a TOTP device and/or backup codes (static)
  5. Upgrade to Django 4.2.13 with python 3.12 (new virtual env)
  6. Upgrade django-two-factor-auth to any version between 1.15.2 (the first compatible with Django 4.2) and 1.16.0 (the latest available at the time of writing
  7. Running python manage.py showmigrations should output the following status:
        two_factor
          [X] 0001_initial
          [X] 0002_auto_20150110_0810
          [X] 0003_auto_20150817_1733
          [X] 0004_auto_20160205_1827
          [X] 0005_auto_20160224_0450
          [X] 0006_phonedevice_key_default
          [X] 0007_auto_20201201_1019
          [ ] 0008_delete_phonedevice
          [ ] 0009_initial
    
    If the two_factor.plugins.phonenumber is uncommented from INSTALLED_APPS, this pending migration will also appear:
        phonenumber
        [ ] 0001_squashed_0001_initial (1 squashed migrations)
    
    I have found the issue both with and without the phonenumbers plugin.
  8. Run the migrations again.

Outcome

No matter wich version/setup I tried (see Scenarios below), I always get this error:

psycopg.errors.DuplicateTable: relation "two_factor_phonedevice" already exists

Scenarios

Here are some different scenarios I tried, all with the same result

  • With two_factor.plugins.phonenumber disabled vs enabled
  • Running two_factor.0008_delete_phonedevice from version 1.15.2, then upgrading to 1.16.0 and run the remaining migrations (we saw that there is a squashed migration that basically skips two_factor.0008_delete_phonedevice)

All with the same result:

  • if plugin phonenumbers is disabled, error is raised during migration two_factor.0009_initial
  • if plugin phonenumbers is enabled, error is raised during migration phonenumber.0001_squashed_0001_initial

Any help will be greatly appreciated.

aarighi avatar May 14 '24 11:05 aarighi

FWIW, because it's triggered by the order in which migrations are loaded, at least in my local testing reversing the order in which two_factor and two_factor.plugins.phonenumber are loaded in settings.py seems to have fixed that locally (though I still have to do some further testing to make sure it's not somehow a local environment fluke).

If the phone number plugin comes first, the loader tries to load a migration that replaces migrations inside two_factor itself before any of the migrations in two_factor have been loaded (not executed, loaded) - therefore the entire thing crashes.

modulozero avatar Jun 07 '24 11:06 modulozero

Interesting! Could you evaluate if adding a dependency on two_factor in phonenumber squashed migration would help fixing this issue?

claudep avatar Jun 07 '24 11:06 claudep

Interesting! Could you evaluate if adding a dependency on two_factor in phonenumber squashed migration would help fixing this issue?

Nope.

I actually think that replaces is evaluated before dependencies, though I'd have to go on a bit of a deep dive into to the MigrationExecutor be sure.

TBH I don't think enforcing a load order is that uncommon, and it's even something that can be reasonably checked for. two_factor.plugins.phonenumber expecting to be loaded after two_factor seems quite reasonable - though it will mean some people's configuration will start screaming at them (but, in a very clear way with an obvious fix - unless someone somehow managed to have something that has to be loaded between two_factor.plugins.phonenumber and two_factor - can't imagine how, though).

modulozero avatar Jun 07 '24 12:06 modulozero

Yep - see MigrationLoader.load_disk and MigrationLoader.build_graph.

It:

There's no attempt to re-order things at this stage that I can see. I think this actually happens much later, inside MigrationExecutor.migration_plan, invoked indirectly via MigrationExecutor.migration by whatever uses the MigrationExecutor (basically, the migrate command).

modulozero avatar Jun 07 '24 13:06 modulozero

Thanks for the thorough exploration :rocket:. In the end, do you have a potential fix to suggest for this issue?

claudep avatar Jun 07 '24 13:06 claudep

TBH I don't have a "perfect" solution - IMO the reasonable thing to do is enforce the load order by checking settings.INSTALLED_APPS and throwing a warning on load if plugins are loaded before the main part - and in the future just relying on that being true. I might look into implementing that later, depending on how much time I have, but can't really promise it.

modulozero avatar Jun 10 '24 10:06 modulozero

Wow. Changing the order of the installed apps actually fixed this issue for (early tests show). This makes me happy. I will ensure it is indeed the case by more testing.

In the mean time, I propose to add a note in the documentation about the order of the plugins (or did I miss that?).

hanckmann avatar Jun 10 '24 10:06 hanckmann

In the mean time, I propose to add a note in the documentation about the order of the plugins (or did I miss that?).

I'm wondering if we could also detect this situation and generate a warning.

moggers87 avatar Aug 03 '24 00:08 moggers87