"user_favorite_tag" table not created after "superset db upgrade" when upgrade superset 3.0.1 to 4.0.2 on docker
Bug description
Hi Superset developers: I am trying to upgrade Superset from 3.0.1 to 4.0.2. The main features look good so far, but there is a small problem when I try to use the Tag system (please check the attached screenshot): the favorite tag feature seems unavailable. After checking the logs, the error trace shows the "user_favorite_tag" table does not exist. It seems "superset db upgrade" did not apply revision "e0f6f91c2055" to the DB, even though "superset db history" shows "e0f6f91c2055" as applied. Would you please take a look?
ENV: Docker on macOS, Safari. Superset version: 3.0.1 -> 4.0.2
Stack trace:
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
return connection._execute_clauseelement(
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
ret = self._execute_context(
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
util.raise_(
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/util/compat.py", line 207, in raise_
raise exception
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
self.dialect.do_execute(
File "/usr/local/lib/python3.10/dist-packages/sqlalchemy/engine/default.py", line 732, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "user_favorite_tag" does not exist
LINE 2: FROM user_favorite_tag
^
[SQL: SELECT user_favorite_tag.tag_id AS user_favorite_tag_tag_id FROM user_favorite_tag WHERE user_favorite_tag.tag_id IN (%(tag_id_1_1)s, %(tag_id_1_2)s, %(tag_id_1_3)s, %(tag_id_1_4)s, %(tag_id_1_5)s) AND user_favorite_tag.user_id = %(user_id_1)s]
How to reproduce the bug
- Superset 3.0.1 installed on a clean machine, and everything works (make sure DB tables are created)
- git clone the source code locally
- run: TAG=4.0.2 docker compose -f docker-compose-non-dev.yml up
- check the DB tables to see if the "user_favorite_tag" table exists
- PS: manually running the upgrade also does not work (see the table check sketch after this list)
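A quick way to check for the table directly, as a minimal sketch (the connection URL assumes the default docker-compose Postgres credentials; adjust it to your metadata database):

```python
from sqlalchemy import create_engine, inspect

# Assumed default docker-compose metadata DB -- swap in your own URL.
engine = create_engine("postgresql://superset:superset@db:5432/superset")
print("user_favorite_tag exists:", inspect(engine).has_table("user_favorite_tag"))
```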
Screenshots/recordings
No response
Superset version
4.0.2
Python version
3.9
Node version
16
Browser
Chrome
Additional context
No response
Checklist
- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Hey @jinmengcheng! I'm here to help you with any bugs, questions, or contributions you may have regarding the Superset upgrade. Let's work together to solve this issue!
To resolve the issue of the user_favorite_tag table not being created when upgrading Apache Superset from version 3.0.1 to 4.0.2 using Docker on MacOS, you need to ensure that the Alembic migrations are applied correctly. Here are the steps to follow:
- Ensure the migration script is in place: Verify that the migration script `2023-07-12_20-34_e0f6f91c2055_create_user_favorite_table.py` is located in the correct directory within your Superset project, typically under `superset/migrations/versions/`.
- Shut down running containers: First, make sure to shut down the running containers in Docker Compose: `docker compose down`
- Update the Superset repository: Update the folder that mirrors the `superset` repo through git: `git pull origin master`
- Restart the containers: Restart the containers and any changed Docker images will be automatically pulled down: `docker compose up`
- Run the migration: Execute the Alembic upgrade command to apply the migrations by running the following inside your Docker container: `docker exec -it <container_name> superset db upgrade`. Replace `<container_name>` with the name of your Superset Docker container.
- Initialize Superset: Run the following command to initialize Superset: `docker exec -it <container_name> superset init`
- Verify the table creation: After running the migration, check your database to ensure that the `user_favorite_tag` table has been created.
The upgrade function in the migration script should create the user_favorite_tag table as follows:
# Imports used by this snippet (standard at the top of Alembic migration modules):
import sqlalchemy as sa
from alembic import op


def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.create_table(
        "user_favorite_tag",
        sa.Column("user_id", sa.Integer(), nullable=False),
        sa.Column("tag_id", sa.Integer(), nullable=False),
        sa.ForeignKeyConstraint(
            ["tag_id"],
            ["tag.id"],
        ),
        sa.ForeignKeyConstraint(
            ["user_id"],
            ["ab_user.id"],
        ),
    )
    # ### end Alembic commands ###
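For completeness, the corresponding downgrade in that revision should simply drop the table; a minimal sketch:

```python
def downgrade():
    # Reverse of upgrade(): drop the favorites association table.
    op.drop_table("user_favorite_tag")
```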
Additionally, ensure that the user_favorite_tag table is correctly defined in your models. According to the superset/tags/models.py file, the table should be defined as follows:
# Imports used by this definition:
from flask_appbuilder import Model
from sqlalchemy import Column, ForeignKey, Integer, Table

user_favorite_tag_table = Table(
    "user_favorite_tag",
    Model.metadata,  # pylint: disable=no-member
    Column("user_id", Integer, ForeignKey("ab_user.id")),
    Column("tag_id", Integer, ForeignKey("tag.id")),
)
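For context, the failing statement in the stack trace above is exactly this kind of select against that table; a sketch, where `session`, `tag_ids`, and `user_id` are illustrative placeholders standing in for Superset's real request context:

```python
from sqlalchemy import select

# Which of these tags has the current user favorited?
stmt = (
    select(user_favorite_tag_table.c.tag_id)
    .where(user_favorite_tag_table.c.tag_id.in_(tag_ids))
    .where(user_favorite_tag_table.c.user_id == user_id)
)
favorited_ids = {row.tag_id for row in session.execute(stmt)}
```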
If the table is still not created, ensure that there are no errors in the migration script and that the database connection is properly configured [1][2][3].
To continue talking to Dosu, mention @dosu.
Can anyone help with this issue?
Having the exact same issue with a Superset instance deployed in k8s via helm chart. The migration commands (https://superset.apache.org/docs/installation/upgrading-superset/#upgrading-the-metadata-database) didn't change the table structure.
I do believe @michael-s-molina / @mistercrunch / @eschutho are looking at migration issues a bit, in case they spot anything here that can be resolved systemically.
Thanks @rusackas, waiting for good news.
@ruifpedro I am also running Superset on K8s via the Helm chart; upgrading Superset on K8s is not a happy experience.
`superset db history` shows the list of migrations that exist in the repo, not the ones that have been applied to the database. I'm guessing your `superset db upgrade` either failed or wasn't executed properly against the current branch.
`superset db current` should show the latest applied migration; please run it and report back here.
Note that when `db upgrade` fails, it's possible to be stuck in a bad state, and sometimes migration scripts need to be massaged to go through. If any step in the migration cannot be run twice (say, a table creation step) and the script failed after it (say, while creating an index on top of the new table), then the second time you run the migration script you'll fail earlier than the original failure.
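One way to cross-check what `superset db current` reports is to read the alembic_version table directly from the metadata database; a minimal sketch, with a placeholder Postgres URL:

```python
from sqlalchemy import create_engine, text

# Placeholder URL -- use the SQLALCHEMY_DATABASE_URI from your superset_config.
# alembic_version holds the single revision Alembic believes is applied.
engine = create_engine("postgresql://superset:superset@db:5432/superset")
with engine.connect() as conn:
    applied = conn.execute(text("SELECT version_num FROM alembic_version")).scalar()
print("applied revision:", applied)
```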
Hi @mistercrunch, thanks for helping check the db upgrade issue. Below is the full log of my superset_init job when running the "TAG=4.0.2 docker compose -f docker-compose-non-dev.yml up" command to launch Superset via Docker. From the log, everything seems to have worked fine; per the screenshot (docker-init.sh), Init Step 1 "Applying DB migrations" runs the "superset db upgrade" command.
superset_init | Installing local overrides at /app/docker/requirements-local.txt
superset_init | Collecting sqlalchemy-trino
superset_init | Downloading sqlalchemy_trino-0.5.0-py3-none-any.whl (6.8 kB)
superset_init | Collecting Authlib
superset_init | Downloading Authlib-1.3.1-py2.py3-none-any.whl (223 kB)
superset_init | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 223.8/223.8 kB 267.5 kB/s eta 0:00:00
superset_init | Collecting statsd
superset_init | Downloading statsd-4.0.1-py2.py3-none-any.whl (13 kB)
superset_init | Collecting trino[sqlalchemy]>=0.310
superset_init | Downloading trino-0.329.0-py3-none-any.whl (53 kB)
superset_init | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 1.9 MB/s eta 0:00:00
superset_init | Requirement already satisfied: cryptography in /usr/local/lib/python3.9/site-packages (from Authlib->-r /app/docker/requirements-local.txt (line 2)) (42.0.4)
superset_init | Requirement already satisfied: python-dateutil in /usr/local/lib/python3.9/site-packages (from trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (2.8.2)
superset_init | Requirement already satisfied: pytz in /usr/local/lib/python3.9/site-packages (from trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (2021.3)
superset_init | Requirement already satisfied: requests>=2.31.0 in /usr/local/lib/python3.9/site-packages (from trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (2.31.0)
superset_init | Collecting tzlocal
superset_init | Downloading tzlocal-5.2-py3-none-any.whl (17 kB)
superset_init | Requirement already satisfied: sqlalchemy>=1.3 in /usr/local/lib/python3.9/site-packages (from trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (1.4.36)
superset_init | Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.9/site-packages (from cryptography->Authlib->-r /app/docker/requirements-local.txt (line 2)) (1.15.1)
superset_init | Requirement already satisfied: pycparser in /usr/local/lib/python3.9/site-packages (from cffi>=1.12->cryptography->Authlib->-r /app/docker/requirements-local.txt (line 2)) (2.20)
superset_init | Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.9/site-packages (from requests>=2.31.0->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (2023.7.22)
superset_init | Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.9/site-packages (from requests>=2.31.0->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (3.2.0)
superset_init | Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.9/site-packages (from requests>=2.31.0->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (1.26.18)
superset_init | Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.9/site-packages (from requests>=2.31.0->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (3.2)
superset_init | Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.9/site-packages (from sqlalchemy>=1.3->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (3.0.3)
superset_init | Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/site-packages (from python-dateutil->trino[sqlalchemy]>=0.310->sqlalchemy-trino->-r /app/docker/requirements-local.txt (line 1)) (1.16.0)
superset_init | Installing collected packages: statsd, tzlocal, trino, Authlib, sqlalchemy-trino
superset_init | Successfully installed Authlib-1.3.1 sqlalchemy-trino-0.5.0 statsd-4.0.1 trino-0.329.0 tzlocal-5.2
superset_init | WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
superset_init |
superset_init | [notice] A new release of pip is available: 23.0.1 -> 24.2
superset_init | [notice] To update, run: pip install --upgrade pip
superset_init | Unknown Operation!!!
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 1/4 [Starting] -- Applying DB migrations
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init | logging was configured successfully
superset_init | 2024-08-19 08:18:26,023:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_init | 2024-08-19 08:18:26,029:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_init | We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | 2024-08-19 08:18:26,030:WARNING:superset.initialization:We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | /usr/local/lib/python3.9/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
superset_init | warnings.warn(
superset_init | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
superset_init | INFO [alembic.runtime.migration] Will assume transactional DDL.
superset_init | ##### DEBUG: False
superset_init | Loaded your LOCAL configuration at [/app/docker/pythonpath_dev/superset_config.py]
superset_init | ##### LOG_LEVEL in configuration: 20
superset_init | ##### debug_mode in parameter: False, DEBUG: False
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 1/4 [Complete] -- Applying DB migrations
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 2/4 [Starting] -- Setting up admin user ( admin / admin )
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init | logging was configured successfully
superset_init | 2024-08-19 08:18:32,744:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_init | 2024-08-19 08:18:32,751:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_init | We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | 2024-08-19 08:18:32,752:WARNING:superset.initialization:We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | /usr/local/lib/python3.9/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
superset_init | warnings.warn(
superset_init | ##### DEBUG: False
superset_init | Loaded your LOCAL configuration at [/app/docker/pythonpath_dev/superset_config.py]
superset_init | ##### LOG_LEVEL in configuration: 20
superset_init | ##### debug_mode in parameter: False, DEBUG: False
superset_init | Recognized OAuth Authentication.
superset_init | Error! User already exists admin
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 2/4 [Complete] -- Setting up admin user
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 3/4 [Starting] -- Setting up roles and perms
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init | logging was configured successfully
superset_init | 2024-08-19 08:18:38,234:INFO:superset.utils.logging_configurator:logging was configured successfully
superset_init | 2024-08-19 08:18:38,241:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
superset_init | We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | 2024-08-19 08:18:38,242:WARNING:superset.initialization:We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software. Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.
superset_init | /usr/local/lib/python3.9/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
superset_init | warnings.warn(
superset_init | Syncing role definition
superset_init | 2024-08-19 08:18:41,650:INFO:superset.security.manager:Syncing role definition
superset_init | Syncing Admin perms
superset_init | 2024-08-19 08:18:41,692:INFO:superset.security.manager:Syncing Admin perms
superset_init | Syncing Alpha perms
superset_init | 2024-08-19 08:18:41,707:INFO:superset.security.manager:Syncing Alpha perms
superset_init | Syncing Gamma perms
superset_init | 2024-08-19 08:18:42,366:INFO:superset.security.manager:Syncing Gamma perms
superset_init | Syncing sql_lab perms
superset_init | 2024-08-19 08:18:43,264:INFO:superset.security.manager:Syncing sql_lab perms
superset_init | Fetching a set of all perms to lookup which ones are missing
superset_init | 2024-08-19 08:18:43,922:INFO:superset.security.manager:Fetching a set of all perms to lookup which ones are missing
superset_init | Creating missing datasource permissions.
superset_init | 2024-08-19 08:18:43,939:INFO:superset.security.manager:Creating missing datasource permissions.
superset_init | Creating missing database permissions.
superset_init | 2024-08-19 08:18:43,950:INFO:superset.security.manager:Creating missing database permissions.
superset_init | Cleaning faulty perms
superset_init | 2024-08-19 08:18:43,959:INFO:superset.security.manager:Cleaning faulty perms
superset_init | ##### DEBUG: False
superset_init | Loaded your LOCAL configuration at [/app/docker/pythonpath_dev/superset_config.py]
superset_init | ##### LOG_LEVEL in configuration: 20
superset_init | ##### debug_mode in parameter: False, DEBUG: False
superset_init |
superset_init | ######################################################################
superset_init |
superset_init |
superset_init | Init Step 3/4 [Complete] -- Setting up roles and perms
superset_init |
superset_init |
superset_init | ######################################################################
superset_init |
superset_init exited with code 0
Looks like migrations are up to date here. Maybe you're hitting another instance of the app that's on a different/earlier version - some zombie process? Hard to tell, but you should be able to troubleshoot that.
@mistercrunch if I point it at a newly created database, "user_favorite_tag" gets created, but migrating from an old DB (3.0.1), the table is missing. It's very easy to reproduce if we launch Superset from the Helm chart or Docker, so I think it's a migration issue. Anyway, is there any command line or tool that can help fix this issue rather than manually creating the table?
From inside the box that's running Superset, run `superset db current`; it should point to something prior to the migration that creates user_favorite_tag. It's either that or:
- some shenanigans happened on that db (like someone dropped the table)
- maybe someone messed with the migration precedence either in the repo or locally (say squeezing in that migration or forcing a SHA in the table)
- a false positive on the migration (?) like the create table failed but Alembic didn't know (?) I've never seen this before...
- your app isn't pointing to the same place as where you think the migration ran (?)
Hi @mistercrunch, thanks for your reply, and sorry for the late reply due to some personal business. I checked superset db current and it's pointing at head; please see the screenshots of the "superset db current" and "superset db history" output. Please also see my comments below on the four scenarios in your checklist that could cause this issue:
- some shenanigans happened on that db (like someone dropped the table): this is not going to happen here, 😄
- maybe someone messed with the migration precedence either in the repo or locally (say squeezing in that migration or forcing a SHA in the table): confirmed, no one is doing migrations except me
- a false positive on the migration (?), like the create table failed but Alembic didn't know (?): there is no error log on the Alembic side. I have tested with a newly created DB (no tables) and directly ran Superset 4.0.2, and the user_favorite_tag table is created. As you mentioned you have never seen this before, I would consider this a very, very tiny opportunity to happen.
- your app isn't pointing to the same place as where you think the migration ran (?): I did not quite understand this scenario; would you please kindly provide more details or give an example?
your app isn't pointing to the same place as where you think the migration ran (?)
Oh, that's like if you had some sort of zombie process or pod that happened to be serving the new version of the app against some other/older version of the database. It could happen if you're juggling multiple instances of the app or the database and have zombie processes or deployments lying around. I guess you could stop all processes, confirm you get no responses at all, then restart the app and make sure it's pointing to the right / upgraded database.
@mistercrunch a zombie process or zombie Docker pod would not happen in my case; each time, I stop Docker by running the "docker compose down" command to stop all pods, and I double check by running "docker ps" to verify there are no live or zombie Docker pods.
I'm kind of clueless on my side at this point. I don't have any idea other than suggesting to go back and double check assumptions one by one.
- restore backup of database pre-migration
- run `superset db current`, make sure it's where you think it should be (head of previous version); double check the alembic version table in the database, make sure all 3 match
- check that `user_favorite_tag` is NOT present
- run migration upgrade
- again, run `superset db current`, make sure it's where you think it should be (head of current version); double check the alembic version table in the database, make sure all 3 match
- check that `user_favorite_tag` IS present
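If it helps, a minimal sketch that automates those checks, with a placeholder connection URL and an expected revision to fill in:

```python
from sqlalchemy import create_engine, inspect, text

# Placeholders: point URL at your metadata database and set EXPECTED to the
# head you believe you are on (e.g. 4b85906e5b91 pre-migration,
# 17fcea065655 post-migration).
URL = "postgresql://superset:superset@db:5432/superset"
EXPECTED = "4b85906e5b91"

engine = create_engine(URL)
with engine.connect() as conn:
    applied = conn.execute(text("SELECT version_num FROM alembic_version")).scalar()

print(f"alembic_version: {applied} (expected {EXPECTED}, match: {applied == EXPECTED})")
print(f"user_favorite_tag present: {inspect(engine).has_table('user_favorite_tag')}")
```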
@jinmengcheng are you still facing this? It's been silent for quite a while. Maybe @dosu-bot can help?
Ran into the same issue today, and it looks like the problem is due to two migration versions being merged in that sit earlier in the sequence than the previous head of an older release.
3.0.1 head: 2023-09-15_12-58 4b85906e5b91
4.0.2 head: 2024-02-14_14-43 17fcea065655
The two that were merged in between these two heads but did not increment are:
- 2023-07-12_20-34_e0f6f91c2055_create_user_favorite_table.py
- 2023-07-19_17-54_ee179a490af9_deckgl_path_width_units.py
I was able to work past this by just running those two versions.
# We get the current version of the database
> $ superset db current
17fcea065655 (head)
# Stamp alembic_version.version_num back to bf646a0c1501, the revision just before the first skipped migration
> $ superset db stamp bf646a0c1501
INFO [alembic.runtime.migration] Running stamp_revision 17fcea065655 -> bf646a0c1501
# Upgrade through the second skipped migration; this replays both, since they are adjacent revisions
> $ superset db upgrade ee179a490af9
INFO [alembic.runtime.migration] Running upgrade bf646a0c1501 -> e0f6f91c2055, create_user_favorite_table
INFO [alembic.runtime.migration] Running upgrade e0f6f91c2055 -> ee179a490af9, deckgl-path-width-units
# Set alembic_version.version_num back to what it was so future upgrades progress naturally
> $ superset db stamp 17fcea065655
INFO [alembic.runtime.migration] Running stamp_revision ee179a490af9 -> 17fcea065655
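For anyone else applying this workaround: `db stamp` only rewrites the row in the alembic_version table without running any migration code, whereas `db upgrade <revision>` actually executes the upgrade() functions. That's why stamping back to bf646a0c1501, upgrading through ee179a490af9, and stamping forward to 17fcea065655 replays exactly the two skipped migrations without touching anything else.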
A check ensuring that any added migrations/versions are newer than the latest one in master should probably prevent the PR from merging if not true. It's not so much a problem within a release, but if it crosses a release line then we'll see this.
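A rough sketch of what such a check could look like, assuming the YYYY-MM-DD_HH-MM_&lt;rev&gt;_&lt;slug&gt;.py filename convention seen in this thread (the helper names and the master filename's slug below are illustrative):

```python
import re

# Timestamp prefix of Superset migration filenames, e.g.
# 2023-07-12_20-34_e0f6f91c2055_create_user_favorite_table.py
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}_\d{2}-\d{2})_")


def timestamp(filename: str) -> str:
    match = STAMP.match(filename)
    if not match:
        raise ValueError(f"unexpected migration filename: {filename}")
    # Zero-padded, so lexicographic order equals chronological order.
    return match.group(1)


def check_not_out_of_order(added: list[str], on_master: list[str]) -> None:
    newest = max(timestamp(f) for f in on_master)
    for f in added:
        if timestamp(f) < newest:
            raise SystemExit(f"{f} predates master's newest migration ({newest})")


# This example fails the check, since the added file predates master's head
# (the master filename slug is made up for illustration).
check_not_out_of_order(
    added=["2023-07-12_20-34_e0f6f91c2055_create_user_favorite_table.py"],
    on_master=["2023-09-15_12-58_4b85906e5b91_some_later_migration.py"],
)
```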
A check ensuring that any added migrations/versions are newer than the latest one in master should probably prevent the PR from merging if not true
I thought it was already the case, and I'm still confused as to what the exact issue is. Presumably there are out-of-order migrations in one of the micro releases (?) Maybe in 3.0.1? If that's the case, presumably it would only affect people who didn't skip 3.0.1.
Won't have time to test, but if that's the case, the steps for repro would be something like:
- nuke the database
- pull tag 3.0.0 and build/start
- pull tag 3.0.1 and build/start
- pull tag 4.1.2 and build/start
The last step here should hit this issue (?)
Assuming this is all true, the fix would be to identify the release with out-of-order migrations (it seems it may be 3.0.1) and alter the ids to align with what's in the following releases.
Crazy thing is that if you are on 3.0.1 now, your current migration state is wrong, so we'd have to create a 3.0.2 that fixes things and instruct people to migrate to that first... Quite a bit of work here, and the solution is still confusing. Instructions would be something like "if you're on 3.0.1, you MUST upgrade to 3.0.2 PRIOR to migrating to following versions". Unless there are other solutions, but I can't think of any right now...
That is pretty much how I was able to reproduce it in development and test the "fix" before trying it in our staging environment. I'm holding off on doing the same in production for a bit to get feedback on the problem.
- Pull tag 3.0.1, build, and start
  - This will set the alembic version to 4b85906e5b91
- Pull tag 4.0.2, build, and start
  - This will start the upgrade from 4b85906e5b91 to 17fcea065655 and skip e0f6f91c2055 and ee179a490af9
- The head of 2.1.3 is 2023-05-01_12-03_9c2a5681ddfd.
- The head of 3.0.1 is 2023-09-15_12-58_4b85906e5b91.
- The head of 3.1.0 is 2023-12-01_12-03_b7851ee5522f.
But if I take a look at 2023-07-12_20-34_e0f6f91c2055, I see it was introduced in 3.1.0, which would mean anyone coming from 2.1.3 has to go straight to 3.1.0, otherwise they would not apply those two versions to their database. The problem I see is that if they got on 3.0.x and did any upgrade after, even to 4.x, then they are in this position, so the instructions would really be weird: if you were ever on, or are currently on, 3.0.x, you have to apply this patch or do some weird migration thing.
I only noticed this because we use tagging in our setup, or rather want to use it more, and when trying to manage tags I got that error. My guess is that the feature flag isn't being used by many while in development, so the error isn't very common.
Wondering if there's a way to just fix the latest release so that you can migrate from anywhere to the latest and future versions. Maybe by using the branching features in Alembic.
I have to admit I'm still a little confused, and a graph diagram would really help here. It's still unclear whether one of the migrations is missing in some places, or whether it's just in a different order across releases.
From my understanding, Alembic has support for some branching as long as you merge the branches. There's probably a way to fix this in master and future releases by wiggling SHAs into an Alembic branch (?); see the merge sketch below.
Also trying to understand the population that's affected. Sounds like it's only affecting people who followed a precise upgrade path?
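For reference, a minimal sketch of what an Alembic merge revision looks like (this is Alembic's standard mechanism for converging two heads, as produced by `alembic merge`; the abcdef123456 id and the choice of parent revisions are illustrative, not an actual proposed fix):

```python
"""Illustrative Alembic merge revision: a migration with two parents.

Revision ID: abcdef123456 (made up for this sketch)
Revises: 17fcea065655, ee179a490af9
"""

revision = "abcdef123456"
# A tuple of down_revisions is what makes this a merge point: both branches
# must be applied before the graph converges here.
down_revision = ("17fcea065655", "ee179a490af9")
branch_labels = None
depends_on = None


def upgrade():
    # Merge points typically carry no schema changes of their own.
    pass


def downgrade():
    pass
```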
For me the easiest way to see the problem is doing a diff on the superset db history output from 3.0.1 and 4.0.2. There should only be newer versions, but we can see there are 2 that come in in the middle, which are the ones that get skipped.
I think anyone that ever got on 3.0.x is affected by this. If they went from <=2.x to >=3.1.x it works out. I'm not too familiar with Alembic, so I'm not sure about its branching features, but you could adjust the two versions to be idempotent (I don't think they are now) and move them up in version order so any new upgrades work; a sketch of the idempotency idea is below. If someone doesn't want to do a major release bump, I'm not sure how that can be solved.
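A sketch of that idempotency idea, guarding the table creation so the migration can be replayed against a database where the table already exists (this is not what the shipped revision does today):

```python
import sqlalchemy as sa
from alembic import op


def upgrade():
    # Skip creation when the table already exists, making the migration
    # safe to replay on databases in inconsistent states.
    inspector = sa.inspect(op.get_bind())
    if inspector.has_table("user_favorite_tag"):
        return
    op.create_table(
        "user_favorite_tag",
        sa.Column("user_id", sa.Integer(), nullable=False),
        sa.Column("tag_id", sa.Integer(), nullable=False),
        sa.ForeignKeyConstraint(["tag_id"], ["tag.id"]),
        sa.ForeignKeyConstraint(["user_id"], ["ab_user.id"]),
    )
```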
Actually, looking at that screenshot, there are 3 that were skipped, but one of them was caught and another commit was added to replay it: https://github.com/apache/superset/commit/aaa50c4b4aba30a95154dfb2cc46d6788b3a1ccc
Clearly there's something to learn here and prevent in the future. Tagging @sadpandajoe @michael-s-molina for visibility and avoidance in future releases (unclear to me who worked on these releases...). I think it's well understood that no release should be a db-migration dead end, and I'm curious to understand what happened. Maybe one or two PRs including a database migration got cherry-picked by mistake into a release branch...