dbmate
dbmate copied to clipboard
Session variables are leaking across migration scripts
Problem
When a user defines a session variable, the expectation is that the variable will be scoped to the current migration script. However, the migrations seem to reuse the same session, resulting in scope leaks from one migration script to the next. You can recreate this problem using the following migration scripts:
- migration_1
-- migrate:up
create table leak_demo_users (
name varchar(255) not null
);
select 'VALUE_LEAK1' into @leaky_var1;
set @leaky_var2='VALUE_LEAK2';
-- migrate:down
drop table leak_demo_users;
- migration_2
-- migrate:up
insert into
leak_demo_users
values
(@leaky_var1),
(@leaky_var2)
;
-- migrate:down
delete from leak_demo_users;
These migrations will succeed if you run them in sequence, resulting in the following table records, which is not expected behavior.
mysql> select * from leak_demo_users;
+-------------+
| name |
+-------------+
| VALUE_LEAK1 |
| VALUE_LEAK2 |
+-------------+
The expected behavior is for migration_2 to result in an error. However, if we perform a single rollback and run the migration again, then the correct behavior is encountered (see below). This is because now migration_2 is running in isolation without the benefit of the variable leak from migration_1.
Error: Error 1048: Column 'name' cannot be null
Problem Location
Here's location of the problematic function: https://github.com/amacneil/dbmate/blob/cdbbdd65ea149ff9acf4342a136139adbbf33115/pkg/dbmate/db.go#L317
Solutions
A few ways to fix this include:
- Close and reopen the database connection between each migration script
- Provide a new CLI command
step
which will run only a single migration script and exit
The second option might be the most backwards compatible approach, since some folks may be dependent (perhaps unwittingly?) on this behavior.
Wouldn't closing and reopening the connection affect the transactional execution though?
Sure, but a single DDL statement in your migration scripts will have the same effect, no? If you start a transaction, insert some records, and follow that up with a create
statement then the changes will be committed. So I don't think dbmate can make any guarantees on the effectiveness of running anything in a transaction, unless it starts enforcing a separation between DML and DDL migrations.
I think the fresh connection for each migration solution sounds ok. Are there any other downsides to this approach?
We already open a fresh transaction for each migration, so I don’t think that is a blocker.
FYI - last time I checked, Postgres supports transactional DDL for most statements, but mysql does not.
Ah yes, it looks like you're correct regarding Postgres DDL transactions: https://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis
The only other negative impact I can think of is the performance implications for users with a large number of migration scripts. I'm assuming that creating a new connection will take a bit more time than simply reusing an existing one.
Perhaps the behavior of closing and reopening the connection can be controlled by a new cli flag? This way the default behavior is the same as before, meaning that the feature is backwards compatible. This way postgres users are not affected, and folks that want the new behavior can set the new flag. Say something like --multi-session
?
Just adding a note here so I don't forget:
At least for the Postgres driver, consider issuing a RESET ALL
or DISCARD ALL
in between migrations if the connection will be reused. I think this will avoid the network connection churn of establishing a new database connection for each migration, but clean up the state of the connection between migrations, which is the desired effect.
For MySQL, issuing a COM_RESET_CONNECTION
between migrations may be a viable solution, except the underlying go-sql-driver/mysql driver we use hasn't implemented support for sending it.
This is a particularly gnarly issue for SQLite, because someone may want to use an in-memory database to test their migrations, and if we close and reopen the database in between migrations, each migration will be applied to a completely new in-memory database, which is not desirable. However, when testing against file-backed databases, it's important to reset session state between migrations to ensure that they are atomic.