pg_ha_migrations icon indicating copy to clipboard operation
pg_ha_migrations copied to clipboard

Add a configuration option for always disabling ddl transactions

Open ChrisVoxland opened this issue 11 months ago • 7 comments

This adds a configuration option: always_disable_ddl_transactions to allow folks to easily disable the default behavior of making all migrations run without a DDL transaction.

The reasoning for that as the default makes sense, but I found it violated the principle of least surprise when I was first experimenting with this gem, and it was simple enough to make it configurable.

The existing warnings for concurrent migrations in a DDL transaction seem safe enough to me, and the tradeoff of potential locking with multiple DDL statements in a single transaction seems worth it for those who want to opt into that behavior.

ChrisVoxland avatar Feb 28 '24 22:02 ChrisVoxland

@rkrage Any thoughts on this? Would be spiffy to get it in if possible.

ChrisVoxland avatar Mar 04 '24 17:03 ChrisVoxland

@ChrisVoxland can we try to come up with a better name for the config? If I'm understanding this correctly, this setting only affects the behavior of disable_ddl_transaction if you don't explicitly define it in your Rails config. So I think I'm mostly getting hung up on the always prefix

rkrage avatar Mar 08 '24 17:03 rkrage

@rkrage not attached to that name by any means. How about disable_ddl_transactions_by_default? Open to any suggestions here as well.

Also if there is a way to do this without code changes that I'm missing, I'd be happy to put up a PR updating the docs to describe that more specifically instead. I tried a few different variations of things with config/initializers, but wasn't actually able to get the default behavior back, but I very well may be missing something obvious.

ChrisVoxland avatar Mar 13 '24 16:03 ChrisVoxland

@ChrisVoxland I think disable_ddl_transactions_by_default makes sense

Also if there is a way to do this without code changes that I'm missing, I'd be happy to put up a PR updating the docs to describe that more specifically instead.

I thought there was already a global config for toggling this behavior in Rails but I don't think that actually exists and it can only be turned off on a per migration basis. So it appears the only way to change the default behavior is by monkey patching the migration class, which is what we're doing here 😄

So yeah, this PR is definitely a welcome change

rkrage avatar Mar 22 '24 14:03 rkrage

@rkrage Thanks!, I updated the pr to use the new name, lmk if there is anything else I can do to get this through.

ChrisVoxland avatar Mar 22 '24 16:03 ChrisVoxland

@ChrisVoxland I don't have any issues philosophically with adding this kind of configuration, but I do think we have one thing we need to address to be able to merge: some DDL (e.g., safe_add_concurrent_index and safe_validate_check_constraint and a few others) can't be run inside a transaction.

I think the simplest solution would be to:

  1. Have those methods check if transactions are enabled, and, if so, immediately raise an invalid migration error.
  2. Ensure the docs (where you already changed them) note that there are some methods which will require you to turn transactions off if you enable them by default.

jcoleman avatar Mar 22 '24 17:03 jcoleman

@jcoleman I'd be happy to update the docs, but I don't think any other code changes should be necessary here. safe_add_concurrent_index will already fail with a descriptive error if it runs in a transaction; Postgres already guards against running DDL in transactions when it does not support this feature:PG::ActiveSqlTransaction: ERROR: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.

And safe_validate_check_constraint is actually fine running inside of a transaction; I just ran a migration doing exactly that (though without your gem) on a table with over 100 million records last month in production.

Also happy to dig in and investigate if there are other methods of concern here.

ChrisVoxland avatar Mar 22 '24 20:03 ChrisVoxland

I think I agree with @ChrisVoxland on this one.

Have those methods check if transactions are enabled, and, if so, immediately raise an invalid migration error.

Is it really worth adding extra validation to avoid one extra call to Postgres (that will fail immediately with a pretty descriptive error)?

In both cases, whether we raise the error ourselves or let it bubble up from Postgres, the result will be the same: a rolled back transaction.

@jcoleman what do you think?

rkrage avatar May 31 '24 18:05 rkrage

Hi @ChrisVoxland! I took a pass at this and reworked a bunch of documentation on my branch

There are a few concerns:

  1. Given the trade-offs listed in the docs I added, I'm not convinced this is desirable, though I haven't yet added a section of trade-offs: i.e., if you're only looking to avoid rewriting the table, this could make sense, as opposed to being sensitive to how long a lock is held.
  2. More importantly, I added some tests, and I don't think the feature does what you want: A. First, it doesn't appear a transaction gets opened at all. B. If a transaction were to be opened, and lock acquisition didn't succeed on the first attempt, then the whole migration will fail (maybe that's worth it to someone, but it's not at all obvious). You lose the safely_acquire_lock_for_table benefits and looping.

For more on that last point see this manual demo in psql:

Session 1:

test=# begin;
BEGIN
test=*# lock table items;
LOCK TABLE
test=*# -- let hit hang here

Session 2:

test=# set lock_timeout = 2;
SET
test=# begin;
BEGIN
test=*# create table others (i int);
CREATE TABLE
test=*# lock table items;
ERROR:  canceling statement due to lock timeout
test=!# select 1;
ERROR:  current transaction is aborted, commands ignored until end of transaction block
test=!# rollback;
ROLLBACK

Thoughts?

jcoleman avatar Jun 04 '24 21:06 jcoleman

First, it doesn't appear a transaction gets opened at all.

I believe this was just a problem with our test setup. See #101

rkrage avatar Jun 06 '24 17:06 rkrage

But back to the original conversation. I'm starting to think that we shouldn't allow this at all...

If a transaction were to be opened, and lock acquisition didn't succeed on the first attempt, then the whole migration will fail

But if there are blocking transactions, that part of safely_acquire_lock_for_table will loop until all long running queries finish.

So here's a concrete example of one of the risks @jcoleman called out in the docs:

Acquired locks are held until the end of the transaction.

class MyTransactionalMigration < ActiveRecord::Migration[7.1]
  self.disable_ddl_transaction = false

  def up
    safe_make_column_nullable :foo, :column
    safe_make_column_nullable :bar, :column
  end
end

We might successfully take out the lock on foo and quickly return from the first safe_make_column_nullable call, but if there is a long running query against bar we could be blocking for a very long time... and therefore foo would be locked for a very long time. That seems incredibly dangerous.

rkrage avatar Jun 06 '24 21:06 rkrage

Hey folks, sorry for the delay in getting back to ya'll; very busy/hectic summer for me so far. I think that long locking example makes sense, and does go against what this gem is trying to achieve to a large degree. Going to close the PR, appreciate your time/feedback!

ChrisVoxland avatar Jul 26 '24 17:07 ChrisVoxland