caseflow icon indicating copy to clipboard operation
caseflow copied to clipboard

jcroteau/APPEALS-43428 - Replace `Caseflow::Migration` for concurrent index migrations on ActiveRecord 6.0+

Open jcroteau opened this issue 10 months ago • 1 comments

  • [ ] 🔴 Pre-release Note: Update Caseflow::Migration deprecation cut-off version to latest migration version before testing and release.

This PR is part of a 4 PR stack:

  1. https://github.com/department-of-veterans-affairs/caseflow/pull/21286
  2. [This PR]
  3. https://github.com/department-of-veterans-affairs/caseflow/pull/21453
  4. https://github.com/department-of-veterans-affairs/caseflow/pull/22181

Resolves https://jira.devops.va.gov/browse/APPEALS-43428

Description

Background

By default, Postgres locks writes (but not reads) to a table while creating an index on it. That can result in unacceptable downtime during a production deploy. On a large table, indexing can take hours.

(For further details on this problem and its workaround in Rails projects, you can refer to this article.)

In caseflow, we have a Caseflow::Migration utility class, whose intended purpose is to make it more convenient to add DB indexes concurrently (i.e. without locking write operations). This is often needed because we have some rather large tables, so building new indexes on them can take a long time.
 
It does this in two ways:

  1. It disables transactions during the migration (by overriding ActiveRecord::Migration #disable_ddl_transaction to always be true). This is necessary to prevent table locking.
  2. It defines a utility method to be used in place of add_index for adding the index concurrently (Caseflow::Migration #add_safe_index). This method prevents timeouts under 30 minutes and also performs a rollback should an error occur while adding the index (since transactions are disabled).

Since it turns off transactions, we should NOT be using this class for migrations in general, but only for those migrations that only add indexes.

  • There are many examples where we use it for making changes other than adding indexes, and these were likely misguided.
  • We should not be adding indexes and mixing in other database changes within the same migration.
  • We should be adding indexes concurrently in their own, isolated migrations.

Problems

  1. Our Caseflow::Migration class inherits from ActiveRecord::Migration[5.1]. Now that we are Rails 6.0 / ActiveRecord 6.0, all new migrations should inherit from ActiveRecord::Migration[6.0].
  2. There are numerous examples of past migrations that have used Caseflow::Migration inappropriately, either accidentally or due to developer negligence. This is potentially problematic, as this opens the door to failing migrations without transaction rollbacks.

Proposed Solution

  1. Introduce a module to supersede Caseflow::Migration, to be included in new migrations that add indexes concurrently.
    • Caseflow::Migration and past migrations that inherit from it will be left unchanged for posterity.
  2. Deprecate Caseflow::Migration for migrations AFTER version 20240227154315.
  • Running Caseflow::Migration for migrations BEFORE version 20240227154315 should still be successful.
  • Attempting to run a migration using Caseflow::Migration AFTER version 20240227154315 results in an instructive deprecation error. 
  1. Update the configuration for the StrongMigrations gem such that attempting to run a migration that adds an index non-concurrently (via add_index) fails with an instructive error message.

Acceptance Criteria

  • A new module exists to aid in migrations for adding indexes concurrently and which supersedes Caseflow::Migration
  • Attempting to run a migration using Caseflow::Migration BEFORE cut-off version is still successful.
  • Attempting to run a migration using Caseflow::Migration ON the cut-off version is still successful.
  • Attempting to run a migration using Caseflow::Migration AFTER cut-off version results in an instructive deprecation error.
  • Attempting to run a migration that adds an index non-concurrently fails with an instructive error message.

Testing Plan

  • See Jira Test https://jira.devops.va.gov/browse/APPEALS-43648

jcroteau avatar Apr 04 '24 18:04 jcroteau

Code Climate has analyzed commit 88656aec and detected 1 issue on this pull request.

Here's the issue category breakdown:

Category Count
Performance 1

View more on Code Climate.

codeclimate[bot] avatar Apr 04 '24 18:04 codeclimate[bot]

Changes to be released in https://github.com/department-of-veterans-affairs/caseflow/pull/22780

jcroteau avatar Sep 13 '24 19:09 jcroteau