CSV task fails if uuid is used for active_storage_attachments.record_id
Background
We use uuid primary keys, so active_storage_attachments.record_id is a uuid column. maintenance_tasks_runs.id, on the other hand, is a bigint. As a result, when a CSV task tries to upload its CSV as an ActiveStorage::Attachment, the attachment fails to reference the CSV task's run record (MaintenanceTasks::Run) because of the type mismatch.
Possible solutions
- Document that the generated migration file must be manually edited when uuid is used.
- Respect `g.orm :active_record, primary_key_type: :uuid` when generating the migration file.
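For the first option, the manual edit would amount to changing the primary key type in the copied install migration by hand. A rough sketch, assuming a Postgres database with uuid support (the column list here is illustrative, not the gem's full migration):

```ruby
# Sketch: hand-edited install migration so that maintenance_tasks_runs
# gets a uuid primary key matching the rest of the app's tables.
class CreateMaintenanceTasksRuns < ActiveRecord::Migration[7.0]
  def change
    create_table :maintenance_tasks_runs, id: :uuid do |t|
      t.string :task_name, null: false
      t.datetime :started_at
      t.datetime :ended_at
      t.timestamps
    end
  end
end
```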
Thanks in advance.
Unfortunately, simply changing maintenance_tasks_runs.id to uuid did not work, because of the following code:
https://github.com/Shopify/maintenance_tasks/blob/e64cdf0c265f67dd3a487c5fb0db8dc9df5464d4/app/models/maintenance_tasks/task_data.rb#L44
Thanks for the report. Indeed we'd want to support primary_key_type: :uuid, but not sure how to deal with this query yet.
The project only uses SQLite for testing. It seems the first step would be to test against both SQLite and Postgres. Is this the right direction, @etiennebarrie?
Yes, if it's necessary. I wonder if we could just test UUIDs with sqlite?
Mmm, maybe we could use blob as a primary key and store uuid. I'll take a look.
I tried using the string type as a primary key, but it seems this comes with a lot of trouble because we cannot rely on the database to generate ids. Personally, testing against both SQLite and Postgres seems the way to go because it truly tests type: :uuid. @etiennebarrie What do you think?
Changed primary keys to uuid, ran tests, and found that the current code depends on monotonically increasing ID more than I thought. For example, Run.last or Run.where("id < ?", @cursor). It would be better to change incrementally. The first step should be to stop depending on monotonically increasing IDs. Does this make sense?
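To make the incremental idea concrete: patterns like `Run.where("id < ?", @cursor)` can be replaced by a cursor over the (created_at, id) pair, which stays correct even when ids are not ordered. A pure-Ruby illustration of the tuple comparison (not the gem's actual code; `Run` here is a stand-in struct):

```ruby
# Illustration only: paginate runs by a (created_at, id) cursor instead of
# relying on ids being monotonically increasing. Ruby's Array#<=> compares
# tuples element by element, which mirrors SQL row-value comparison.
Run = Struct.new(:id, :created_at, keyword_init: true)

def page_before(runs, cursor, limit)
  runs
    .select { |r| ([r.created_at, r.id] <=> [cursor.created_at, cursor.id]).negative? }
    .sort_by { |r| [r.created_at, r.id] }
    .last(limit)
    .reverse
end
```

Ties on created_at are broken by id, so the cursor is still total even with non-unique timestamps.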
I'm not sure which approach we should take:
- stop depending on monotonically-increasing IDs
- make every dependency on monotonically-increasing IDs go through an abstraction, so we can change the behavior when using UUIDs while keeping it as-is for existing users.
For the first solution: from your investigation, do we need a migration, or can we rely on the existing (task_name, created_at) index? It seems that would be the case for last run and pagination (runs#show), but for the query you mentioned above (https://github.com/Shopify/maintenance_tasks/issues/462#issuecomment-888856433), is that the case too?
If we can avoid a migration, then the first solution is great: we don't need two code paths for UUIDs. But if we need more work for existing users, I think the second solution would be preferable.
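For the second option, the branching could be confined to a couple of scopes so that callers never compare ids directly. A hypothetical sketch (the scope names are made up for illustration):

```ruby
# Hypothetical abstraction point: callers ask for "most recent" or
# "started before this run" and never touch id ordering themselves.
# A uuid code path could then redefine these two scopes without
# touching any callers.
scope :newest_first, -> { order(created_at: :desc, id: :desc) }
scope :started_before, ->(run) do
  # Row-value comparison; supported by PostgreSQL and SQLite >= 3.15.
  where("(created_at, id) < (?, ?)", run.created_at, run.id)
end
```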
In these situations, I tend to do a spike so that I can see all the moving parts, even if I end up shipping the change incrementally. I would favor having a branch where we can see everything we will need, even if things aren't perfect or tests fail.
Thanks for the advice.
> If we can avoid a migration, then the first solution is great: we don't need two code paths for UUIDs. But if we need more work for existing users, I think the second solution would be preferable.
Honestly, I'm not sure whether we can avoid a migration or not.
> In these situations, I tend to do a spike so that I can see all the moving parts, even if I end up shipping the change incrementally. I would favor having a branch where we can see everything we will need, even if things aren't perfect or tests fail.
OK. I'm going to open a PR that simply changes the pk type to uuid so we can discuss it in more detail. Then I'll open several PRs against that first PR's branch, so the individual PRs stay small (though the final merge might end up large).
I found myself in this situation and decided to hack around it with some monkey-patching, so I thought I'd offer this to anyone else who has a similarly short-sighted need.
I'm using Zeitwerk overrides as described here: https://guides.rubyonrails.org/classic_to_zeitwerk_howto.html#decorating-classes-and-modules-from-engines
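For anyone following along, the override-loading setup from that guide looks roughly like this (paths follow the guide's `app/overrides` convention):

```ruby
# config/application.rb — make Zeitwerk ignore app/overrides and load the
# override files by hand on each reload, per the Rails
# classic-to-zeitwerk HOWTO.
overrides = "#{Rails.root}/app/overrides"
Rails.autoloaders.main.ignore(overrides)

config.to_prepare do
  Dir.glob("#{overrides}/**/*_override.rb").sort.each do |override|
    load override
  end
end
```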
It uses ActiveRecordExtended for a window function and a CTE because I was already using it. I'm sure we wouldn't want to pull that dependency into the gem, but this could certainly be written without it.
```ruby
# app/overrides/maintenance_tasks/run_override.rb
module MaintenanceTasks
  class Run < ApplicationRecord
    scope :last_completed_per_task, -> do
      window_cte = define_window(:ended_at_window)
        .partition_by(:task_name, order_by: { ended_at: :desc })
        .select(:id)
        .select_window(:row_number, over: :ended_at_window, as: :ended_position)
      completed
        .with(run_ended_positions: window_cte)
        .joins("JOIN run_ended_positions ON run_ended_positions.id = maintenance_tasks_runs.id")
        .where(run_ended_positions: { ended_position: 1 })
    end
  end
end
```
```ruby
# app/overrides/maintenance_tasks/task_data_index_override.rb
module MaintenanceTasks
  class TaskDataIndex
    class << self
      def available_tasks
        tasks = []
        task_names = Task.available_tasks.map(&:name)

        active_runs = Run.with_attached_csv.active.where(task_name: task_names)
        active_runs.each do |run|
          tasks << TaskDataIndex.new(run.task_name, run)
          task_names.delete(run.task_name)
        end

        last_runs = Run.with_attached_csv.last_completed_per_task.where(task_name: task_names)
        task_names.each do |task_name|
          last_run = last_runs.find { |run| run.task_name == task_name }
          tasks << TaskDataIndex.new(task_name, last_run)
        end

        # We add an additional sorting key (status) to avoid possible
        # inconsistencies across database adapters when a Task has
        # multiple active Runs.
        tasks.sort_by! { |task| [task.name, task.status] }
      end
    end
  end
end
```
The meat of the change above is substituting the old:

```ruby
completed_runs = Run.completed.where(task_name: task_names)
last_runs = Run.with_attached_csv.where(id: completed_runs.select("MAX(id) as id").group(:task_name))
```

for the new scope:

```ruby
last_runs = Run.with_attached_csv.last_completed_per_task.where(task_name: task_names)
```
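As a rough idea of what a dependency-free version could look like, the window function can be written as a raw subquery (PostgreSQL syntax; the completed-status filter below is a simplification of the gem's `completed` scope, so treat it as a sketch rather than a drop-in):

```ruby
# Sketch of last_completed_per_task without ActiveRecordExtended: rank each
# task's runs by ended_at with ROW_NUMBER() and keep only the top-ranked
# run per task. The status list is an assumption about the gem's
# completed statuses.
LAST_COMPLETED_PER_TASK_SQL = <<~SQL.freeze
  SELECT id FROM (
    SELECT id,
           ROW_NUMBER() OVER (
             PARTITION BY task_name ORDER BY ended_at DESC
           ) AS ended_position
    FROM maintenance_tasks_runs
    WHERE status IN ('succeeded', 'errored', 'cancelled')
  ) ranked
  WHERE ranked.ended_position = 1
SQL
```

The scope would then be something like `where("id IN (#{LAST_COMPLETED_PER_TASK_SQL})")`, at the cost of the query no longer composing with other relations.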
The only other change is the migration itself:
```ruby
# db/migrate/###_maintenance_tasks_runs_uuid_pkey.rb
class MaintenanceTasksRunsUuidPkey < ActiveRecord::Migration[7.0]
  def change
    # The lambda is needed so Postgres receives the function as a default
    # expression rather than the literal string "gen_random_uuid()".
    add_column :maintenance_tasks_runs, :uuid, :uuid,
      default: -> { "gen_random_uuid()" }, null: false
    change_table :maintenance_tasks_runs do |t|
      t.remove :id # dropping the column also drops the old primary key
      t.rename :uuid, :id
    end
    execute "ALTER TABLE maintenance_tasks_runs ADD PRIMARY KEY (id);"
  end
end
```
And now I can run CSV tasks. No tests, no promises, works for me.
Can we not refactor to use run timestamps in all of these cases? If someone is interested in issuing a PR where we use created_at and not id, I think that would solve both problems identified in this issue. There may be further issues, but IMO refactoring to not depend on incrementing IDs is better in the long run.
This issue has been marked as stale because it has not been commented on in two months. Please reply in order to keep the issue open. Otherwise, it will close in 14 days. Thank you for contributing!
Any movement on this? We're still impacted.
Are you interested in contributing? I think that with the (task_name, status, created_at DESC) index, we could change this query to use timestamps, as @gmcgibbon said. The main blocker for us maintainers is that we don't use UUIDs as primary keys, so we're not really confident contributing this ourselves, but as mentioned above, it's something we want to support since it's in Rails/Active Record.
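A pure-Ruby illustration of that timestamp-based selection (hypothetical in-memory data, not the gem's query):

```ruby
# Illustration: pick the latest completed run per task by created_at,
# the in-memory analogue of replacing MAX(id) with MAX(created_at)
# in the grouped query.
def last_completed_per_task(runs)
  runs.select { |r| r[:status] == :succeeded }
      .group_by { |r| r[:task_name] }
      .transform_values { |rs| rs.max_by { |r| r[:created_at] } }
end
```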
My team is currently pretty heads down, but we could potentially look at it later this year. If this issue gets closed as stale, can we reopen it later in the year when we're able to better contribute?