timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

Problem: Passing an ActiveRecord::Relation to `create_continuous_aggregate` in a migration will execute the query

Open exterm opened this issue 7 months ago • 2 comments

We have what we think is a pretty standard migration that times out in production:

def up
  # Using ActiveRecord unscoped to bypass default scope with Current.organization
  query = Events.unscoped
    .select(
      # The `time_bucket` in the query defines the smallest interval for data aggregation (1h)
      "time_bucket(INTERVAL '1 hour', received_time_utc) AS bucket,
      device_id,
      organization_id,
      MAX(received_time_utc) AS max_received_time"
    )
    .group("bucket, device_id, organization_id")

  options = {
    materialized_only: false, # real-time aggregate
    create_group_indexes: true, # https://docs.timescale.com/use-timescale/latest/continuous-aggregates/create-index/#automatically-created-indexes
    finalized: true, # the "new" format, https://docs.timescale.com/use-timescale/latest/continuous-aggregates/migrate/#migrate-a-continuous-aggregate-to-the-new-form
    refresh_policies: {
      start_offset: "INTERVAL '1 day'", # how far back timescaledb looks for when refreshing data => from 1 day ago
      end_offset: "INTERVAL '1 hour'", # when it stops => up to 1h ago
      schedule_interval: "INTERVAL '1 hour'" # how frequently the data is recalculated => once every hour
    }
  }
  create_continuous_aggregate(:events_latest_time_agg, query, **options)

  # dropping entries from aggregate after 1 week
  create_retention_policy(:events_latest_time_agg, drop_after: "7 days")
end

Now why would this time out in production? create_continuous_aggregate runs WITH NO DATA by default, right?

Well it took me a while to get to the bottom of this.

It's due to how Rails logs schema statements during migrations.

Look at this log output for the above migration:

-- create_continuous_aggregate(:events_latest_time_agg, #<ActiveRecord::Relation [
  #<Event bucket: "2025-04-19 06:00:00.000000000 +0000", device_id: "571cfa0a-5b53-432c-a441-dc65654c0c92", organization_id: 1, max_received_time: "2025-04-19 06:58:11.804000000 +0000">,
  #<Event bucket: "2025-04-20 02:00:00.000000000 +0000", device_id: "c83b47a8-0424-48a8-8a7e-41c095b6723d", organization_id: 1, max_received_time: "2025-04-20 02:59:52.289000000 +0000">,
  #<Event bucket: "2025-04-19 15:00:00.000000000 +0000", device_id: "65af0e1e-2b4f-45d7-9a2a-ae36327f13d8", organization_id: 1, max_received_time: "2025-04-19 15:59:29.780000000 +0000">,
  #<Event bucket: "2025-05-01 14:00:00.000000000 +0000", device_id: "e826599f-977a-4ac7-9d2c-32afa9b8d80d", organization_id: 1, max_received_time: "2025-05-01 14:59:12.970000000 +0000">,
  #<Event bucket: "2025-04-19 07:00:00.000000000 +0000", device_id: "78e86010-c495-4bb4-bf65-dd00a3fc3109", organization_id: 1, max_received_time: "2025-04-19 07:59:01.279000000 +0000">,
  #<Event bucket: "2025-04-19 08:00:00.000000000 +0000", device_id: "7522e94f-6db0-4381-ad64-df04ce5ed514", organization_id: 1, max_received_time: "2025-04-19 08:58:19.505000000 +0000">,
  #<Event bucket: "2025-04-19 21:00:00.000000000 +0000", device_id: "96c584ef-fd6b-4284-b48f-2e70121880be", organization_id: 1, max_received_time: "2025-04-19 21:59:19.072000000 +0000">,
  #<Event bucket: "2025-04-19 12:00:00.000000000 +0000", device_id: "2ea542ae-d576-4e3a-be81-64ee7e992d54", organization_id: 1, max_received_time: "2025-04-19 12:58:04.081000000 +0000">,
  #<Event bucket: "2025-05-01 12:00:00.000000000 +0000", device_id: "36202b8e-7300-4d7c-ad5a-475807b3b326", organization_id: 1, max_received_time: "2025-05-01 12:58:06.169000000 +0000">,
  #<Event bucket: "2025-04-17 23:00:00.000000000 +0000", device_id: "fa5442ba-e8c5-44cb-9078-a8034aecd08e", organization_id: 1, max_received_time: "2025-04-17 23:58:39.450000000 +0000">, ...]>,
  {materialized_only: false, create_group_indexes: true, finalized: true, refresh_policies: {start_offset: "INTERVAL '1 day'", end_offset: "INTERVAL '1 hour'", schedule_interval: "INTERVAL '1 hour'"}})

Rails logs the schema statements called, including their parameters. To log the query parameter, it calls query.inspect, which executes the query. inspect will only fetch the first 11 buckets, but just calculating one bucket can take a long time for some devices.

So, the fixed version of the migration just replaces

create_continuous_aggregate(:events_latest_time_agg, query, **options)

with

create_continuous_aggregate(:events_latest_time_agg, query.to_sql, **options).

Change Request

Although this will only happen in rare cases, it is an excellent footgun. I suggest to at least mention this problem in documentation, or, even better, change create_continuous_aggregate to only ever take string queries. It is not too much to ask of the user to call to_sql themselves, and it prevents this problem from happening.

exterm avatar May 01 '25 19:05 exterm

Thanks a lot for the feedback @exterm. Let's build a validation of the argument type and avoid such scenarios.

jonatas avatar May 05 '25 13:05 jonatas