solid_queue icon indicating copy to clipboard operation
solid_queue copied to clipboard

Job with concurrency key blocked when it shouldn't have, eventually was run after `duration` expired

Open leondmello opened this issue 1 year ago • 14 comments

We are observing that our jobs are sometimes unnecessarily blocked. In the following example, for the same concurrency key, there are only 2 jobs. The first one completes immediately in just over a 100 milliseconds (finished_at value). The second one takes over 30 minutes, we are guessing it is blocked by the concurrency control duration. The scheduled_at time of the second job is much after when the first job completes. As per our understanding, the second job should have started immediately. We confirmed that none of the jobs processes bounced during that time.

... indicates redacted data.

Concurrency Config

limits_concurrency to: 1, key: ->(action, params) do
  ...
end, duration: 30.minutes

Queue Config

dispatchers:
    - polling_interval: 1
      batch_size: 500
workers:
    - queues: [<queue of blocked job>, "*"]
      threads: 3
      processes: <%= ENV.fetch("JOB_CONCURRENCY", 1) %>
      polling_interval: 0.1

Console queries

SolidQueue::Job.where("DATE_PART('minutes', finished_at - scheduled_at) >= 30").count
=> 1

SolidQueue::Job.where(concurrency_key: '...').select(:active_job_id, :scheduled_at, :finished_at).order(:finished_at)
=>
[#<SolidQueue::Job:0x00007f062d151a50
  active_job_id: "60916418-e7a3-4f1e-9171-357b6e52c154",
  scheduled_at: "2024-12-16 01:20:09.441199000 +0000",
  finished_at: "2024-12-16 01:20:09.648985000 +0000",
  id: nil>,
 #<SolidQueue::Job:0x00007f062d151910
  active_job_id: "3c4f34da-9db7-430e-9cbe-c1214f98ccb9",
  scheduled_at: "2024-12-16 01:54:49.287841000 +0000",
  finished_at: "2024-12-16 02:24:50.264702000 +0000",
  id: nil>]

Logs about the second job execution

I, [2024-12-16T02:24:50.154361 #84]  INFO -- : [ActiveJob] [...] [3c4f34da-9db7-430e-9cbe-c1214f98ccb9] Performing ... (Job ID: 3c4f34da-9db7-430e-9cbe-c1214f98ccb9) from SolidQueue(...) enqueued at 2024-12-16T01:54:49.287885003Z with arguments: ...
I, [2024-12-16T02:24:50.264573 #84]  INFO -- : [ActiveJob] [...] [3c4f34da-9db7-430e-9cbe-c1214f98ccb9] Performed ... (Job ID: 3c4f34da-9db7-430e-9cbe-c1214f98ccb9) from SolidQueue(...) in 110.35ms

leondmello avatar Dec 16 '24 21:12 leondmello