sidekiq-unique-jobs
sidekiq-unique-jobs copied to clipboard
PRIMED key grows indefinitely when job re-queues itself in `after_unlock` callback
Describe the bug
PRIMED
key grows indefinitely for jobs that requeue themselves in after_unlock
callback if the job is completed and re-queued again within 1 second of when the initial job was queued.
Expected behavior
PRIMED
key is removed or stays at a small length when job is locked or unlocked.
Current behavior
PRIMED
key grows indefinitely, with "1"
as a value when job is re-queued within 1 second of initially being queued.
Worker class
class AfterUnlockWorker
include Sidekiq::Job
sidekiq_options lock: :until_executed
def perform(args)
@args = args
end
def after_unlock
self.class.perform_async(@args)
end
end
Additional context Sidekiq Version: 6.5.8 SidekiqUniqueJobs Version: 7.1.29
Rspec Example:
# frozen_string_literal: true
require "rspec"
require "sidekiq"
require "sidekiq-unique-jobs"
require "sidekiq_unique_jobs/testing"
Sidekiq.configure_client do |config|
config.client_middleware do |chain|
chain.add(SidekiqUniqueJobs::Middleware::Client)
end
end
RSpec.configure do |config|
config.before(:each) do
Sidekiq::Worker.clear_all
Sidekiq::Testing.server_middleware do |chain|
chain.add(SidekiqUniqueJobs::Middleware::Server)
end
end
end
RSpec.describe "PRIMED key spec" do
it "grows PRIMED key indefinitely" do
Sidekiq::Testing.fake!
AfterUnlockWorker.perform_async(1)
lock_digest = AfterUnlockWorker.jobs.first["lock_digest"]
primed_key_length = -> { Sidekiq.redis { |r| r.llen("#{lock_digest}:PRIMED") } }
primed_key_values = -> { Sidekiq.redis { |r| r.lrange("#{lock_digest}:PRIMED", 0, -1) }.uniq }
100.times do |i|
AfterUnlockWorker.perform_one
puts "Test #{i} - Length of PRIMED key #{primed_key_length.call}" if i % 10 == 0
end
puts "Test Finished - Length of PRIMED key #{primed_key_length.call}"
puts "Test Finished - Unique PRIMED key values #{primed_key_values.call}"
sleep(1)
puts "Test Finished - Length of PRIMED key after 1 second sleep #{primed_key_length.call}"
end
end
Output from running the above spec:
Test 0 - Length of PRIMED key 1
Test 10 - Length of PRIMED key 11
Test 20 - Length of PRIMED key 21
Test 30 - Length of PRIMED key 31
Test 40 - Length of PRIMED key 41
Test 50 - Length of PRIMED key 51
Test 60 - Length of PRIMED key 61
Test 70 - Length of PRIMED key 71
Test 80 - Length of PRIMED key 81
Test 90 - Length of PRIMED key 91
Test Finished - Length of PRIMED key 100
Test Finished - Unique PRIMED key values ["1"]
Test Finished - Length of PRIMED key after 1 second sleep 0
This looks to be because when unlock
LUA script on a job, it pushes "1"
to the QUEUED
key:
https://github.com/mhenrixon/sidekiq-unique-jobs/blob/main/lib/sidekiq_unique_jobs/lua/unlock.lua#L95-L96
I unfortunately don't have enough context around the locking/unlocking algorithm to know if this is actually necessary or not.
I will say that after doing some more reading, Redis suggests a much simpler algorithm for setting up Distributed locking for a single master redis instance (which I think most people are running).
I'd love to hear about the reasoning behind why the QUEUED
and PRIMED
keys were added.
Yes, redis suggests a simpler algorithm, but I needed more, so I dreamed up my own algorithm.
People requested that they lock only during runtime, not during queuing, meaning sidekiq picks the job up, runs it through the server middleware, and the server constrains any duplicates from running simultaneously.
People also requested that the locks stay active as long as the job is in Sidekiq. If you want something simpler, I recommend https://github.com/rwojsznis/sidekiq-lock or sidekiq-pro if you have the money.
I don't think you should schedule a new job as part of the old one still running. Sidekiq needs to be done processing the current job when you push the next one. Preferably, you should do that outside of this "Redis transaction" or "unit of work."
The queued and primed keys are used to block as part of processing the job. The push of "1" is accompanied by an expiration to ensure that listeners on the queued key can keep listening.
There are things I can do to improve your situation, but you know you are doing an infinite recursion, right?