sidekiq-throttled
Unthrottled keys get stuck behind big queues
Environment:
gem 'sidekiq', '6.1.2'
gem 'sidekiq-throttled', '0.13.0'
How to reproduce:
Set sidekiq concurrency to 25
class MockWorker
  include Sidekiq::Worker
  include Sidekiq::Throttled::Worker

  MY_OBSERVER = lambda do |strategy, *args|
    puts "@@@@ THROTTLING #{strategy} #{args}"
  end

  sidekiq_throttle({
    :observer => MY_OBSERVER,
    :concurrency => {
      :limit => 1,
      :key_suffix => Proc.new { |acc_id| acc_id }
    }
  })

  def perform(account_id)
    puts "===> #{Time.now.to_s} Starting job (account: #{account_id})"
    sleep 10
    puts "===> #{Time.now.to_s} Finished! (account: #{account_id})"
  end
end
Then enqueue 100 jobs with the same account_id, and a single job with a different account_id:
100.times {|n| MockWorker.perform_async(1) }
MockWorker.perform_async(2)
Behavior
You will notice that account 2's job stays enqueued for a while, up to 30-60 seconds, even though its key shouldn't be throttled at all. Basically, once the queue grows to 50-60 jobs, you start noticing latency on jobs that shouldn't be throttled.
I could imagine it being a fetch-size issue, where Sidekiq or the throttler doesn't get to account 2's job until much later and so never has a chance to decide whether it should be throttled or not.
Keep in mind I have even tried reducing the poll interval to as little as 1 second, with no luck:
Sidekiq.configure_server do |config|
  config.average_scheduled_poll_interval = 1
end
Is anyone else experiencing this?
Bump
@spazer5 Here's an old explanation from one of the authors of how throttling currently works in this library.
tl;dr:
It gets pushed back to the end of the queue it was retrieved from. And that queue is removed from the queues to poll for 2 seconds.
https://github.com/sensortower/sidekiq-throttled/issues/52#issuecomment-412911238
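To make that concrete, here is a minimal, illustrative sketch in plain Ruby of the requeue-and-pause behavior described above. The class and method names are mine, not the gem's actual implementation, and `throttled_check` stands in for the real strategy lookup:

# Toy model (not the gem's real code) of the fetch behavior described above:
# a throttled job is pushed back to the END of its queue, and that queue is
# excluded from polling for ~2 seconds.
class NaiveThrottledFetch
  PAUSE = 2.0 # seconds a queue stays out of rotation after a throttled pop

  def initialize(queues)
    @queues = queues                # e.g. { "default" => [job1, job2, ...] }
    @paused_until = Hash.new(0.0)
  end

  # Returns the next workable job, or nil if every queue is paused or empty.
  def retrieve_work(throttled_check)
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)

    @queues.each do |name, jobs|
      next if @paused_until[name] > now     # queue is still cooling down
      job = jobs.shift
      next unless job

      if throttled_check.call(job)
        jobs.push(job)                      # back to the end of the queue...
        @paused_until[name] = now + PAUSE   # ...and skip this queue for a while
        next
      end

      return job
    end

    nil
  end
end

With a single queue mostly full of throttled jobs (as in the repro above), an unthrottled job sitting behind them is only reached once the fetcher has worked its way past the backlog, one requeue-and-pause cycle at a time, which would match the 30-60 second latency reported.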
It seems that this behavior might be affecting our throughput on some shared queues, so we are looking to better understand it too. PR #80, released in 0.12.0, allows consumers to define how long the queue should pause, so I assume one could set this parameter to 0 to "disable" that behavior and potentially speed up processing. I haven't tested that myself to validate it, nor looked closely at the Redis implications.
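Roughly, a sketch of what that might look like; note that I'm writing the option key from memory, so treat :throttled_queue_cooldown as an assumption and double-check PR #80 for the exact name:

Sidekiq.configure_server do |config|
  # ASSUMPTION: option key recalled from memory of PR #80; verify before relying on it.
  # 0 would disable the pause entirely (see the warning in the next comment).
  config.options[:throttled_queue_cooldown] = 0
end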
@ixti Has anything changed since the explanation above?
Just a warning for anyone who stumbles on this in the future: if you set the cooldown to 0 and all of the jobs in your queues are throttled, Sidekiq will pop throttled jobs as fast as possible and re-enqueue them (since they are throttled), which can cause high CPU load on the Redis/Sidekiq servers. I'm assuming the cooldown was added to prevent thrashing like this.
Yes, the cooldown was added to avoid thrashing Redis. But I'm thinking about a better dynamic cooldown (based on statistics of skips); it will be part of the 1.0.0 release.
I have completely removed the cooldown in v1.0.0.alpha; I will introduce a simple way to bring it back for those who actually need it.