dragonfly icon indicating copy to clipboard operation
dragonfly copied to clipboard

Sidekiq Pro's reliable scheduler requires breaking Redis Cluster behavior

Open mperham opened this issue 1 year ago • 7 comments

Redis Lua requires the user to define all keys touched by a Lua script when invoking that script in order to run correctly in Redis Cluster.

The problem is that Sidekiq Pro's reliable scheduler does not know which queue (list) it will touch next when invoked. And Redis allows us to violate that requirement since Sidekiq does not support Redis Cluster.

The Lua scheduler is a simple loop:

  • Take the first job off ZSet "scheduled" where score is < Time.now.to_f.
  • Parse the JSON to get the queue element.
  • Push the job onto that named queue.

As you can see, we know the Lua script will touch scheduled but we don't know the name of the queue because the scheduled set holds future jobs for all queues and the job data itself tells us which queue to mutate.

Unfortunately it looks like DragonflyDB enforces this Lua requirement, making Sidekiq Pro's reliable scheduler feature incompatible. Redis does allow it as long as it is not running in Cluster mode.

mperham avatar Jan 18 '24 20:01 mperham

@mperham Unfortunately, atomicity requirements do not allow Dragonfly to parallelize arbitrary lua code efficiently. We provide compatibility flags that allow running lua script with undeclared variables but this would kill performance of Dragonfly, since such lua script would run under Global Lock.

Can you please explain semantics of reliable scheduler? Why is there a global "scheduled" set if its items are distributed into different queues anyway? Why not have "scheduled" per queue, for example?

romange avatar Jan 19 '24 04:01 romange

The use of a single, global scheduled zset is simply one of legacy. That's how it was originally written 10+ years ago before we even had the SCAN commands and it has never changed. There are 4-5 global structures which have proven a pain to scale because they don't have a Cluster-friendly data model.

mperham avatar Jan 19 '24 17:01 mperham

@mperham Is atomicity required for this script? From your description of the script if the script is not executed from different clients I believe there is no atomicity requirement and if this is the case dragonfly can run a script in non atomic mode which will allow you to not declare the keys in advance.

adiholden avatar Jan 21 '24 08:01 adiholden

@adiholden consider the following scenario:

  1. worker A removes a job from "scheduled" with score 1, then it non-atomically tries to add it to Q1.
  2. worker B removes a job from "scheduled" with score 2, then it also non-atomically tries to add it to Q1.
  3. Worker B succeeds first and now the first job in Q1 has score 2, and the next one has score 1.

romange avatar Jan 21 '24 08:01 romange

@romange If this script runs by multiple clients (as in your example different workers), the atomicity is important

adiholden avatar Jan 21 '24 08:01 adiholden

@adiholden found this comment https://github.com/sidekiq/sidekiq/blob/80f5f73f8e74a5775866a016fe42446dfc1b861e/docs/internals.md?plain=1#L31

@mperham can you confirm that sidekiq-pro maintains at most one thread that calls this script?

romange avatar Jan 21 '24 08:01 romange

can you confirm that sidekiq-pro maintains at most one thread that calls this script?

One thread per process. You can have many Sidekiq processes, all calling the script concurrently. That's why it's written in Lua.

mperham avatar Jan 22 '24 16:01 mperham

The workaround that seemed to work for us script flags 351130589c64523cb98978dc32c64173a31244f3 allow-undeclared-keys

We should do it automatically within Dragonfly. No reason not to do it.

romange avatar Aug 06 '24 13:08 romange

@romange Ideally we could add a #!lua flags=allow-undeclared-keys to SideKiq Pro's scripts directly. This will be future proof for changes in SHAs. Downside is that we'll need to get @mperham's approval, and wait for their next release.

chakaz avatar Aug 06 '24 18:08 chakaz

I can do that. I was planning a patch release soon anyways.

mperham avatar Aug 06 '24 18:08 mperham

I can do that. I was planning a patch release soon anyways.

I synced with @romange offline. If you could do that, for all scripts that (may) access undeclared keys (are there any other than what's discussed here?), that's be great. I'm happy to review if that'd be helpful.

On our side, we'll add the ability to specify SHAs for which we'll use allow-undeclared-keys, mainly to support users until they upgrade their Sidekiq version

chakaz avatar Aug 07 '24 06:08 chakaz

If I add this as the first line of my Lua script:

#!lua flags=allow-undeclared-keys

then I get:

ERR Error compiling script (new function): user_script:1: unexpected symbol near '#' (redis://localhost:6379/14)

when running in Redis. Do you mean:

--#!lua flags=allow-undeclared-keys

mperham avatar Aug 07 '24 14:08 mperham

seems that Redis prevent using #!lua notation that it does not recognize. It's rather unfortunate

romange avatar Aug 07 '24 15:08 romange

Are script tags not backwards compatible with Redis then? I can't use it if it doesn't work transparently with Redis.

mperham avatar Aug 07 '24 15:08 mperham

I believe Redis uses Lua 5.1 and Dragonfly uses Lua 5.4. Was the # comment style added in the newer versions?

mperham avatar Aug 07 '24 15:08 mperham

@mperham yeah, it seems to be a miss on our side. Let me get back to you in a few days.

romange avatar Aug 07 '24 15:08 romange

Sorry, I accidentally closed this as done. #3465 only handles the Dragonfly side. We should still apply the preferred solution of annotating on the Sidekiq side. Reopening.

chakaz avatar Aug 11 '24 11:08 chakaz

@mperham we've made a decision. We modified our Lua parsing to use a Lua-comment style pragma, which will be fully compatible with Valkey / Redis. See more here: #3517 and #3512

May I kindly ask that you add the following to the beginning of your scripts?

--!df flags=allow-undeclared-keys,disable-atomicity

Support for this will only be released with the next version of Dragonfly, but that should not prevent you from putting these pragmas now :)

Please do let me know if there's any issue. Thanks!

chakaz avatar Aug 15 '24 06:08 chakaz

Done and released in Sidekiq Pro 7.3.1.

mperham avatar Aug 15 '24 15:08 mperham