Feature Request: Schedule Jitter
I might (and often do) have the following requirements:
- I want a job to run every 30 minutes.
- I don't care what the actual time of the execution is, i.e. I don't want/need every such job to run on the 0th and 30th minute of each hour. 14/44 or 22/52 would be fine.
- I'd really prefer that such jobs get spread out.
Prior art: Jenkins has this via a "special" letter H. I didn't realize how useful this was until I met Jenkins, and now I think it should be everywhere. The two basic formats supported are H and H/N for any N:
Hpicks any value in the range, e.g. 0-59 for minutes/seconds, 0-23 for hours, etc.H/Ndescribes an interval, e.g.H/15could be 13/28/43/58 for minutes/seconds,H/8could be 3/11/19 for hours, etc.. IMHO it should be highly recommended to use even divisors, and in their absence I think it's reasonable to not support "rollover memory", i.e. I don't thinkH/31really needs to worry about firing at 00:00, 00:31, 01:02, 01:33, 02:04, 02:35, etc.
For expressions to be calculated reproducibly, they would need a seed. In the HangFire context, that's probably something like a recurringJobId. I'm not sure how you'd want that detail to cross project boundaries -- if you can provide some guidance on that, and this feature sounds like a good idea to you, I'm happy to work/collaborate on this.
If it's not something you'd want to support natively, I'd love any suggestions you might have on how this could be implemented from the outside.
I agree this feature would be very useful. However, as far as I understand, Jenkins provides a stable value when creating such a cron expression, for example, 0 H * * * expression is created, and a value for H is chosen (for example, 23), it remains the same in the future. Am I right?
That's right. That's why I was referencing the need for a stable seed, which is the basic approach Jenkins uses.
Yes, in this case I think it's a good idea to implement this in Cronos itself, because it knows all the internal details of cron expressions. I think we can modify the diagram as follows to introduce the H special character (and may be refactor it later by merging step-related options to have them in one place), pass the seed to the CronExpression.Parse method directly by creating an overload and implement a new method that uses it when parsing the cron expression. It will simply use a random number based on this seed, when calculating bits.
cron ::= expression | macro
expression ::= [second space] minute space hour space day-of-month space month space day-of-week
second ::= field
minute ::= field
hour ::= field
day-of-month ::= '*' [step] | '?' [step] | 'H' [step] | lastday | value [ 'W' | range [list] ]
month ::= field
day-of-week ::= '*' [step] | '?' [step] | 'H' [step] | value [ dowspec | range [list] ]
macro ::= '@every_second' | '@every_minute' | '@hourly' | '@daily' | '@midnight' | '@weekly' | '@monthly'|
'@yearly' | '@annually'
field ::= '*' [step] | '?' [step] | 'H' [step] | value [range] [list]
list ::= { ',' value [range] }
range ::= '-' value [step] | [step]
step ::= '/' number
value ::= number | name
name ::= month-name | dow-name
month-name ::= 'JAN' | 'FEB' | 'MAR' | 'APR' | 'MAY' | 'JUN' | 'JUL' | 'AUG' | 'SEP' | 'OCT' | 'NOV' | 'DEC'
dow-name ::= 'SUN' | 'MON' | 'TUE' | 'WED' | 'THU' | 'FRI' | 'SAT'
dowspec ::= 'L' | '#' number
lastday ::= 'L' ['-' number] ['W']
number ::= digit | number digit
space ::= ' ' | '\t'
Would you like to create a PR for this?
I would love to take a crack at it, and I appreciate the quick feedback. I need a week or two before I can get started due to some other commitments, but I'll give you a heads up once I pick it up.
Two questions, as I've started to look a little deeper into the codebase:
Macros
What are your thoughts on macros using jitter by default, e.g. @hourly becomes H * instead of 0 *? Personally, I'm a huge fan, but it's a breaking change (both from current behavior and Vixie cron) and I don't know how you feel about either of those. This would put it inline with Jenkins behavior, FWIW.
I could:
- Change the definitions of the existing macros, which would be a breaking change. Callers who want the prior behavior could just translate
@hourlyback to0 *, though, so it's not a huge loss. - Support new macros with a
_distributedsuffix, e.g.,@hourly_distributed(or similar), leaving the others alone. I think jitter as a default behavior is preferable, but at least this offers support without being a breaking change. - Support new macros with a
_zerosuffix, e.g.,@hourly_zero(or similar) to represent current behavior, and change the definitions of the existing macros to include jitter. Again, a breaking change, but in my mind a better default experience for most use cases.
Feel free to noodle on it, it'll still take me a while before I'm able to get to that part of it.
Jitter evaluation time
I also wanted to confirm your direction under this guidance:
pass the seed to the CronExpression.Parse method directly by creating an overload and implement a new method that uses it when parsing the cron expression
We can evaluate the jitter at parse time, effectively turning H H * * * into 38 11 * * * for some example seed. The ToString() would then reflect not the original expression but its evaluated outcome for the seed it was passed at the time (that it has forgotten). This limitation seems to be already present for steps, whose original definition is already lost, too.
The alternative is to leave knowledge of the ask for jitter in the expression as a CronExpressionFlag so that we could continue to use the H in the ToString(), and then calculation would take effect at GetNextOccurrence time. I prefer retaining the knowledge of jitter inside the expression, but I think it represents a significant departure from the design where each CronField basically just boils down to a bit set of enabled values, and my read of the code
Of course, the Dumbest Thing That Could Possibly Work to support parse-time evaluation with accurate ToString() would be to store the original expression inside the field as a string and just use that directly for ToString() purposes -- not sure if you'd be OK with the additional memory usage on that front, since this lib is clearly paying attention to perf, but it would fix the issue for steps, too.
Thoughts?
What are your thoughts on macros using jitter by default, e.g. @hourly becomes H * instead of 0 *? Personally, I'm a huge fan, but it's a breaking change (both from current behavior and Vixie cron) and I don't know how you feel about either of those. This would put it inline with Jenkins behavior, FWIW.
I think we should throw an exception, when cron expression contains the H character, but no seed is provided to avoid providing false expectations. Alternatively, it's possible to substitute H with 0 when there's no seed, and in this case we can update macros as well, but I'm not sure about this change. I like the idea with throwing an expression, because in this case we will be able to change the decision later without producing breaking changes.
What do you think?
We can evaluate the jitter at parse time, effectively turning H H * * * into 38 11 * * * for some example seed. The ToString() would then reflect not the original expression but its evaluated outcome for the seed it was passed at the time (that it has forgotten). This limitation seems to be already present for steps, whose original definition is already lost, too.
I think that ToString return value can indicate the actual values that will be used for H, and there's no need to match the original cron expression. It's very difficult to re-construct the original one, and it's always possible for a user to also store the original expression string when required.
Maybe there are existing open-source implementations in other languages for this feature.
I think we should throw an exception, when cron expression contains the H character, but no seed is provided to avoid providing false expectations.
The more I thought about it, the more I came to that conclusion, too. This lib isn't really in a position to log/warn.
Alternatively, it's possible to substitute H with 0 when there's no seed, and in this case we can update macros as well, but I'm not sure about this change. I like the idea with throwing an expression, because in this case we will be able to change the decision later without producing breaking changes.
Couldn't we mix and match this such that
- If you use
Hwithout a seed, wethrow. New behavior, non-breaking. - If you use an
Hwith a seed, we schedule the jitter. New behavior, non-breaking. - If you use a macro without a seed, we schedule without jitter. Existing behavior, non-breaking.
- If you use a macro with a seed, we schedule the jitter. New behavior, non-breaking.
Arguably, you as a caller asked for it. I think I like this behavior best, but am happy to think it over a little more. It's still opt-in but with the most minimal amount of effort.
To continue the thought experiment, I would be primarily consuming these changes through HangFire, where something like recurringJobId.GetHashCode() would be what I'd guess would be the most likely seed. How would you think of exposing the ability to opt into jitter from a Hangfire standpoint?
Would you want something universal like a bool RecurringJobOptions.UseJitter = false that covers those 4 cases, throwing in the first use case above if not explicitly set to true? It would easily cover the macro use cases in a cohesive fashion.
You could go further and auto-enable it when cronExpression.Contains('H'). I think I like that best as a HangFire user:
Hwithout a seed isn't really possible from the HangFire DX, but it's an illogical request thatthrows from Cronos's perspective.Hwill always come with a seed via HangFire, and Cronos does the right thing.- Macro on its own with the default of
UseJitter = falsefrom the HangFire perspective is completely backwards-compatible and doesn't violate anyone's expectations. Macro without a seed would be the same comments from Cronos's perspective. - Macro with
UseJitter = truedoes the right thing and is easy to opt into from HangFire's perspective; same comments from Cronos's lens.
So yeah, I think that's what I'd like to see. Sharing the full thought process out loud, please critique wherever you feel like it.
I think that [for]
ToString... there's no need to match the original cron expression.
I took a look through how HangFire interacts with Cronos, and it dawned on me that the caller who invokes CronParser.Parse already has the original, they don't need it back from Cronos, so yeah, I agree.