add support for hash scheduling
This PR adds support for Jenkins-style hash schedule spec such as
0 0 H * * *
H is replaced by something derived from job name and moduled to its position within the spec (such as H mod 60 for minutes but H mod 24 for hour).
This allows for easily scheduling many daily jobs with same cron spec while still mildly balancing them throughout the day.
I tried to do it directly in the dkron Scheduler but it gets tricky due to the need of a second parameter that is not part of cron.Scheduler interface.
Also, the hash itself is actually just summing up the character values for simplification. It could be improved with an actual hashing method with less collisions and better distribution but in the end it would still use modullus 7, 12, 24, 60 causing many collisions.
As this is not for security but only mildly load balancing, I left as is.
Let me have feedback!
I'll try and have a closer look this weekend, but at the very least I'd like to see some accompanying unit tests. Also, please have a look at the discussion in #866 in case you haven't yet.
Hi @yvanoers, I had not seen that PR no, but (as mentioned in the original description) Jenkins was my motivation indeed.
This is actually a PR I've had in my ToDo list since I started using dkron 2 years ago, as I used H all the time in Jenkins and missed it.
Was there any highlights/caution you wanted me to notice there? My understanding, now that I read it, is that everyone agreed with the approach.
Definitely missing unit tests, I'll take care of those
Note: I'm already using this in my fork, so far so good!
Hi @fopina ! The noteworthy bit of the feature request is the intent to have this functionality added to the cron library - that's where it should belong. There ought to come a PR having the same behavior as this PR for that project, so we can seemlessly move to a new version of cron when (and if) it'll get merged. I can't make you do that, of course, but it would be much appreciated.
ah got it @yvanoers !
I think it would make sense to make it available everywhere though… cron library cares about schedule nothing else.
There’s not enough entropy in the schedule string to use it as the hash meaning a new non-sense parameter would be added to the library method just so it would do the “strings.replace” anyone can easily do before calling it.
So even though I think it would make sense, I don’t think it does in the end 😄
but happy to agree to disagree and closing this PR!
... and closing this PR!
There are no high expectations of this getting included in the cron scheduler. And we do want to support it in Dkron. So no need to close this PR prematurely.
Will finish code review hereafter.
Thank for taking the time for explaining @yvanoers , I do understand your point much better now.
Still I'm not sure how to introduce the hashable parameter without making it look nonsense. But I'll definitely take a look first then!
I was looking into updating my fork 3.2.3 to latest 3.2.7 and couldn't remember why I had a fork, only to go through history and find this feature was the reason 🤦
I didn't look into opening PRs to cron scheduler but re-reading this thread I believe it's agreed it would be unlikely to happen anyway.
Should I try to re-implement this with something other than H as the hash token? Maybe something more unique that we wouldn't have to parse schedule in greater detail?
Or simply going with a regexp \bH\b?
Maybe something more unique that we wouldn't have to parse schedule in greater detail?
That may be worth exploring. 'H' is an unofficial addition anyway, there's no absolute need to adhere to it. '~' perhaps?
To support hash scheduling strategies, should we consider implementing a new nodeSelector? Then introduce a new configuration item to switch the scheduling strategy? The default nodeSelector strategy is random dispatch.
https://github.com/distribworks/dkron/blob/f114610b9002d21a1d959be92e356cb02cda5991/dkron/run.go#L33
https://github.com/distribworks/dkron/blob/f114610b9002d21a1d959be92e356cb02cda5991/dkron/agent.go#L757-L760
'~' perhaps?
@yvanoers thank you for all the feedback. I've also updated H to ~ while going through the rest of the review comments.
Hope it looks better now!
Sadly, I couldn't reply to review comments within a single review (sorry for the spam), because of the force push I had to do when switching the PR to v4, but it felt like a required step as well ⚡
Also added a small unit test
@yvanoers any chance for review? :)