node-cron icon indicating copy to clipboard operation
node-cron copied to clipboard

The job stops if the time in the system changes at an unfortunate moment

Open psixdev opened this issue 5 years ago • 3 comments
trafficstars

Hi, I have been watching a server for quite some time, where sometimes the system time changes by a few seconds. Several times I came across a situation where one of the jobs stopped just at the moment of time change. I researched the library and found the problem: If during the execution of the getTimeout function (https://github.com/kelektiv/node-cron/blob/master/lib/time.js#L201) between the calculation of the sendAt time and the current time, the system time moves a few seconds forward, then it can it turns out that timeout is calculated negative and the job will stop with this line (https://github.com/kelektiv/node-cron/blob/master/lib/job.js#L190).

I created a pr in my repository fork, in which, using the synon, I simulated the time change at the right moment. My job starts every 5 seconds and I move the time 7 seconds forward, after which the job stops. https://github.com/psixdev/node-cron/pull/1/files

I understand that an accidental change in the time on the server is an unlikely event, but I believe that the problem really exists and it is serious. I suppose a similar bug can occur during the transition to daylight saving time (or vice versa), if this transition accidentally coincides with the call to the getTImeout function.

What do you think about this problem?

psixdev avatar Jul 29 '20 12:07 psixdev

hey I think this might be fixed by #639 because there's some code there that checks if the time changed unexpectedly and it runs the job if so. want to try running your example on that branch? if it still fails I think that branch adds some utility functions we can use to address this

intcreator avatar Feb 23 '23 18:02 intcreator

Hey, I noticed my PR was mentioned so I had a cursory glance at this issue.

I'm a bit pessimistic that it wouldn't fix this problem. The main detection method is in _getNextDateFrom is spotting gaps in DateTime's representation of time units. e.g. a Sudden jump from 2:00 to 3:00.

While sendAt() might call _getNextDateFrom() in this code block: https://github.com/kelektiv/node-cron/blob/d07333fec312b78c2e859791993b75696a06d4b4/lib/time.js#L183, These are system corrections instead of real gaps in DateTime itself. I don't think they would be picked up by adding / subtracting time to DateTime objects, since they would still behave.

Additionally, because my PR primarily thought about daylight savings time, it focuses on minute and hour changes. Adjustments on the order of seconds aren't really detected or scanned. (Though this is easier to adjust than the fundamental problem with the detection method, would definitely take a bit of extra changes though).

GijsvandenHoven avatar Feb 24 '23 00:02 GijsvandenHoven

One way to somehow evade the root cause is using a monotonic clock (https://stackoverflow.com/questions/46964779/monotonically-increasing-time-in-node-js) , but that's probably going to be very difficult to work with in combination with DateTime, because

The time returned from process.hrtime() is] relative to an arbitrary time in the past, and not related to the time of day and therefore not subject to clock drift.

Meaning it can't be used interchangeably with DateTime.

GijsvandenHoven avatar Feb 24 '23 01:02 GijsvandenHoven

there have been quite a few updates to the code since this issue was last brought up. is this still a problem? if not I will close the issue as stale.

intcreator avatar Feb 18 '25 04:02 intcreator

@intcreator I checked my test with the current version of the library. The problem is still there.

psixdev avatar Feb 21 '25 14:02 psixdev

Thanks so much for the validation @psixdev.

The fix I'm testing out for #805 is related to this issue. However, in its current (testing) version it will still stop the job if the difference is superior to 1s.

We must find a better way to handle timeout diffs than stopping the job. It doesn't make sense, especially since we don't throw a warning or provide any way to handle that event programmatically.

I will be working on a more solid approach over the weekend and keep this issue updated.

sheerlox avatar Feb 21 '25 14:02 sheerlox

Closing this issue in favor of #962, which regroups similar issues and includes a general bug description, root cause analysis, reproduction instructions, and a proposed fix.

Please join the discussion and share any thoughts or concerns you may have before we go about implementing the proposed solution.

sheerlox avatar Feb 23 '25 14:02 sheerlox