
Singleton events violation

Open fl4p-old opened this issue 4 years ago • 7 comments

The scheduler occasionally spawns multiple jobs for a single event with Concurrency set to 1 (Singleton). It happens very rarely, but it does happen, and it is unexpected behavior.

Screenshot 2021-02-12 at 11 25 25

fl4p-old avatar Feb 12 '21 11:02 fl4p-old

I think you're just observing catch-up. If your server goes down and then comes back up, the missed job will run on the next tick. But if the next tick also matches the normal schedule, Cronicle would be expected to launch 2 jobs one after another.

mikeTWC1984 avatar Feb 14 '21 02:02 mikeTWC1984

Me too: I got a violation, without Catch-Up.

I have a single instance:

image

The event is configured as follows:

image

I manually start the job via the API with a POST to /run_event/v1, passing the event ID.

My Cronicle version is 0.8.46.

Do you think this is an already-fixed issue and I should upgrade the installation?

Or is the singleton constraint not enforced for API run_event by design?

P.S. I'm sure the two jobs ran in parallel because the job moves some files. In the job logs I can see errors about files already moved by the other job, and vice versa.

ftaiolivista avatar May 25 '21 15:05 ftaiolivista

Just to clarify: are you saying Cronicle keeps launching jobs as you hit the run API?

mikeTWC1984 avatar May 25 '21 16:05 mikeTWC1984

I just tried starting a singleton job by hitting the run API multiple times, and everything works as expected (only 1 job is running). Is that something you can reproduce consistently, or is it sporadic?

mikeTWC1984 avatar May 25 '21 16:05 mikeTWC1984

I think I see the problem. You've got two overlapping jobs running on top of each other (both started at the same instant) and both lasted 3 seconds, with singleton set, which theoretically should not be possible.

I suspect the reason is that the API has no real locking going on. If you "flood" the API with requests, especially parallel ones, I'm sure you can sneak some jobs through.

I'll add this to the TODO list.
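
For readers following along, here is a minimal sketch of the kind of race suspected above: a "check, then launch" sequence with no lock around it. This is a toy model in Python, not Cronicle's actual code; `SingletonGate`, `try_launch` and `simulate` are hypothetical names, and the barrier only exists to force the unlucky interleaving so the effect shows up every time.

```python
import threading


class SingletonGate:
    """Toy model of a concurrency-1 check (hypothetical, not Cronicle code)."""

    def __init__(self, use_lock: bool):
        self.running = False          # True once a job has been registered
        self.use_lock = use_lock
        self._lock = threading.Lock()

    def _check_then_launch(self, pause) -> bool:
        if self.running:              # check: is a job already running?
            return False
        pause()                       # window between the check and the act
        self.running = True           # act: register the new job
        return True

    def try_launch(self, pause=lambda: None) -> bool:
        if self.use_lock:
            # Check and act happen atomically with respect to other callers.
            with self._lock:
                return self._check_then_launch(pause)
        return self._check_then_launch(pause)


def simulate(use_lock: bool) -> int:
    """Fire two parallel launch requests; return how many were admitted."""
    gate = SingletonGate(use_lock)
    barrier = threading.Barrier(2)
    results = []

    def pause():
        # Wait for the other request to also pass the check, if it can.
        try:
            barrier.wait(timeout=1.0)
        except threading.BrokenBarrierError:
            pass                      # the other request never got this far

    def worker():
        results.append(gate.try_launch(pause))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    print("admitted without lock:", simulate(use_lock=False))
    print("admitted with lock:", simulate(use_lock=True))
```

In this toy model both requests get through without the lock, and only one gets through with it. In a real server the window is far smaller, which would fit the bug showing up only once in months.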

jhuckaby avatar May 25 '21 16:05 jhuckaby

@jhuckaby I suspect the same. @mikeTWC1984 I'm trying to replicate it in a consistent and reproducible way. I'll update this thread if I get a working functional test. So far in production it has happened once after months of daily usage.

ftaiolivista avatar May 26 '21 06:05 ftaiolivista

I tried flooding in a test environment but was not able to replicate it. But I'm sure the two jobs started in parallel; I found logs with timestamps down to the second:

# Job ID: jkox3lwz226
# Event Title: export-document
# Hostname: clinic
# Date/Time: 2021/05/20 16:18:23 (GMT+0)

# Job ID: jkox3lx2927
# Event Title: export-document
# Hostname: clinic
# Date/Time: 2021/05/20 16:18:23 (GMT+0)
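
A small sketch of how such log headers could be scanned for same-second starts, assuming the `# Key: value` header layout shown above and that each job's header begins with `Job ID`. The function names `parse_job_headers` and `same_second_starts` are made up for illustration. A shared start second does not prove the jobs overlapped by itself, but it is a cheap first filter.

```python
from collections import defaultdict

SAMPLE = """\
# Job ID: jkox3lwz226
# Event Title: export-document
# Hostname: clinic
# Date/Time: 2021/05/20 16:18:23 (GMT+0)

# Job ID: jkox3lx2927
# Event Title: export-document
# Hostname: clinic
# Date/Time: 2021/05/20 16:18:23 (GMT+0)
"""


def parse_job_headers(text: str) -> list[dict]:
    """Collect '# Key: value' header lines into one dict per job."""
    jobs, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#") or ":" not in line:
            continue                      # skip blanks and non-header lines
        key, value = line.lstrip("# ").split(":", 1)
        key, value = key.strip(), value.strip()
        if key == "Job ID" and current:   # a new header block starts here
            jobs.append(current)
            current = {}
        current[key] = value
    if current:
        jobs.append(current)
    return jobs


def same_second_starts(jobs: list[dict]) -> dict:
    """Group job IDs by (event title, start time); keep groups with >1 job."""
    groups = defaultdict(list)
    for job in jobs:
        key = (job.get("Event Title"), job.get("Date/Time"))
        groups[key].append(job.get("Job ID"))
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    for (title, started), ids in same_second_starts(parse_job_headers(SAMPLE)).items():
        print(f"{title} started {len(ids)} jobs at {started}: {', '.join(ids)}")
```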

And I'm sure there was only one instance of cronicle running.

I'll run more tests in the future.

ftaiolivista avatar May 28 '21 07:05 ftaiolivista