elsa-core
Elsa.Quartz triggers deleted on server restart (3.3.x)
Description
We recently upgraded our project from version 3.2 to 3.3. All existing workflows continue to function without any issues. However, we’ve encountered a problem with newly created workflows that are triggered via cron expressions.
After restarting the application, the new entries are removed from the QRTZ_TRIGGERS table. Interestingly, workflows created under version 3.2 remain intact; only the newly published cron triggers are being deleted.
Steps to Reproduce
- Create a new workflow with a cron trigger
- Restart the server
- The trigger rows are removed from QRTZ_TRIGGERS and the cron jobs no longer fire
Expected Behavior
Scheduled jobs continue to fire after a restart.
Actual Behavior
The jobs are not fired; their triggers are also wiped from the database.
Screenshots
Before restart:
After restart:
Environment
3.3.x (all editions) K8S - mcr.microsoft.com/dotnet/aspnet:9.0 container + MS SQL Server
Log Output
None
Troubleshooting Attempts
Downgrading to 3.2 resolved the issue.
Additional Context
Related Issues
There is another issue that seems related. We have a workflow that uses an HTTP endpoint trigger and appears to be in a “published” state. However, after a restart, calling the endpoint results in a 404 error. Interestingly, if I click “publish” again, everything starts working as expected.
This leads me to believe that, possibly, the application incorrectly assumes some workflows are not in a published state after startup — even though they are. This could also explain why the cron-based workflows are being deleted on startup.
Hi @endrelovas , would it be possible for you to try and reproduce this issue with the recently released 3.4.0 version? I've not seen this issue during my recent testing with Cron or HTTP Endpoint activities, so chances are the issue might have been resolved.
It is the same with 3.4.0. Frustrating.
I just noticed one more thing: the table gets wiped as soon as I terminate the application. When I restart it, the table does not get repopulated.
What is the expected behavior here? Should all data in QRTZ_TRIGGERS be deleted when the application exits and then repopulated on restart? Or is it intended for the data to persist across restarts?
I can’t reproduce this. Are you able to reproduce this with a (simplified) project on your local?
The expected behavior is that the Quartz triggers table remains populated. From what I’ve seen with others, only disabling the Quartz feature causes the triggers to disappear.
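For context, the startup logs later in this thread show a persistent ADO job store (JobStoreTX with SqlServerDelegate and a JSON serializer). A Quartz.NET scheduler persisting to SQL Server like that is typically configured with properties along these lines — a sketch only; the connection string is a placeholder, and the `[quartz].qrtz_` table prefix matches the table names visible in the logs:

```
quartz.scheduler.instanceName = QuartzScheduler
quartz.jobStore.type = Quartz.Impl.AdoJobStore.JobStoreTX, Quartz
quartz.jobStore.driverDelegateType = Quartz.Impl.AdoJobStore.SqlServerDelegate, Quartz
quartz.jobStore.dataSource = default
quartz.jobStore.tablePrefix = [quartz].qrtz_
quartz.dataSource.default.provider = SqlServer
# placeholder connection string
quartz.dataSource.default.connectionString = Server=...;Database=...
quartz.serializer.type = json
```

With a persistent job store configured this way, trigger rows are meant to survive scheduler shutdown; only an explicit delete would remove them.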
Does application shutdown count as disabling the feature?
This is what we see upon shutdown:
[11:56:49 INF] HTTP POST /_blazor/negotiate responded 200 in 0.3360 ms
[11:56:49 INF] Scheduler QuartzScheduler_$_NON_CLUSTERED shutting down.
[11:56:49 INF] Scheduler QuartzScheduler_$_NON_CLUSTERED paused.
[11:56:49 INF] Scheduler QuartzScheduler_$_NON_CLUSTERED Shutdown complete.
[11:56:49 INF] Registering datasource 'default' with db provider: 'Quartz.Impl.AdoJobStore.Common.DbProvider'
[11:56:49 INF] Using object serializer: Quartz.Simpl.JsonObjectSerializer, Quartz.Serialization.Json
[11:56:49 INF] Initialized Scheduler Signaller of type: Quartz.Core.SchedulerSignalerImpl
[11:56:49 INF] Quartz Scheduler created
[11:56:49 INF] JobFactory set to: Quartz.Simpl.MicrosoftDependencyInjectionJobFactory
[11:56:49 INF] Detected usage of SqlServerDelegate - defaulting 'selectWithLockSQL' to 'SELECT * FROM [quartz].qrtz_LOCKS WITH (UPDLOCK,ROWLOCK) WHERE SCHED_NAME = @schedulerName AND LOCK_NAME = @lockName'.
[11:56:49 INF] Using db table-based data access locking (synchronization).
[11:56:49 INF] Successfully validated presence of 10 schema objects
[11:56:49 INF] JobStoreTX initialized.
[11:56:49 INF] Quartz Scheduler 3.14.0.0 - 'QuartzScheduler' with instanceId 'NON_CLUSTERED' initialized
[11:56:49 INF] Using thread pool 'Quartz.Simpl.DefaultThreadPool', size: 10
[11:56:49 INF] Using job store 'Quartz.Impl.AdoJobStore.JobStoreTX', supports persistence: True, clustered: True
[11:56:49 INF] Adding 0 jobs, 0 triggers.
Curiously, Quartz gets reinitialized during shutdown.
Furthermore, I increased the log level to debug. This is what happens when I gracefully stop Elsa (pressing Ctrl+C in the running console):
[06:02:18 DBG] Lock 'TRIGGER_ACCESS' is desired by: 4ad7a24d-04b5-493e-8159-8260ea760e47
[06:02:18 DBG] Prepared SQL: SELECT * FROM [quartz].qrtz_LOCKS WITH (UPDLOCK,ROWLOCK) WHERE SCHED_NAME = @schedulerName AND LOCK_NAME = @lockName
[06:02:18 DBG] Lock 'TRIGGER_ACCESS' is being obtained: 4ad7a24d-04b5-493e-8159-8260ea760e47
[06:02:18 DBG] Lock 'TRIGGER_ACCESS' given to: 4ad7a24d-04b5-493e-8159-8260ea760e47
[06:02:18 DBG] Prepared SQL: SELECT J.JOB_NAME, J.JOB_GROUP, J.IS_DURABLE, J.JOB_CLASS_NAME, J.REQUESTS_RECOVERY FROM [quartz].qrtz_TRIGGERS T, [quartz].qrtz_JOB_DETAILS J WHERE T.SCHED_NAME = @schedulerName AND T.SCHED_NAME = J.SCHED_NAME AND T.TRIGGER_NAME = @triggerName AND T.TRIGGER_GROUP = @triggerGroup AND T.JOB_NAME = J.JOB_NAME AND T.JOB_GROUP = J.JOB_GROUP
[06:02:18 DBG] Prepared SQL: DELETE FROM [quartz].qrtz_SIMPLE_TRIGGERS WHERE SCHED_NAME = @schedulerName AND TRIGGER_NAME = @triggerName AND TRIGGER_GROUP = @triggerGroup
[06:02:18 DBG] Prepared SQL: DELETE FROM [quartz].qrtz_CRON_TRIGGERS WHERE SCHED_NAME = @schedulerName AND TRIGGER_NAME = @triggerName AND TRIGGER_GROUP = @triggerGroup
[06:02:18 DBG] Prepared SQL: DELETE FROM [quartz].qrtz_TRIGGERS WHERE SCHED_NAME = @schedulerName AND TRIGGER_NAME = @triggerName AND TRIGGER_GROUP = @triggerGroup
[06:02:18 DBG] Lock 'TRIGGER_ACCESS' returned by: 4ad7a24d-04b5-493e-8159-8260ea760e47
It is clear that Quartz itself deletes the trigger rows on shutdown.
I see the issue now. When multitenancy was added, a Tenant Deactivating event was introduced that clears the triggers, because when a tenant is unregistered at runtime we want its triggers removed. The problem is that this event is also raised when the application shuts down, causing the triggers to be removed and then re-added when the application starts (at least this is what happens in 3.5).
We need to differentiate between a tenant being unregistered and the system shutting down. Only when a tenant is unregistered should its triggers be removed.
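The distinction described above can be sketched language-agnostically. The following minimal Python simulation uses entirely hypothetical names (this is not Elsa's actual API): the deactivation handler receives a reason, and only a genuine runtime unregistration clears the persisted triggers, so an application shutdown leaves the table intact.

```python
# Minimal simulation of the fix described above. All names are hypothetical,
# not Elsa's actual API; the dict stands in for the QRTZ_TRIGGERS table.
from enum import Enum, auto


class DeactivationReason(Enum):
    TENANT_UNREGISTERED = auto()   # tenant removed at runtime -> clear triggers
    APPLICATION_SHUTDOWN = auto()  # host stopping -> keep triggers persisted


class TriggerStore:
    """Stands in for the persistent QRTZ_TRIGGERS table."""

    def __init__(self):
        self.triggers = {"tenant-a": ["cron-workflow-1"]}

    def clear_for_tenant(self, tenant_id):
        self.triggers.pop(tenant_id, None)


def on_tenant_deactivating(store, tenant_id, reason):
    # The fix: only clear persisted triggers for a genuine unregistration,
    # never as a side effect of the host shutting down.
    if reason is DeactivationReason.TENANT_UNREGISTERED:
        store.clear_for_tenant(tenant_id)


# Shutdown must not wipe the table...
store = TriggerStore()
on_tenant_deactivating(store, "tenant-a", DeactivationReason.APPLICATION_SHUTDOWN)
print("tenant-a" in store.triggers)  # True: triggers persist across restarts

# ...while unregistering the tenant still removes its triggers.
on_tenant_deactivating(store, "tenant-a", DeactivationReason.TENANT_UNREGISTERED)
print("tenant-a" in store.triggers)  # False
```

Without the reason check, the shutdown path would delete the rows exactly as the debug log above shows, and nothing repopulates them on the next start.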