[Bug]: Scheduled Job Disappearing
Version
5.46.1
Platform
NodeJS
What happened?
I am running BullMQ 5 in a multi-worker, multi-server setup.
When a (single) scheduled task (on a dedicated queue) is executed, the task disappears. This only seems to happen on a newly deployed server; I have been unable to isolate the difference.
See Redis Log.
TIA
How to reproduce.
No response
Relevant log output
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
The information you have provided is too little; there is no action we can take based on it.
Hello manast,
It looks like the attached log went missing (my fault). I will find it and reattach.
hi @digipigeon, wondering if you are using the "every" option?
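(For context, a repeatable job can be declared either with a cron pattern or with a fixed `every` interval. A minimal sketch below, with a placeholder queue name and connection details, just to illustrate the two styles.)

```js
const { Queue } = require('bullmq');

async function addRepeatables() {
  // Placeholder queue name and connection details, only to illustrate the options.
  const queue = new Queue('cron-alert', { connection: { host: '127.0.0.1', port: 6379 } });

  // Cron-style repetition: a job is produced on every match of the pattern.
  await queue.add('cron-alert', {}, { repeat: { pattern: '* * * * *' } });

  // Interval-style repetition: the "every" option takes milliseconds.
  await queue.add('cron-alert', {}, { repeat: { every: 60000 } });

  await queue.close();
}

addRepeatables().catch(console.error);
```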
This might be the same issue as #3272.
Sorry for the delay, I got sidetracked; I will try to get a log today.
I am not sure if this is the same issue; it does not seem to be at present.
Hello,
The job I am having problems with is called cron-alert, so I ran the following to capture all the commands related to this namespace:
`redis-cli MONITOR | grep alert`
I then run my simple application, which creates a repeatable job to run every minute and also subscribes to that queue.
I can successfully see that the repeatable job has been created in Arena.
I then see it disappear when the job starts.
After the job finishes, there is still no next repeatable job created, so the repeatable job is effectively removed entirely and never triggers again.
Please let me know if you need any more information, or if you have an otherwise preferred way for me to log what is going on.
@digipigeon any chance you could provide a snippet that reproduces this issue? That would be very helpful, so that we can provide a solution that specifically addresses your problem.
Certainly; please bear in mind that this works with no problems on our older server (exact same code and libraries).
```js
/*
  TOML config (config.toml):

  [module.alert]
  path = "./minute/alert"
  cron = "* * * * *"
  controller = "minute"
*/

// main.js
const config = require('toml').parse('' + require('fs').readFileSync('config.toml'));
const { Worker, Queue } = require('bullmq');
const path = require('path');

const connection = JSON.parse(JSON.stringify(config.bull.redis));
const bullOpts = { connection, prefix: 'bullmq5' };

Object.entries(config.module).forEach(async ([mod, modConfig]) => {
  console.log('Starting', mod);
  const processorFile = path.join(__dirname, modConfig.path) + '.js';

  const q = new Queue('cron-' + mod, bullOpts);
  const worker = new Worker('cron-' + mod, processorFile, bullOpts);

  worker.on('completed', function (job) { // result
    // Do some external stuff
  });

  await q.add('cron-' + mod, {}, { repeat: { cron: modConfig.cron }, attempts: 1 });

  q.clean(86400 * 7);
  q.clean(86400 * 7, 'failed');
  setInterval(() => {
    q.clean(86400 * 7);
    q.clean(86400 * 7, 'failed');
  }, 3600 * 1000);
});
```
This calls an external file which contains various processing, none of which touches BullMQ.
alert.js
```js
module.exports = async function (job) {
  job.log('Starting Alert');
  // Iterator removed for brevity
  try {
    // Do Stuff
    job.log(`${type} to ${body.to} sent`);
    if (job && job.updateProgress && total) {
      job.updateProgress(parseInt(idx / total.length * 100));
    }
  } catch (e) {
    job.log(e); // Re-looking at this, I probably should be changing this to e.message
  }
  // End of iterator
  job.log('Complete');
};
```
I have stripped out the other bits of logic and only included the parts that interact with bullmq.
I just re-tested by replacing `job.log(e);` with `job.log(e.message);` and it made no difference.
Additionally, just to mention: we run 27 of these scheduled jobs, all with different logic inside them and different repetition schedules. They are created and run from the same code block as above, and all present the same failure mode on the new server.
`job.log` is an asynchronous call, so you must await it: `await job.log('Starting Alert');`
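Applied to the alert.js processor above, a sketch with every asynchronous Job call awaited (the iterator that was removed for brevity is replaced here by a placeholder list of items so the snippet is self-contained):

```js
// Sketch only: placeholder items stand in for the iterator removed from the original.
module.exports = async function (job) {
  await job.log('Starting Alert');

  const items = [{ type: 'email', to: 'user@example.com' }]; // placeholder work items
  for (let idx = 0; idx < items.length; idx++) {
    const { type, to } = items[idx];
    try {
      // Do Stuff
      await job.log(`${type} to ${to} sent`);
      await job.updateProgress(Math.round(((idx + 1) / items.length) * 100));
    } catch (e) {
      await job.log(e.message);
    }
  }

  await job.log('Complete');
};
```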
Hi @manast, sorry, I just wanted to confirm something...
Most of the time I am building applications for performance, and waiting for a log write to complete is not something I am really interested in doing.
Aside from not knowing whether the log write fails, is there some other good reason why I must await it?
Thanks in advance.
Hello,
I have just noticed the change to 5.53.1 for the closure of https://github.com/taskforcesh/bullmq/pull/3269. I have just updated and run a test, and I am sorry to confirm that this does NOT fix the bug.
> Aside from not knowing whether the log write fails, is there some other good reason why I must await it?
hi @digipigeon. The log method is asynchronous. It also validates that the job exists before trying to add a log record; if the job does not exist at the time the call is executed, it will throw an error.
Could you also try the upsertJobScheduler method and let us know how it goes? https://docs.bullmq.io/guide/job-schedulers
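A sketch of what that could look like in the main.js snippet above, replacing the `q.add(..., { repeat: ... })` call inside the `forEach` (reusing the `q`, `mod` and `modConfig` names from that snippet):

```js
// Instead of q.add('cron-' + mod, {}, { repeat: { cron: modConfig.cron }, attempts: 1 }):
await q.upsertJobScheduler(
  'cron-' + mod,               // scheduler id, one per module
  { pattern: modConfig.cron }, // repeat options: the cron pattern from the TOML config
  {
    name: 'cron-' + mod,       // template for the jobs the scheduler produces
    data: {},
    opts: { attempts: 1 },
  }
);
```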
Hello, this does not appear to help; the same problem still happens.
hi @digipigeon, a fix was deployed a few days ago in v5.53.1, please try it and let us know.
Hello,
I have just noticed the change to 5.53.1 for the closure of #3269. I have just updated and run a test, and I am sorry to confirm that this does NOT fix the bug.
@roggervalf
> Hello, I have just noticed the change to 5.53.1 for the closure of #3269. I have just updated and run a test, and I am sorry to confirm that this does NOT fix the bug.
Which test is not working? The ones we managed to reproduce are all in the test suite and passing. If you have a different case that is not passing, please provide us with the code.
> Hi @manast, sorry, I just wanted to confirm something...
> Most of the time I am building applications for performance, and waiting for a log write to complete is not something I am really interested in doing.
> Aside from not knowing whether the log write fails, is there some other good reason why I must await it?
> Thanks in advance.
As a rule of thumb for writing robust code, please await asynchronous calls so that you avoid unexpected results down the road. If you do not await, you have no idea when that call will be processed, potentially leading to undesired side effects.
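As a generic illustration (not code from this thread), the difference between fire-and-forget and awaiting:

```js
// Fire-and-forget: if job.log rejects (for example because the job no longer exists),
// the rejection surfaces later as an unhandled promise rejection, far from this call.
function fireAndForget(job) {
  job.log('processing started');
}

// Awaited: the call completes or fails right here, where it can be handled.
async function awaited(job) {
  try {
    await job.log('processing started');
  } catch (err) {
    console.error('could not write job log:', err.message);
  }
}
```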
Closing, as we do not have any reproducible test code nor any feedback on our latest replies.