SQS stopped polling.
Describe the bug Polling of SQS messages across several queues normally works fine, but twice now the consumer has stopped polling and messages piled up in the queues. The first time, I restarted the server and the queues were polled and processed again immediately. About a month later the same thing happened: polling stopped, messages accumulated in the queues, and a server restart brought polling and processing back. I have been using several SQS queues for a long time, but this has happened twice within 2 months on a production server. No error was emitted on the 'error' or 'processing_error' events.
Version of sqs-consumer is "^5.4.0". Version of aws-sdk is "^2.585.0".
I've seen a similar issue reported previously, but it was closed with a fix: https://github.com/bbc/sqs-consumer/issues/130 Any idea what the cause could be?
To Reproduce
Expected behaviour SQS polling should not stop. Also, why did restarting the server resolve the issue and make the halted consumer start polling again?
screenshots

Additional context I'm using node.js (v10.19.0) deployed on docker.
We are also facing the same.
sqs : 5.4.0 aws-sdk : 2.611.0 node version : 10
Same issue here.
sqs : 5.4.0 aws-sdk : 2.688.0 node version : 12
Also encountering this issue, from time to time.
"sqs-consumer": "^5.4.0", "aws-sdk": "^2.658.0", node version: 12
Also posting a picture from Sentry that caught this error.

We just hit this this morning, sqs-consumer 5.4.0, node 10, aws-sdk 2.708.0. No errors emitted.
We also encountered this today:
- sqs-consumer 5.4.0
- Node 14.7.0 (using the node:14.7.0-buster Docker image)
- aws-sdk 2.743.0
No errors were emitted.
same issue
"sqs-consumer": "^5.4.0" "aws-sdk": "^2.574.0",
node 12
Can you post your consumer options? There may be a common configuration setting when this issue is occurring (such as keeping the http connection open, or not providing a handler timeout).
@achallett
- batchSize: 10,
- custom sqs client with https keepAlive client is used (https://www.npmjs.com/package/agentkeepalive)
- visibilityTimeout: 60
- multiple instances(10) of sqs-consumer running in the same nodejs process
@achallett here you go
AWS config ----> Type: Standard, Encryption: Disabled, DLQ: Disabled, Max message size: 256 kB, Message retention period: 4 days, Default visibility timeout: 30 seconds
We listen to different SQS queues like this as well in the same app:

```
import AWS from 'aws-sdk';
import { Consumer } from 'sqs-consumer';
import { awsConfig } from '../../common/config/config';
import { Logger } from '../../common/config/logger';

const logger = Logger.getInstance(module);

const sqs = () => {
  AWS.config.update({
    region: awsConfig.region,
  });
  return new AWS.SQS();
};

/**
 * Consumer listening to messages from AWS SQS.
 */
const sqsAudit = Consumer.create({
  queueUrl: awsConfig.sqsAuditUrl,
  sqs: sqs(),
  handleMessage, // defined in a different file
  messageAttributeNames: ['event'],
});

sqsAudit.on('error', (err: Error) => {
  logger.error(`Error while interacting with queue: ${err}`);
});

sqsAudit.on('processing_error', (err: Error) => {
  logger.error(`Error while processing message: ${err.message}`);
});

sqsAudit.on('timeout_error', (err: Error) => {
  logger.error(`Handle message timed out: ${err}`);
});

export default sqsAuditLogApp;
```
Could you please share an update on this issue @achallett ?
Could be an upstream issue? There are similar reports in the AWS repo with no clear resolution, e.g. https://github.com/aws/aws-sdk-js-v3/issues/6015
If so, should we implement our own backup timeout for receiveMessage?
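A backup timeout of the kind suggested above could be sketched roughly as follows. This is a minimal illustration, not sqs-consumer's implementation: `withTimeout` is a hypothetical helper, and it only stops waiting on the promise; it does not cancel the underlying HTTP request.

```typescript
// Hypothetical helper (not part of sqs-consumer or aws-sdk):
// rejects with a TimeoutError if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = new Error(`operation timed out after ${ms} ms`);
      err.name = 'TimeoutError';
      reject(err);
    }, ms);

    // Whichever happens first wins; the timer is cleared so it cannot
    // keep the process alive once the work has settled.
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Assumed usage with aws-sdk v2 (the 30000 ms value is an arbitrary example;
// it should exceed the long-poll wait time):
//   const response = await withTimeout(sqs.receiveMessage(params).promise(), 30000);
```

With a wrapper like this, a receiveMessage call that never returns would surface as a rejected promise, so the polling loop could emit an error and schedule the next poll instead of waiting forever.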
Same issue in my case
public startConsumer = (queueUrl: string) => {
  console.log(`Starting consumer, ${queueUrl}`);
  Consumer.create({
    queueUrl,
    handleMessage: (msg) =>
      this.consumer.handle(JSON.parse(<string>msg.Body)).catch((error) => {
        console.log('ConsumerRunner error:', error);
        throw error;
      }),
  }).start();
};
Here's a possible workaround. Do you foresee any problems with this approach?
// Check every 30 seconds to see if poller is still running. If not, then re-start it.
setInterval(() => {
  const isRunning = app.isRunning
  console.info(`sqs poller isRunning? : ${isRunning}`)
  if (!isRunning) {
    console.warn('sqs poller is not running. Re-starting now.')
    app.start()
  }
}, 30000)
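One possible caveat with checking `isRunning`: if the consumer stalls without ever being stopped, as the reports in this thread suggest, that flag may still read true. A variation is to track when the poller last showed activity. The sketch below is an assumption-laden illustration: `PollWatchdog` is a hypothetical class, and wiring it to the `response_processed` and `empty` events is a suggestion that nobody in this thread has confirmed as a fix.

```typescript
// Hypothetical watchdog: remembers when the poller last showed signs of life.
class PollWatchdog {
  private lastBeat: number;

  constructor(
    private readonly maxSilenceMs: number,
    // Injectable clock so the class can be tested without real timers.
    private readonly now: () => number = Date.now,
  ) {
    this.lastBeat = this.now();
  }

  /** Record that a poll cycle completed. */
  beat(): void {
    this.lastBeat = this.now();
  }

  /** True when no activity has been recorded for longer than maxSilenceMs. */
  isStalled(): boolean {
    return this.now() - this.lastBeat > this.maxSilenceMs;
  }
}

// Assumed wiring (event names as listed in the sqs-consumer README; the
// thresholds are arbitrary examples):
//   const watchdog = new PollWatchdog(90000);
//   app.on('response_processed', () => watchdog.beat());
//   app.on('empty', () => watchdog.beat());
//   setInterval(() => {
//     if (watchdog.isStalled()) console.warn('no poll activity for 90 s');
//   }, 30000);
```

What to do once a stall is detected (alert, recreate the consumer, or exit and let the container orchestrator restart the process) is a separate decision that this sketch leaves open.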
Fixing problems by continuously restarting is not a good approach :) In my opinion it's much better to find and fix the source of the problem :)
Hi,
I am facing the same scenario in my product, which is going live in a week. Is there a good workaround for this? I don't have time to remove it now. @AG-Teammate @pawelszczerbicki @ariesmcrae @KencyK @Tobska @mfrobben
Does this help? Has anyone tried it, or does anyone know how to apply it here? https://stackoverflow.com/questions/37111431/amazon-sqs-with-aws-sdk-receivemessage-stall
Facing this issue as well. It's freezing in my production environment.
I removed the await keyword from the function called inside the SQS consumer and handle retries myself; since then it hasn't occurred.
I have the same issue. I use NestJS with the @ssut/nestjs-sqs library, which depends on sqs-consumer. After a few days the consumer stops receiving messages from AWS, although the microservice containing the consumer keeps working properly. When the Kubernetes pod is restarted, the consumer in my app starts polling and processing messages from the queues again.
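The "no await, own retries" approach described above might look roughly like the sketch below. The commenter did not share code, so every name here (`withRetries`, `makeHandler`, `work`, `onGiveUp`) is hypothetical. One trade-off to keep in mind: sqs-consumer deletes a message once `handleMessage` resolves, so with this pattern a message whose retries all fail is no longer in the queue.

```typescript
// Hypothetical retry helper: runs `task` up to `attempts` times,
// waiting `delayMs` between failed tries, and rethrows the last error.
async function withRetries<T>(
  task: () => Promise<T>,
  attempts: number,
  delayMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Builds a handler that starts the work but does not await it, so the
// handler resolves immediately and can never block the polling loop.
function makeHandler(
  work: (body: string) => Promise<void>,
  onGiveUp: (err: unknown) => void,
  attempts = 3,
  delayMs = 1000,
) {
  return async (message: { Body?: string }): Promise<void> => {
    // Intentionally not awaited; failures are reported through onGiveUp.
    withRetries(() => work(message.Body ?? ''), attempts, delayMs).catch(onGiveUp);
  };
}
```

`onGiveUp` is where a failed message would have to be logged, re-enqueued, or otherwise recovered, for example by a periodic reconciliation job.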
node version: 12.19
"@nestjs/core": "^7.6.15",
"aws-sdk": "^2.806.0",
"@ssut/nestjs-sqs": "^1.0.0",
-> ssut/nestjs-sqs dependencies
"aws-sdk": "^2.728.0",
"sqs-consumer": "^5.4.0",
"sqs-producer": "^2.0.2",
Register SqsModule in app.module.ts
SqsModule.registerAsync({
  imports: [AppConfigModule],
  useFactory: async (configService: ConfigService) => {
    const sqs = new AWS.SQS({
      accessKeyId: configService.get(Configuration.AWS_ACCESS_KEY_ID),
      secretAccessKey: configService.get(Configuration.AWS_SECRET_ACCESS_KEY),
      region: configService.get(Configuration.AWS_REGION),
    });
    return {
      consumers: [
        {
          name: `${configService.get(Configuration.QUEUE_NAME)}`,
          queueUrl: `${configService.get(Configuration.QUEUE_URL)}`,
          sqs
        },
      ],
      producers: [],
    };
  },
  inject: [AppConfigService],
}),
Consumer Service
@Injectable()
export class SQSMessageHandler {
  @SqsMessageHandler(`${process.env.QUEUE_NAME}`)
  public async handleMessage(message: AWS.SQS.Message) {
    this.logger.debug(`Incoming message: ${message.MessageId} `, SQSMessageHandler.name);
    ...
  }
}
Hi @mashoodrafi006, can you please elaborate on how you handled retries on errors?
I wrote a cron command that bulk-updates the data between microservices every hour, to patch in the events that were not consumed. @anoop-chauhan
I was facing this issue because the consumer's "handleMessage" did not resolve for a specific poll, which caused the consumer to stall. Fix: made "handleMessage" resolve in all possible cases and added better error handling.
Could you please elaborate on this? I've been having the same issue for a long time now.
So what is a possible workaround?
Has anyone found a solution for this? I am still facing this issue.
I am facing this issue, any solution?
I'm facing the same issue here, any solution?
Facing the same issue as well.
@karthikpawar could you elaborate a bit on the changes you made to handleMessage / what case was causing it to not resolve?
Was it an unresolved Promise? Were you not returning within a function somewhere?
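To make the question concrete, here is one common way a handler promise can end up never settling, next to a version that settles on every branch. This is only a guess at the class of bug: @karthikpawar's actual code was not shared, and `lookup` is a made-up stand-in for any callback-style dependency.

```typescript
type LookupCallback = (err: Error | null, result?: string) => void;

// Hypothetical callback-style dependency used only for illustration.
function lookup(id: string, cb: LookupCallback): void {
  setTimeout(() => {
    if (id === 'bad') {
      cb(new Error('lookup failed'));
    } else if (id === 'missing') {
      cb(null, undefined);
    } else {
      cb(null, `record:${id}`);
    }
  }, 0);
}

// Problematic: the error and not-found branches never call resolve or reject,
// so the returned promise stays pending forever for those inputs.
function handleMessageStalling(id: string): Promise<string> {
  return new Promise<string>((resolve) => {
    lookup(id, (err, result) => {
      if (!err && result) resolve(result);
    });
  });
}

// Safer: every branch either resolves or rejects.
function handleMessageSettling(id: string): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    lookup(id, (err, result) => {
      if (err) return reject(err);
      if (!result) return reject(new Error(`no record for ${id}`));
      resolve(result);
    });
  });
}
```

A handler shaped like the first version would keep the consumer waiting on that message indefinitely, without emitting 'error' or 'processing_error', which would match the symptoms reported here. Setting the `handleMessageTimeout` option is another way to bound how long a single handler call can take.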
Hi. Is this still an issue? I'm looking into a Nest module that works through SQS and am still not sure whether I should use this package with nest-sqs or write some boilerplate with aws-sdk myself. Thanks in advance for any advice.
Facing same issue.
I am using the @ssut/nestjs-sqs library, which uses this one internally.
Is there any update on this?