Frequent consumer timeouts and delivery acknowledgement errors on RabbitMQ queues

Open ffaraone opened this issue 2 months ago • 2 comments

Hi,

I’m observing frequent consumer timeout warnings and channel errors in the RabbitMQ logs for two queues used by the application. The errors suggest that message acknowledgments are not being received within the configured timeout window.

This behavior may result in:

  • Messages being requeued or redelivered.
  • Increased queue load and slower processing.
  • Potential message duplication.

Log examples:

2025-10-31 09:57:11.836278+00:00 [warning] <0.3297244.0> Consumer 'None62' on channel 1 and queue 'report-imports' in vhost '/' has timed out waiting for a consumer acknowledgement of a delivery with delivery tag = 33. Timeout used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more
2025-10-31 09:57:11.836551+00:00 [error] <0.3297244.0> Channel error on connection <0.3297238.0> (10.250.6.59:47024 -> 10.250.2.71:5672, vhost: '/', user: 'optscale'), channel 1:
2025-10-31 09:57:11.836551+00:00 [error] <0.3297244.0> operation none caused a channel exception precondition_failed: delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more
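To see which queues and deliveries are affected across a larger log file, the warning lines can be parsed mechanically. The following is a minimal sketch, assuming the warning format stays exactly as in the sample above; the regular expression and the helper name `parse_consumer_timeout` are illustrative, not part of RabbitMQ or OptScale. It also makes the unit explicit: 1800000 ms is 30 minutes.

```python
import re

# Matches the consumer-timeout *warning* line format shown in the logs above.
# Assumption: the wording of the warning is stable across the log file.
TIMEOUT_RE = re.compile(
    r"queue '(?P<queue>[^']+)'.*?"
    r"delivery tag = (?P<tag>\d+)\.\s+"
    r"Timeout used: (?P<timeout_ms>\d+) ms"
)


def parse_consumer_timeout(line):
    """Return (queue, delivery_tag, timeout_minutes), or None if no match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return (
        match["queue"],
        int(match["tag"]),
        int(match["timeout_ms"]) / 60_000,  # milliseconds -> minutes
    )


sample = (
    "2025-10-31 09:57:11.836278+00:00 [warning] <0.3297244.0> Consumer 'None62' "
    "on channel 1 and queue 'report-imports' in vhost '/' has timed out waiting "
    "for a consumer acknowledgement of a delivery with delivery tag = 33. "
    "Timeout used: 1800000 ms. This timeout value can be configured, "
    "see consumers doc guide to learn more"
)

print(parse_consumer_timeout(sample))  # ('report-imports', 33, 30.0)
```

The accompanying `[error]` lines do not name the queue, so this helper returns `None` for them and only the warnings are counted.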

Environment:

Expected behavior:

Consumers should acknowledge message deliveries within the timeout period without triggering channel exceptions.

Questions:

  1. What conditions can cause delivery acknowledgment timeouts like this?
  2. Are there recommended settings or best practices to avoid consumer timeouts?

rabbitmq_warnings_and_errors.log

ffaraone, Oct 31 '25 10:10

Hi @ffaraone, we updated RabbitMQ to version 4.1.4 in release 2025102901-public. Please check whether this error still occurs on that version.

stanfra, Nov 03 '25 13:11

Also seeing this in the diworker running 2025102901-public.

That timeout value is a default on the RabbitMQ side. For now I've adjusted it manually by logging into the rabbitmq pod.
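For reference, the RabbitMQ setting behind this default is `consumer_timeout`, expressed in milliseconds in `rabbitmq.conf`. The sketch below only renders such a line from a value in hours, to avoid unit mistakes. The helper name and the 2-hour value are hypothetical, and how the setting was actually changed inside the pod is not stated in this thread.

```python
def consumer_timeout_setting(hours):
    """Render a rabbitmq.conf line for a consumer timeout given in hours."""
    if hours <= 0:
        raise ValueError("timeout must be positive")
    timeout_ms = int(hours * 60 * 60 * 1000)  # hours -> milliseconds
    return f"consumer_timeout = {timeout_ms}"


# 0.5 hours reproduces the value reported in the logs.
print(consumer_timeout_setting(0.5))  # consumer_timeout = 1800000

# Hypothetical larger value for long-running imports.
print(consumer_timeout_setting(2))  # consumer_timeout = 7200000
```

A larger timeout only hides the symptom if a single import legitimately runs longer than the limit. The delivery still stays unacknowledged for the whole duration of the task.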

I'm seeing this during a 200+ account CUR load:

  File "/src/diworker/.venv/lib/python3.12/site-packages/amqp/method_framing.py", line 53, in on_frame
    callback(channel, method_sig, buf, None)
  File "/src/diworker/.venv/lib/python3.12/site-packages/amqp/connection.py", line 538, in on_inbound_method
    return self.channels[channel_id].dispatch_method(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/diworker/.venv/lib/python3.12/site-packages/amqp/abstract_channel.py", line 156, in dispatch_method
    listener(*args)
  File "/src/diworker/.venv/lib/python3.12/site-packages/amqp/channel.py", line 293, in _on_close
    raise error_for_code(
amqp.exceptions.PreconditionFailed: (0, 0): (406) PRECONDITION_FAILED - delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more
[MainThread] INFO: Connected to amqp://optscale:**@rabbitmq:5672//
[ThreadPoolExecutor-0_2] INFO: Starting processing for task: {'report_import_id': '032444c0-f93b-46ba-afbd-b71cf4883b98'}, purpose import
[ThreadPoolExecutor-0_2] INFO: Started import for 031965c4-e40a-4a0d-922f-32b8ad900f3b

This basically stops the load dead; the worker then starts working on a different UID.

phish32786, Nov 07 '25 20:11

Hello @phish32786, we will investigate your request. I'll let you know as soon as I reach a conclusion.

VR-Hystax, Nov 27 '25 05:11