jasmin icon indicating copy to clipboard operation
jasmin copied to clipboard

AMQP defer.AlreadyCalledError

Open Integration-IT opened this issue 2 years ago • 8 comments

Hi,

AMQP raise exception probably due to multiple firings. I guess some where in your code, you are explicitly calling the callback of deferred. This is also happening multiple times. A deferred callback can be fired only once which indicates the completion of a long awaited task resulting in a error or positive result. Twisted has a mechanism to throw the exception.

This exception raises a total break between dlrd and rabbitmq. In fact, the broker is down.

amqp-client.log
ERROR    1081538 Channel open failed: [Failure instance: Traceback: <class 'twisted.internet.defer.AlreadyCalledError'>: 
/usr/lib/python3/dist-packages/twisted/internet/defer.py:1750:gotResult
/usr/lib/python3/dist-packages/twisted/internet/defer.py:1737:_inlineCallbacks
/usr/lib/python3/dist-packages/twisted/internet/defer.py:661:callback
/usr/lib/python3/dist-packages/twisted/internet/defer.py:763:_startRunCallbacks
--- <exception caught here> ---
/usr/lib/python3/dist-packages/twisted/internet/defer.py:857:_runCallbacks
/usr/local/lib/python3.10/dist-packages/jasmin/queues/factory.py:181:_channel_open
/usr/lib/python3/dist-packages/twisted/internet/defer.py:661:callback
/usr/lib/python3/dist-packages/twisted/internet/defer.py:756:_startRunCallbacks
]
To replicate
from twisted.internet.defer import Deferred
def func(x): print x
d = Deferred()
d.addCallbacks(func, func)
d.callback('First fire')
d.callback('Second fire')
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 661, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 756, in _startRunCallbacks
    raise AlreadyCalledError
twisted.internet.defer.AlreadyCalledError

logs and manual test have the same error.

0.10.12, 0.10.13 -> Error identified.

Integration-IT avatar Jan 20 '23 21:01 Integration-IT

Hi, I'm having the same issue here and it results in our Jasmin install to completely stop routing messages. I recently upgraded from a very old version to the latest (0.10.13) and I'm getting the very same error as the OP:

2023-03-13 23:01:21 ERROR    568 Channel open failed: [Failure instance: Traceback: <class 'twisted.internet.defer.AlreadyCalledError'>: 
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:1750:gotResult
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:1737:_inlineCallbacks
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:661:callback
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:763:_startRunCallbacks
--- <exception caught here> ---
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:857:_runCallbacks
/var/lib/jasmin/package/env/lib/python3.8/site-packages/jasmin/queues/factory.py:181:_channel_open
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:661:callback
/var/lib/jasmin/package/env/lib/python3.8/site-packages/twisted/internet/defer.py:756:_startRunCallbacks
]

idevsoftware avatar Mar 14 '23 14:03 idevsoftware

Hello @idevsoftware Can you please share clear steps to reproduce the issue ?

farirat avatar Aug 29 '23 22:08 farirat

Hi @farirat, thanks for getting back to me. Unfortunately we don't have a clear path to reproduce the issue - we just leave Jasmin running and doing its business as usual until the problem described above arises. I'm not sure under what conditions the error is triggered, but if you could point me on how to collect debug data I'd be happy to provide with detailed log files the next time this occurs.

idevsoftware avatar Aug 31 '23 13:08 idevsoftware

@farirat Hi! I'm have a similar problem on docker image jookies/jasmin:0.10.12. Appreciate any assistance.

blankstatic avatar Aug 31 '23 14:08 blankstatic

Is it related to one of these:

  • Running for x hours ?
  • Handling massive traffic for a long period ?
  • Is the hardware undersized ? rabbitmq is freezing consequently ?

Can you let a tcpdump snooping rabbitmq traffic with jasmin and save a pcap file, share it once the problem arise, please consider the confidentiality of the data you'll share in the pcap file.

farirat avatar Aug 31 '23 23:08 farirat

Hi @farirat, to answer your questions:

  • this is a production system and as such Jasmin has been running for months - our previous install was literally working for years until we upgraded to the newest version and stumbled upon this issue.
  • I wouldn't call our traffic massive - we have 5 SMPP connectors that handle about ~20k messages per day each max (on busy days).
  • I don't think hardware is undersized either, like I said we've been running this setup for years before encountering the problem. I haven't seen any evidence of rabbitmq freezing or the VM handling sustained high CPU loads.

Unfortunately I can't grab a tcpdump because the SMPP connections handled by Jasmin go through IPSec tunnels that are established by the same machine, and tcpdump can't inspect traffic going through the tunnels.

idevsoftware avatar Sep 01 '23 05:09 idevsoftware

Hello,

I dont need smpp traffic, only amqp traffic (between jasmin and rabbit)

farirat avatar Sep 01 '23 07:09 farirat

Thanks @farirat, I was able to capture traffic when the issue happened. What's the best way to privately send you a .pcap file?

idevsoftware avatar Sep 05 '23 02:09 idevsoftware