spring-amqp
spring-amqp copied to clipboard
Trace information is missing when Exception is thrown from RabbitListener methods
Affects Version(s): All currently supported versions (including 2.3.4
)
Bug report
The issue was originally reported for Spring Cloud Sleuth but it seems it is not a Sleuth issue (and I can't transfer issues across orgs) so I'm opening this one to track it at the right place.
The original issue is this: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 opened by @goober It contains the necessary details to understand what is going on, a sample project that reproduces the issue, some investigation details, a workaround, and a proposed fix (which is a breaking change):
Describe the bug When an exception is thrown from a method annotated with
@RabbitListener
all trace information is lost when it reaches theConditionalRejectingErrorHandler
Sample I have created a minimal application that reproduces the issue with a little more context that can be found here: https://github.com/goober/spring-sleuth-rabbit-bug
The original sample project uses spring-boot
2.3.0.RELEASE
that brings in spring-amqp
2.2.6.RELEASE
but the issue must be present in the latest spring-amqp
too (2.3.4
).
Sample project: https://github.com/goober/spring-sleuth-rabbit-bug Investigation details: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-743582707 Proposed workaround: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-749136525 Possible fix (breaking change): https://github.com/spring-projects/spring-amqp/pull/1287
Is the work around I provided in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-749148062 not satisfactory for some reason?
@garyrussell You mean not fixing this but asking the users to apply the workaround? I think the workaround is ok until this gets fixed but I think this should eventually be fixed so that the users get instrumentation out of the box (not necessarily through instrumentation using a proxy).
See my comment in your PR: https://github.com/spring-projects/spring-amqp/pull/1287#issuecomment-748203972
We can't apply your change or consider as a bug, since it works like that for any existing @...Listener
solution.
So, we need to come up with a consolidated decision to make it alike everywhere instead of raising duplicated issues all around.
As we discussed on the PR, we can't make a breaking change in a patch release; we can schedule this for 2.4 (but there is currently no schedule for that at this time).
We have the open PR, but I will set the milestone for this to 2.4.
OK, sorry. I see you have created a similar issue for Spring Kafka: https://github.com/spring-projects/spring-kafka/issues/1704. So, yeah, you probably just pointing for the common problem everywhere. This is not a new issue, but a placeholder to have really that consolidated report.
Perhaps we also need to create a similar one for a @JmsListener
in Spring Framework.
I'm not pushing for that draft PR, I'm manually transferring issues to the right place since GH can't do it across orgs, here are the others:
-
@KafkaListener
: https://github.com/spring-projects/spring-kafka/issues/1704 -
@JmsListener
: https://github.com/spring-projects/spring-framework/issues/26532 -
@Scheduled
: https://github.com/spring-projects/spring-framework/issues/26533
I think I would prefer to add a new ListenerInterceptor
interface which is invoked before calling the listener and after error handling.
We have a similar existing interceptor in spring-kafka (but it needs enhancing to support Sleuth).
This would have the benefit of (a) not being a breaking change and (b) can be back-ported to earlier supported versions.
It would, of course, need a change in sleuth to implement the new interceptor instead of adding the advice.
Probably under the same category, tracing information is missing when there is a slow consumer & the response is received on client side after the configured timeout:
org.springframework.amqp.rabbit.support.ListenerExecutionFailedException: Listener threw exception
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.wrapToListenerExecutionFailedExceptionIfNeeded(AbstractMessageListenerContainer.java:1779)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1669)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.actualInvokeListener(AbstractMessageListenerContainer.java:1584)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:1572)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:1563)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1507)
at org.springframework.amqp.rabbit.listener.DirectMessageListenerContainer$SimpleConsumer.callExecuteListener(DirectMessageListenerContainer.java:1107)
at org.springframework.amqp.rabbit.listener.DirectMessageListenerContainer$SimpleConsumer.handleDelivery(DirectMessageListenerContainer.java:1067)
at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:149)
at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:104)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.springframework.amqp.AmqpRejectAndDontRequeueException: Reply received after timeout
at org.springframework.amqp.rabbit.core.RabbitTemplate.onMessage(RabbitTemplate.java:2669)
at org.springframework.amqp.rabbit.listener.DirectReplyToMessageListenerContainer.lambda$setMessageListener$0(DirectReplyToMessageListenerContainer.java:90)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1665)
... 11 common frames omitted
The equivalent issue reported in spring-cloud-sleuth is 1825.
I don't believe it is related - the reply comes back on a different thread (dedicated for replies) so there is no tracing context over there; only on the thread that timed out.
I don't believe it is related - the reply comes back on a different thread (dedicated for replies) so there is no tracing context over there; only on the thread that timed out.
Is that how it works? Apologies, I debugged this a very long time ago, so my memory of it is a bit rusty, but I guess it makes sense.
If the X-B3
headers were picked up from the request on listener/consumer side & then propagated back to the client via the response Message
, wouldn't that ensure proper tracing? I assume the trace/span info is stored in thread-locals, so they can be updated when the response is unwrapped, no matter which thread gets to process that. Is that something that could be implemented somehow?
Right; yes, I was just thinking about the thread locals on the requestor thread; of course, the trace headers should be present in the reply and stored for the reply thread, but the template rejects the reply and by the time we hit the error handler, the headers are gone.
Sorry for the noise.
@garyrussell ,
isn't this fixed now via: https://github.com/spring-projects/spring-amqp/issues/1444?