spring-amqp icon indicating copy to clipboard operation
spring-amqp copied to clipboard

Trace information is missing when Exception is thrown from RabbitListener methods

Open jonatan-ivanov opened this issue 4 years ago • 11 comments

Affects Version(s): All currently supported versions (including 2.3.4)


Bug report

The issue was originally reported for Spring Cloud Sleuth but it seems it is not a Sleuth issue (and I can't transfer issues across orgs) so I'm opening this one to track it at the right place.

The original issue is this: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660 opened by @goober It contains the necessary details to understand what is going on, a sample project that reproduces the issue, some investigation details, a workaround, and a proposed fix (which is a breaking change):

Describe the bug When an exception is thrown from a method annotated with @RabbitListener all trace information is lost when it reaches the ConditionalRejectingErrorHandler

Sample I have created a minimal application that reproduces the issue with a little more context that can be found here: https://github.com/goober/spring-sleuth-rabbit-bug

The original sample project uses spring-boot 2.3.0.RELEASE that brings in spring-amqp 2.2.6.RELEASE but the issue must be present in the latest spring-amqp too (2.3.4).

Sample project: https://github.com/goober/spring-sleuth-rabbit-bug Investigation details: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-743582707 Proposed workaround: https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-749136525 Possible fix (breaking change): https://github.com/spring-projects/spring-amqp/pull/1287

jonatan-ivanov avatar Feb 09 '21 21:02 jonatan-ivanov

Is the work around I provided in https://github.com/spring-cloud/spring-cloud-sleuth/issues/1660#issuecomment-749148062 not satisfactory for some reason?

garyrussell avatar Feb 09 '21 21:02 garyrussell

@garyrussell You mean not fixing this but asking the users to apply the workaround? I think the workaround is ok until this gets fixed but I think this should eventually be fixed so that the users get instrumentation out of the box (not necessarily through instrumentation using a proxy).

jonatan-ivanov avatar Feb 09 '21 21:02 jonatan-ivanov

See my comment in your PR: https://github.com/spring-projects/spring-amqp/pull/1287#issuecomment-748203972

We can't apply your change or consider as a bug, since it works like that for any existing @...Listener solution. So, we need to come up with a consolidated decision to make it alike everywhere instead of raising duplicated issues all around.

artembilan avatar Feb 09 '21 22:02 artembilan

As we discussed on the PR, we can't make a breaking change in a patch release; we can schedule this for 2.4 (but there is currently no schedule for that at this time).

We have the open PR, but I will set the milestone for this to 2.4.

garyrussell avatar Feb 09 '21 22:02 garyrussell

OK, sorry. I see you have created a similar issue for Spring Kafka: https://github.com/spring-projects/spring-kafka/issues/1704. So, yeah, you probably just pointing for the common problem everywhere. This is not a new issue, but a placeholder to have really that consolidated report.

Perhaps we also need to create a similar one for a @JmsListener in Spring Framework.

artembilan avatar Feb 09 '21 22:02 artembilan

I'm not pushing for that draft PR, I'm manually transferring issues to the right place since GH can't do it across orgs, here are the others:

  • @KafkaListener: https://github.com/spring-projects/spring-kafka/issues/1704
  • @JmsListener: https://github.com/spring-projects/spring-framework/issues/26532
  • @Scheduled: https://github.com/spring-projects/spring-framework/issues/26533

jonatan-ivanov avatar Feb 09 '21 22:02 jonatan-ivanov

I think I would prefer to add a new ListenerInterceptor interface which is invoked before calling the listener and after error handling.

We have a similar existing interceptor in spring-kafka (but it needs enhancing to support Sleuth).

This would have the benefit of (a) not being a breaking change and (b) can be back-ported to earlier supported versions.

It would, of course, need a change in sleuth to implement the new interceptor instead of adding the advice.

garyrussell avatar Aug 11 '21 18:08 garyrussell

Probably under the same category, tracing information is missing when there is a slow consumer & the response is received on client side after the configured timeout:

org.springframework.amqp.rabbit.support.ListenerExecutionFailedException: Listener threw exception
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.wrapToListenerExecutionFailedExceptionIfNeeded(AbstractMessageListenerContainer.java:1779)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1669)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.actualInvokeListener(AbstractMessageListenerContainer.java:1584)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:1572)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:1563)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1507)
	at org.springframework.amqp.rabbit.listener.DirectMessageListenerContainer$SimpleConsumer.callExecuteListener(DirectMessageListenerContainer.java:1107)
	at org.springframework.amqp.rabbit.listener.DirectMessageListenerContainer$SimpleConsumer.handleDelivery(DirectMessageListenerContainer.java:1067)
	at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:149)
	at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:104)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.springframework.amqp.AmqpRejectAndDontRequeueException: Reply received after timeout
	at org.springframework.amqp.rabbit.core.RabbitTemplate.onMessage(RabbitTemplate.java:2669)
	at org.springframework.amqp.rabbit.listener.DirectReplyToMessageListenerContainer.lambda$setMessageListener$0(DirectReplyToMessageListenerContainer.java:90)
	at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1665)
	... 11 common frames omitted

The equivalent issue reported in spring-cloud-sleuth is 1825.

ionel-sirbu-crunch avatar Jun 07 '22 15:06 ionel-sirbu-crunch

I don't believe it is related - the reply comes back on a different thread (dedicated for replies) so there is no tracing context over there; only on the thread that timed out.

garyrussell avatar Jun 07 '22 18:06 garyrussell

I don't believe it is related - the reply comes back on a different thread (dedicated for replies) so there is no tracing context over there; only on the thread that timed out.

Is that how it works? Apologies, I debugged this a very long time ago, so my memory of it is a bit rusty, but I guess it makes sense. If the X-B3 headers were picked up from the request on listener/consumer side & then propagated back to the client via the response Message, wouldn't that ensure proper tracing? I assume the trace/span info is stored in thread-locals, so they can be updated when the response is unwrapped, no matter which thread gets to process that. Is that something that could be implemented somehow?

ionel-sirbu-crunch avatar Jun 08 '22 08:06 ionel-sirbu-crunch

Right; yes, I was just thinking about the thread locals on the requestor thread; of course, the trace headers should be present in the reply and stored for the reply thread, but the template rejects the reply and by the time we hit the error handler, the headers are gone.

Sorry for the noise.

garyrussell avatar Jun 08 '22 14:06 garyrussell

@garyrussell ,

isn't this fixed now via: https://github.com/spring-projects/spring-amqp/issues/1444?

artembilan avatar Oct 10 '22 16:10 artembilan