ApplicationInsights-Java icon indicating copy to clipboard operation
ApplicationInsights-Java copied to clipboard

Dependency tree not available with Azure Blob + Queue storage + ServiceBus + MongoDB

Open abhikt48 opened this issue 1 year ago • 20 comments

We are using applicationinsights-agent:3.4.10 with MuleSooft Application. We tested with the below flow (HTTP+Mongo+MySql+HTTP+JMS+Logger) and we can see the dependency tree (End-To-End transaction details) in App Insight, which is working fine.

image

But when we tested with ServiceBus(JMS+AMQP) + Blob + Queue Storage + Mongo DB storage + ServiceBus, then we can't see End-To-End transaction details.

image

Used Library Version

  1. Blob - azure-storage-blob-12.20.1
  2. Queue - azure-storage-queue-12.7.0
  3. ServiceBus - qpid-jms-client-0.57.0
  4. MongoDB - mongo-java-driver-3.12.10

NOTE - All components (ServiceBus, HTTP, Blob, Queue, etc.) are getting called from a separate library where we are not passing any operation ID to link a call from one to another.

Expectation

We want to have a dependency tree that should represent End-To-End call.

Could you please suggest any option to link end-to-end transaction calls? Please let me know for any information.

abhikt48 avatar Mar 06 '23 18:03 abhikt48

hi @abhikt48, can you post the end-to-end transaction details that you see currently in both the non-working and working scenarios? thx!

trask avatar Mar 07 '23 17:03 trask

@abhikt48 can you please submit a sample app to https://github.com/microsoft/ApplicationInsights-Java-Repros where this issue can be reproducible? Thanks.

heyams avatar Mar 07 '23 21:03 heyams

@trask @heyams

Thanks for the quick feedback. Finally. I am able to replicate this issue. The whole issue is related to threading.

Background - We are using Mule ESB where

  1. Some endpoint (ServiceBus consumer) receives message in a separate thread and processing happens in a separate thread, such as the below flow image

  2. Some endpoint works in a single thread for all processing image

We are having an issue in first scenario where multiple threads involves. For replicating this issue, I created a sample class

  1. Run all operations in a single thread - https://github.com/abhikt48/test-applicationinsights-agent/blob/main/src/main/java/com/abhi/test/ai/agent/sbus/storage/TestAiWithMultiTransportsSameThread.java

Refer onMessage(..) method - After receiving the message, I am calling to Blob, Queue, and Mongo DB. In this scenario, dependency tree is working perfectly like below Complete_Transaction

  1. Run operations in multiple threads - https://github.com/abhikt48/test-applicationinsights-agent/blob/main/src/main/java/com/abhi/test/ai/agent/sbus/storage/TestAiAgentWithMultiTransports.java

Refer onMessage(..) method - Where I am creating a new thread to upload Blob, send message on Queue and get a number of documents from Mongo DB, but the dependency tree is not coming together like below.

Expectations - Blob, Queue and Mongo DB transactions should come in a single transaction.

MongoDB Queue

Please refer https://github.com/abhikt48/test-applicationinsights-agent#readme to run sample classes

Please feel free to let me know for any information.

abhikt48 avatar Mar 08 '23 17:03 abhikt48

Hi @heyams ,

Did you get a chance to replicate this issue? Please let me know for any information.

Regards, Abhishek Kumar

abhikt48 avatar Mar 14 '23 16:03 abhikt48

Will try it today. Thank you for reminding me.

heyams avatar Mar 14 '23 18:03 heyams

@abhikt48 i ran into an exception, any ideas?: Exception in thread "main" javax.jms.JMSSecurityException: ****Attempted to perform an unauthorized operation.**** TrackingId:88d4b3b29bb24bcbbe80fd1bf862ad61_G5, SystemTracker:gateway7, Timestamp:2023-03-15T19:22:05 [condition = amqp:unauthorized-access] at org.apache.qpid.jms.provider.exceptions.ProviderSecurityException.toJMSException(ProviderSecurityException.java:41) at org.apache.qpid.jms.provider.exceptions.ProviderSecurityException.toJMSException(ProviderSecurityException.java:27) at org.apache.qpid.jms.exceptions.JmsExceptionSupport.create(JmsExceptionSupport.java:80) at org.apache.qpid.jms.exceptions.JmsExceptionSupport.create(JmsExceptionSupport.java:112) at org.apache.qpid.jms.JmsConnection.createResource(JmsConnection.java:698) at org.apache.qpid.jms.JmsMessageConsumer.<init>(JmsMessageConsumer.java:125) at org.apache.qpid.jms.JmsMessageConsumer.<init>(JmsMessageConsumer.java:82) at org.apache.qpid.jms.JmsSession.createConsumer(JmsSession.java:490) at org.apache.qpid.jms.JmsSession.createConsumer(JmsSession.java:478) at org.apache.qpid.jms.JmsSession.createConsumer(JmsSession.java:470) at **com.abhi.test.ai.agent.sbus.storage.TestAiAgentWithMultiTransports.registerSbusMsgListener(TestAiAgentWithMultiTransports.java:131)** at com.abhi.test.ai.agent.sbus.storage.TestAiAgentWithMultiTransports.main(TestAiAgentWithMultiTransports.java:64) Caused by: org.apache.qpid.jms.provider.exceptions.ProviderSecurityException: Attempted to perform an unauthorized operation. TrackingId:88d4b3b29bb24bcbbe80fd1bf862ad61_G5, SystemTracker:gateway7, Timestamp:2023-03-15T19:22:05 [condition = amqp:unauthorized-access] at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToNonFatalException(AmqpSupport.java:173) at org.apache.qpid.jms.provider.amqp.builders.AmqpResourceBuilder.getOpenAbortExceptionFromRemote(AmqpResourceBuilder.java:299) at org.apache.qpid.jms.provider.amqp.builders.AmqpResourceBuilder.handleClosed(AmqpResourceBuilder.java:185) at org.apache.qpid.jms.provider.amqp.builders.AmqpResourceBuilder.processRemoteClose(AmqpResourceBuilder.java:129) at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:985) at org.apache.qpid.jms.provider.amqp.AmqpProvider.onData(AmqpProvider.java:871) at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:563) at org.apache.qpid.jms.transports.netty.NettyTcpTransport$NettyTcpTransportHandler.channelRead0(NettyTcpTransport.java:556) at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1534) at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1283) at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1330) at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508) at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at java.base/java.lang.Thread.run(Thread.java:834)

heyams avatar Mar 15 '23 19:03 heyams

I believe that I have all the resources setup successfully, blob, service bus, storage queue and mongo db running locally.

username = "RootManageSharedAccessKey", and password is using SBUS shared access primary key. Is this correct?

heyams avatar Mar 15 '23 19:03 heyams

@abhikt48 I'm able to repro

image

heyams avatar Mar 15 '23 23:03 heyams

@heyams Yes correct, you are using right credential for ServiceBus.

Glad, you are able to reproduce this issue.

Looking forward for resolution..

abhikt48 avatar Mar 16 '23 12:03 abhikt48

@abhikt48
SpanData{spanContext=ImmutableSpanContext{traceId=6b43768a1e847ddd66aad79269ffc643, spanId=a84ac109a62bf358, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, resource=Resource{schemaUrl=https://opentelemetry.io/schemas/1.18.0, attributes={host.arch="amd64", host.name="REDACTED", os.description="Windows 11 10.0", os.type="windows", process.command_line="C:\dev\zulu17.40.19-ca-jdk17.0.6-win_x64\bin\java.exe -javaagent:agent/applicationinsights-agent-3.4.11-SNAPSHOT.jar -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2022.2.3\lib\idea_rt.jar=52891:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2022.2.3\bin -Dfile.encoding=UTF-8", process.executable.path="C:\dev\zulu17.40.19-ca-jdk17.0.6-win_x64\bin\java.exe", process.pid=31392, process.runtime.description="Azul Systems, Inc. OpenJDK 64-Bit Server VM 17.0.6+10-LTS", process.runtime.name="OpenJDK Runtime Environment", process.runtime.version="17.0.6+10-LTS", service.name="unknown_service:java", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.24.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.mongo-3.1, version=1.24.0-alpha, schemaUrl=null, attributes={}}, name=aggregate ipt-test.stage1, kind=CLIENT, startEpochNanos=1679093603422608400, endEpochNanos=1679093603428358000, attributes=AttributesMap{data={db.statement={"aggregate": "stage1", "pipeline": [{"$match": {}}, {"$group": {"_id": "?", "n": {"$sum": "?"}}}], "cursor": {}, "$db": "?", "lsid": {"id": "?"}}, thread.id=51, db.system=mongodb, db.connection_string=mongodb://localhost:27017, net.peer.name=localhost, applicationinsights.internal.item_count=1, thread.name=Thread-20, net.peer.port=27017, db.name=ipt-test, db.operation=aggregate, db.mongodb.collection=stage1}, capacity=128, totalAddedValues=11}, totalAttributeCount=11, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=UNSET, description=}, hasEnded=true}

Looks like the parentSpanId is invalid (**0000000000000000**), that's why distributed tracing is broken. I need to follow up with OpenTelemetry java instrumentation maintainer to find out more. I tested the latest OpenTelemetry 1.24.0, same outcome.

heyams avatar Mar 17 '23 23:03 heyams

@abhikt48

You can achieve this by manually propagating context from the main thread to the new thread:

In your pom.xml, add this dependency:

<dependency>
			<groupId>io.opentelemetry</groupId>
			<artifactId>opentelemetry-api</artifactId>
			<version>1.24.0</version>
</dependency>

And then add the following TestAiAgentWithMultiTransports.java:

io.opentelemetry.context.Context context = io.opentelemetry.context.Context.current();
			Thread t1 = new Thread(new Runnable() {
			    @Override
			    public void run() {
					try (Scope scope = context.makeCurrent()) {
						uploadBlob(body);
						sendMsgToQueueStorage(body);
						countMongoDbDocuments();
					}
			    }
			});
			t1.start();

And then you will get what you need:

image

Please let me know the outcome. Thanks!

heyams avatar Mar 20 '23 17:03 heyams

Hi @heyams,

Thank you so much for providing the details. I did testing based on your code change suggestion and I can see dependency tree as well.

Unfortunately, this will not solve our problem. Because,

  1. we are facing this issue in MuleSoft ESB, where all components belong to a individual (MuleSoft) library
  2. Where we can't do modification, such as setting context in parent thread and access context in child thread

As per below image, we are receiving a message from ServiceBus (Thread T1) where we can't set context. We have control on child thread T2, where I can set context but that is not working as expected.

OpenTel_Context

I did testing with Java as well where I set context in child thread instead of parent thread but that is not working.

image

As you have rightly said that we are loosing parentSpanId - I printed context before and after switching the thread and realize that context is getting lost.

** contextBefore 'AgentContextWrapper{agentContext={opentelemetry-trace-span-key=SdkSpan{traceId=9bbb1*****}'
** contextAfter 'AgentContextWrapper{agentContext={}, applicationContext={}}'

Could you please advice something else which could solve this issue? Or worth to raise defect to opentelemetry? Please do the needful.

Regards, Abhishek Kumar

abhikt48 avatar Mar 21 '23 17:03 abhikt48

@abhikt48 Can you reach out to ServiceBus to propagate the context for your app? @trask might have better insights for the next step. It's not really a defect of OpenTelemetry as far as I understand.

heyams avatar Mar 23 '23 21:03 heyams

@heyams - I shared here only one example of a source which is ServiceBus. As part of our organization, we have more than 120+ applications where we have different-different sources such as FTP, ServiceBus, HTTP, Azure Storage, JMS, etc.

It is not feasible for us to modify all sources.

Can you please suggest -

  1. Which API is responsible for transferring the context from one Thread to another? applicationinsights-agent or opentelemetry-api?
  2. agentContext should be automatically transferred between threads - Is this the correct expectation?
  3. If this feature is not there then can you please provide this feature? If not default, then provide some option to pass a custom argument that enables this feature.

We have a very high usage of AI, this will be a very great feature for us. Please do the needful.

FYI - @trask

abhikt48 avatar Mar 24 '23 16:03 abhikt48

image

If you use image instead of MuleSoft, telemetry is auto collected by our agent. More details here

heyams avatar Mar 24 '23 16:03 heyams

@heyams Thanks for the feedback. I will do the testing with above SDKs and update feedback. I am on leave for next week, so i will update testing results after 3rd April.

Do we have any option where we can initiate completely new agentContext for child thread and bind all transaction of child thread?

image

abhikt48 avatar Mar 24 '23 18:03 abhikt48

@heyams - Apologies for the late feedback as I was on long leave.

I did testing with SBUS SDK - azure-messaging-servicebus:7.13.0 and it is not working as expected. I can't see dependency tree with all other calls.

Sample class to replicate this issue- TestAiWithLatestSbusSdk.java

Could you please try to replicate this issue?

Can I request you to provide this feature for all clients not only with latest Azure Libraries ? As you know AI is very widely used, so it will be very good feature.

Please let me know for any information.

Regards, Abhishek Kumar

abhikt48 avatar Apr 25 '23 16:04 abhikt48

Hi @heyams - Did you get a chance to take a look at the above issue?

abhikt48 avatar May 05 '23 10:05 abhikt48

@abhikt48 will look into it. thanks.

heyams avatar May 08 '23 17:05 heyams

@abhikt48 i'm seeing exceptions in the log (using servicebus 7.13.0)

java.lang.NoClassDefFoundError: com/azure/core/amqp/implementation/handler/ReceiverUnsettledDeliveries at com.azure.messaging.servicebus.implementation.ServiceBusReactorReceiver.(ServiceBusReactorReceiver.java:80) at com.azure.messaging.servicebus.implementation.ServiceBusReactorSession.createConsumer(ServiceBusReactorSession.java:170) at com.azure.core.amqp.implementation.ReactorSession.getSubscription(ReactorSession.java:618) at com.azure.core.amqp.implementation.ReactorSession.lambda$createConsumer$10(ReactorSession.java:398) at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1916) at com.azure.core.amqp.implementation.ReactorSession.lambda$createConsumer$11(ReactorSession.java:382) at com.azure.core.amqp.implementation.handler.DispatchHandler.onTimerTask(DispatchHandler.java:32) at com.azure.core.amqp.implementation.ReactorDispatcher$WorkScheduler.run(ReactorDispatcher.java:207) at org.apache.qpid.proton.reactor.impl.SelectableImpl.readable(SelectableImpl.java:118) at org.apache.qpid.proton.reactor.impl.IOHandler.handleQuiesced(IOHandler.java:61) at org.apache.qpid.proton.reactor.impl.IOHandler.onUnhandled(IOHandler.java:390) at org.apache.qpid.proton.engine.BaseHandler.onReactorQuiesced(BaseHandler.java:87) at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:206) at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324) at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:292) at com.azure.core.amqp.implementation.ReactorExecutor.run(ReactorExecutor.java:91) at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68) at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833) Caused by: java.lang.ClassNotFoundException: com.azure.core.amqp.implementation.handler.ReceiverUnsettledDeliveries at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520) ... 24 more

heyams avatar May 09 '23 19:05 heyams

Closing this issue for now, as we are focusing on Mule 4 with codeless agent instead of Mule 3.

Thank you !

abhikt48 avatar May 17 '24 10:05 abhikt48