ApplicationInsights-Java

Startup impact of Application Insights on microservices

Open Sachin1O1 opened this issue 2 years ago • 49 comments

What's the minimum / maximum impact of Application Insights on microservice startup time?

Sachin1O1 avatar Dec 10 '21 10:12 Sachin1O1

any update?

Sachin1O1 avatar Dec 23 '21 09:12 Sachin1O1

hi @Sachin1O1, it depends a lot on the application and the compute resources. there's also additional overhead on Java 8 because the agent jar is signed, and the Java JIT compiler is not activated until later on in the startup process on Java 8 (this is not an issue on Java 11).

trask avatar Jan 03 '22 18:01 trask

Hi @trask, I have some stats on how my service startup time is impacted by the Application Insights jar

Service startup sample

| Sr no | With jar | Without jar |
|-------|----------|-------------|
| 1     | 260.21   | 102.6       |
| 2     | 170      | 103         |
| 3     | 274      | 180         |
| 4     | 165      | 99          |

Container resources

resources:
  limits:
    cpu: 300m
    memory: 700Mi
  requests:
    cpu: 100m
    memory: 400Mi

Java

Java 11 (OpenJ9)

As you can see, it increases startup time by almost 2x.
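To make the "almost 2x" estimate concrete, here is a minimal sketch that computes the per-run slowdown factor from the four samples above. The class and method names are made up for illustration, and the thread does not state the unit of the measurements, so only the unit-free ratios are computed. By this arithmetic the individual runs fall between roughly 1.5x and 2.5x, with a mean of about 1.8x.

```java
import java.util.Locale;

// Hypothetical helper: slowdown factor = startup time with agent / without agent.
final class StartupSlowdown {
    static double ratio(double withAgent, double withoutAgent) {
        return withAgent / withoutAgent;
    }

    // Arithmetic mean of the per-run ratios; each row is {withAgent, withoutAgent}.
    static double meanRatio(double[][] samples) {
        double sum = 0;
        for (double[] s : samples) {
            sum += ratio(s[0], s[1]);
        }
        return sum / samples.length;
    }

    public static void main(String[] args) {
        // Values copied from the table in this comment.
        double[][] samples = {
            {260.21, 102.6}, {170, 103}, {274, 180}, {165, 99}
        };
        for (double[] s : samples) {
            System.out.printf(Locale.ROOT, "%.2f / %.2f = %.2fx%n", s[0], s[1], ratio(s[0], s[1]));
        }
        System.out.printf(Locale.ROOT, "mean slowdown: %.2fx%n", meanRatio(samples));
    }
}
```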

Sachin1O1 avatar Jan 18 '22 07:01 Sachin1O1

I'm curious to know which version of the agent you were testing against. Did you try the latest GA/BETA version?

heyams avatar Jan 18 '22 20:01 heyams

Hi @heyams,

I am using Application Insights Java 3.2.4 (GA)

Sachin1O1 avatar Jan 19 '22 06:01 Sachin1O1

@trask @heyams any update on this?

Sachin1O1 avatar Feb 03 '22 08:02 Sachin1O1

i'm working on a debug startup profiler so that you can provide a thread dump to us for further debugging. we're not able to repro it.

heyams avatar Feb 04 '22 04:02 heyams

@Sachin1O1 can you use this 3.2.6-BETA-SNAPSHOT and help us collect thread dump from your app?

Please set -Dapplicationinsights.debug.startupProfiling to true in your jvm arguments: java -javaagent:applicationinsights-agent-3.2.6-BETA-SNAPSHOT.jar -Dapplicationinsights.debug.startupProfiling=true.

Let it run till your app has started. Then send me ([email protected]) the file called stacktrace.txt located in your temp folder. The full path is available in the applicationinsights.log. Something like C:\Users\${USER-NAME}\AppData\Local\Temp\applicationinsights
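If it helps to find the file, here is a minimal sketch that checks the location described above. It assumes the profiler writes `stacktrace.txt` into an `applicationinsights` folder under the JVM temp directory (`java.io.tmpdir`), which is inferred from the example path; the class name is hypothetical, and the path printed in applicationinsights.log remains the authoritative one. Run it as the same user and in the same container as the application so that the temp directory matches.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the agent.
final class StacktraceLocator {
    // Assumption: <tmpdir>/applicationinsights/stacktrace.txt, per the example path above.
    static Path expectedPath(String tmpDir) {
        return Paths.get(tmpDir, "applicationinsights", "stacktrace.txt");
    }

    public static void main(String[] args) throws Exception {
        Path p = expectedPath(System.getProperty("java.io.tmpdir"));
        if (Files.exists(p)) {
            System.out.println(p + " (" + Files.size(p) + " bytes)");
        } else {
            System.out.println("Not found at " + p + "; check applicationinsights.log for the real path");
        }
    }
}
```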

heyams avatar Feb 07 '22 21:02 heyams

@Sachin1O1 i updated the snapshot link above. earlier version was no good. please give it a try and get back to us when you can. thanks.

heyams avatar Feb 07 '22 23:02 heyams

@heyams Thanks for the profiler. I will update you with the relevant data ASAP. :)

Sachin1O1 avatar Feb 08 '22 08:02 Sachin1O1

Hi @heyams, we did some testing with the StartupProfiler, and we will share the stacktrace.txt with you on e-mail. But we encountered a warning and one exception.

First was the warning that InstrumentationKey is missing.

2022-02-11 13:27:10.002Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'

I was looking at the code, https://github.com/microsoft/ApplicationInsights-Java/blob/a829ca9ee3cf2572503c6bb726897cdda708fe57/agent/agent-tooling/src/main/java/com/microsoft/applicationinsights/agent/internal/init/LazyConfigurationAccessor.java#L65-L67

And it seems to originate here: https://github.com/microsoft/ApplicationInsights-Java/blob/a829ca9ee3cf2572503c6bb726897cdda708fe57/agent/agent-tooling/src/main/java/com/microsoft/applicationinsights/agent/internal/telemetry/ConnectionString.java#L106

Not sure if I'm in the correct place, but we haven't used the APPINSIGHTS_INSTRUMENTATIONKEY environment variable, but rather the APPLICATIONINSIGHTS_CONNECTION_STRING. And looking at the documentation for connection strings, it seems to me like it should be getting InstrumentationKey from the APPLICATIONINSIGHTS_CONNECTION_STRING if that's present in the environment, instead of warning about a missing variable.
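For illustration, this is roughly what deriving the key from the connection string looks like. It is a sketch only and not the agent's actual parser: it assumes the documented `Key=Value;Key=Value` connection string layout, the class and method names are made up, and the key in the fallback value is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the agent's ConnectionString class.
final class ConnectionStringSketch {
    // Splits "Key1=Value1;Key2=Value2" into a map with lower-cased keys.
    static Map<String, String> parse(String connectionString) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String part : connectionString.split(";")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed segments
            }
            String key = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            out.put(key, part.substring(eq + 1).trim());
        }
        return out;
    }

    // Returns null when the connection string carries no InstrumentationKey.
    static String instrumentationKey(String connectionString) {
        return parse(connectionString).get("instrumentationkey");
    }

    public static void main(String[] args) {
        String cs = System.getenv("APPLICATIONINSIGHTS_CONNECTION_STRING");
        if (cs == null) {
            // Placeholder value so the sketch runs without the variable set.
            cs = "InstrumentationKey=00000000-0000-0000-0000-000000000000";
        }
        // Only report presence; the key itself should not end up in logs.
        System.out.println("InstrumentationKey present: " + (instrumentationKey(cs) != null));
    }
}
```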

The other was an NPE in the capture thread dump method.

Exception in thread "StartupProfiler" java.lang.NullPointerException
	at com.microsoft.applicationinsights.agent.StartupProfiler$ThreadDump.captureThreadDump(StartupProfiler.java:106)
	at com.microsoft.applicationinsights.agent.StartupProfiler$ThreadDump.run(StartupProfiler.java:90)
	at java.base/java.lang.Thread.run(Thread.java:829)

At first glance I was unable to figure out if this case was handled, so I'm not sure if the thread dump exited prematurely.

kbjerke avatar Feb 11 '22 14:02 kbjerke

we fixed the following warning in 3.2.6 GA:

2022-02-11 13:27:10.002Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'

Please try 3.2.6 GA. StartupProfiler is available in that release as well.

heyams avatar Feb 11 '22 20:02 heyams

i also double checked the NPE in StartupProfiler.. it's inside a for loop. i definitely recommend trying out the 3.2.6 GA version and letting us know how it goes. thanks.

heyams avatar Feb 11 '22 20:02 heyams

Hi @heyams, we used the latest version, 3.2.6 GA, and the code snippet shared was from the latest code.

Sachin1O1 avatar Feb 14 '22 07:02 Sachin1O1

can you share your applicationinsights.log and applicationinsights.json with me at [email protected]?

heyams avatar Feb 14 '22 20:02 heyams

Hi @heyams

I can share it here

mycontainer:$ ls
agent.jar app.jar applicationinsights.json applicationinsights.log tmp
mycontainer:~$ cat applicationinsights.log
2022-02-15 13:20:08.232Z WARN c.m.a.a.i.telemetry.ConnectionString - Missing Statsbeat 'InstrumentationKey'
mycontainer:$ cat applicationinsights.json
{
  "customDimensions": {},
  "instrumentation": {
    "logging": {
      "level": "ALL"
    },
    "micrometer": {
      "enabled": false
    }
  },
  "preview": {
    "sampling": {
      "overrides": [
        {
          "attributes": [
            {
              "key": "http.url",
              "value": "(https?://[^/]+/([a-z]|[a-z]+-[a-z])+/monitor/+[a-z/]+$)|(https?://[^/]+/management/actuator/+[a-zA-Z/]+$)|(grpc.health.v1.Health/Check)",
              "matchType": "regexp"
            }
          ],
          "percentage": 1
        }
      ]
    }
  },
  "selfDiagnostics": {
    "destination": "file+console",
    "level": "WARN",
    "file": {
      "path": "applicationinsights.log",
      "maxSizeMb": 5,
      "maxHistory": 1
    }
  }
}

Sachin1O1 avatar Feb 15 '22 14:02 Sachin1O1

@Sachin1O1 you don't have a connection string? or did you omit it on purpose because you're sharing it publicly here? can you email me the applicationinsights.log?

heyams avatar Feb 15 '22 22:02 heyams

@Sachin1O1 I have tested it (3.2.6 GA) using connection string & instrumentation key env vars, not getting the NPE. log will be helpful in this case.

heyams avatar Feb 15 '22 23:02 heyams

@heyams I've seen that NPE before, we need to add a null check, e.g.

https://github.com/glowroot/glowroot/blob/c7f1c9cb993e796a55fef0ef54839dead8a78dcf/agent/core/src/main/java/org/glowroot/agent/live/ThreadDumpService.java#L63-L65
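A minimal sketch of that kind of null check, using only the standard `java.lang.management` API. `ThreadMXBean.getThreadInfo(long[], int)` returns a `null` element for any thread that is no longer alive, so a thread that exits between `getAllThreadIds()` and `getThreadInfo()` shows up as `null` in the array. The class name is hypothetical, and this is not the StartupProfiler code or the actual fix.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch of a null-safe thread dump.
final class ThreadDumpSketch {
    static String capture() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.getAllThreadIds();
        ThreadInfo[] infos = bean.getThreadInfo(ids, Integer.MAX_VALUE);
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread terminated after its id was collected
            }
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture());
    }
}
```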

trask avatar Feb 16 '22 01:02 trask

@Sachin1O1 can you try this snapshot. I fixed the NPE. (#2124)

heyams avatar Feb 16 '22 20:02 heyams

Hi @heyams, it worked and now I see no NPE. I can share the stacktrace.txt file now, but it is almost 300MB and still growing, so I suspect the profiler never stops and keeps writing to the stacktrace file.

Sachin1O1 avatar Feb 17 '22 11:02 Sachin1O1

it will run for 10 mins and then stop on its own. can you share the stacktrace.txt somewhere? blob storage? @Sachin1O1

heyams avatar Feb 17 '22 18:02 heyams

it's worth trying to zip and attach (or email), it will likely compress really well due to lots of duplicate stack traces
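If no zip tool is available in the container, a short sketch like the following can gzip the file with the JDK alone. The class and method names are made up for illustration; it needs Java 9+ for `InputStream.transferTo`, and it streams the file in chunks so a 300MB file is never held in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for compressing a large stack trace file.
final class GzipFileSketch {
    static void gzipFile(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out); // copies chunk by chunk
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipFileSketch.java <file>");
            return;
        }
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[0] + ".gz");
        gzipFile(source, target);
        System.out.println(Files.size(source) + " bytes -> " + Files.size(target) + " bytes");
    }
}
```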

trask avatar Feb 17 '22 18:02 trask

Hi @heyams and @trask, I compressed the stack trace and emailed it to [email protected].

Sachin1O1 avatar Feb 21 '22 13:02 Sachin1O1

@heyams @trask any update?

Sachin1O1 avatar Feb 25 '22 08:02 Sachin1O1

@trask @heyams any update here?

Sachin1O1 avatar Mar 17 '22 05:03 Sachin1O1

Will sync with @trask and get back to you soon.

heyams avatar Mar 17 '22 17:03 heyams

hey @Sachin1O1, we reviewed the stack traces and didn't find anything that we can address for your case in the short-term. there was one startup improvement we made in 3.2.7 in case you haven't updated to that yet.

trask avatar Mar 17 '22 19:03 trask

@trask do we have any info on the minimum resource requirement for the java agent?

Sachin1O1 avatar Apr 11 '22 07:04 Sachin1O1

> @trask do we have any info on the minimum resource requirement for the java agent?

do you mean expected startup impact? that can vary really widely from application to application, so it's really hard to give any specific guidance other than that you probably need to measure it on your specific application
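One simple way to take that measurement is to log the JVM uptime at the point where the application considers itself ready, then compare runs with and without `-javaagent`. The sketch below uses the standard `RuntimeMXBean.getUptime()`; the class name and the "ready" point are assumptions for illustration. Because uptime is counted from JVM start rather than from `main`, it should also capture time spent before the application's own code runs.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: call it where the application is considered started.
final class StartupTimer {
    // Milliseconds since the JVM started, as reported by the runtime MXBean.
    static long millisSinceJvmStart() {
        return ManagementFactory.getRuntimeMXBean().getUptime();
    }

    public static void main(String[] args) {
        // ... application initialization would happen here ...
        System.out.println("ready after " + millisSinceJvmStart() + " ms");
    }
}
```

Repeating the run several times in each configuration, under the same CPU and memory limits as production, gives a more trustworthy comparison than a single pair of runs.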

btw, there was some promising startup performance investigation and improvement done upstream just recently, I'm working on pulling that into Application Insights and will get you a new SNAPSHOT build to try out in the next couple of days

trask avatar Apr 11 '22 21:04 trask