opentelemetry-js icon indicating copy to clipboard operation
opentelemetry-js copied to clipboard

Auto shutdown in collector-exporter can hurt performance in some browsers

Open kkruk-sumo opened this issue 4 years ago • 16 comments
trafficstars

  • [X] This only affects the JavaScript OpenTelemetry library
  • [ ] This may affect other libraries, but I would like to get opinions here first

Hi 👋

@opentelemtry/exporter-collector by default adds event listener for the unload event on browser platforms. According to MDN it can hurt website's performance when doing back/forward buttons:

Additionally, the unload event is incompatible with the back/forward cache (bfcache) implemented in modern browsers. Some browsers, such as Firefox, handle this incompatibility by excluding pages from the bfcache if they contain unload handlers, thus hurting performance. See Avoid unload and beforeunload for more.

Also I'm not sure what's the purpose of this behaviour? shutdown only returns a promise of already exported items, so doing it on unload doesn't make more sense, does it? See the code I'm referring to.

Additionally, onShutdown is empty for node platform, so we could just remove the onInit and onShutdown functionalities. Am I missing something?

kkruk-sumo avatar May 13 '21 15:05 kkruk-sumo

CC @davidwitten and @dyladan

kkruk-sumo avatar May 13 '21 15:05 kkruk-sumo

Agree we shouldn't "shutdown" on unload, but we should perform a synchronous flush to ensure all telemetry is sent -- unless we have some sort of backing store (disk for node, and LocalStorage / IndexedDb (Slow!!!)) to ensure we can replay any unsent events on reload etc

MSNev avatar Jun 11 '21 18:06 MSNev

Is there any way I can flush synchronously?

vanand-conga avatar Nov 11 '21 08:11 vanand-conga

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions[bot] avatar Jun 13 '22 06:06 github-actions[bot]

Is this still valid after https://github.com/open-telemetry/opentelemetry-js/pull/2243 is landed?

legendecas avatar Jun 14 '22 08:06 legendecas

I think we are still using the unload event right? So the performance issue would still be there.

dyladan avatar Jun 15 '22 12:06 dyladan

Is there any way I can flush synchronously?

No sorry

dyladan avatar Jun 15 '22 15:06 dyladan

Hi - Any update on this? This bug hurting us is causing low score of our website by PageSpeed Insights.

vipulkv avatar Jul 25 '22 10:07 vipulkv

Is this still valid after #2243 is landed?

It's still valid. #2243 was about BatchSpanProcessor, and the unload listener is still in otlp-exporter https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/otlp-exporter-base/src/platform/browser/OTLPExporterBrowserBase.ts#L60 .

kkruk-sumo avatar Jul 25 '22 13:07 kkruk-sumo

Agree we shouldn't "shutdown" on unload, but we should perform a synchronous flush to ensure all telemetry is sent -- unless we have some sort of backing store (disk for node, and LocalStorage / IndexedDb (Slow!!!)) to ensure we can replay any unsent events on reload etc

@MSNev how would you propose to perform a synchronous flush?

dyladan avatar Jul 25 '22 14:07 dyladan

For browsers runtimes, there is not a lot of choice to ensure that we can get traces of the box, general options are

  • synchronous XHR (blocks UI, but also getting disabled by modern browsers) -- doesn't have size limits, required for IE
  • sendBeacon -- has 64kb payload limit. And this is "unsent" not per-request (you can't just do lots of sendBeacon requests)
  • fetch (with keep-alive) -- has same issue as sendBeacon
  • XHR / fetch (for larger payloads) -- using these are extremely problematic in that browsers can abort requests before the full payload is sent (resulting in nothing but the headers getting sent -- or nothing). I've seen this many times internally. So really not a viable option.

I don't know, if node (http) suffers from these same issues -- I would assume not.

Resiliency, by using LocalStorage / SessionStorage is really the only option (as IndexedDb is an asynchronous API). But that too can cause storage issues / limits for the domain. eg. If the hosting application (page) also needs to store "lots" of data we can't really store too much.

MSNev avatar Jul 25 '22 19:07 MSNev

There are not really similar issues in node, but there are different ones. There is no way to make a synchronous network request in node without hacky solutions like calling a subprocess synchronously.

dyladan avatar Jul 26 '22 13:07 dyladan

Isn't sendBeacon and fetch both async? The only sync option is sync XHR which has problems like you mentioned

dyladan avatar Jul 26 '22 14:07 dyladan

Isn't sendBeacon and fetch both async? The only sync option is sync XHR which has problems like you mentioned

I believe sendBeacon API is sync however the data is sent asynchronously, you just never get the callback. It ensentially just push the request into a queue that is handled even if the page is closed.

vmarchaud avatar Jul 27 '22 19:07 vmarchaud

Got it. So sendBeacon is our best bet on shutdowns and if we have to use xhr we just have to hope

dyladan avatar Jul 28 '22 18:07 dyladan

just checking, can we use the visibilityChange event instead? This is in the MDN documentation for the sendBeacon. If possible, though this would definitely add to the server handling, you could send over multiple sendBeacon requests. I know not ideal but still possible.

So overall, to the original, we should use the visibilityChange event as this is also helpful for mobile (another reason to avoid `unload).

jscherer92 avatar Aug 31 '22 01:08 jscherer92

Closing this as #4438 has landed. If this is still an issue, please reach out and I'll re-open.

pichlermarc avatar Mar 08 '24 17:03 pichlermarc