cpp_client_telemetry icon indicating copy to clipboard operation
cpp_client_telemetry copied to clipboard

Pause Manager causes events not being uploaded on FlushAndTeardown

Open maxgolov opened this issue 1 year ago • 2 comments

Problem Description

It seems like a change from 2021 is affecting the FlushAndTeardown sequence: https://github.com/microsoft/cpp_client_telemetry/pull/829

Steps to reproduce.

  • Set maxTeardownUploadTimeInSec to 2-3-5 seconds.
  • Run a very short-lived testapp.
  • Observe the FlushAndTeardown method spinning aimlessly and not actually uploading.

What is the expected behavior?

Expected behavior is that the 2-3 seconds are spent on trying to upload pending requests. Once there are zero events left, app should unblock and exit. Ideally with evt_stats event turned off, it should be relatively fast exit. Why need to turn evt_stats off - it's an oversubscribed event that'd generate 401/403, thus it's a waste. And should be removed out of the equation.

What is the actual behavior?

SDK spins and doesn't upload anything at the end of the run.

Additional context.

I suspect these pieces of code related to PauseGuard are causing the problem:

https://github.com/microsoft/cpp_client_telemetry/blob/0177277d1dc194a7703ca87d40587b6148608b45/lib/tpm/TransmissionPolicyManager.cpp#L105

Because activity is paused here at the beginning of the FlushAndTeardown method, without actually waiting for events cache to be fully drained: https://github.com/microsoft/cpp_client_telemetry/blob/0177277d1dc194a7703ca87d40587b6148608b45/lib/api/LogManagerImpl.cpp#L380

Instead of pausing - it's supposed to wait for the amount of time configured. Obviously FlushAndTeardown can't be called on mobile platforms on the main thread. It has to be called on the background thread. Calling this on foreground thread may cause ANR, and app getting killed.

The way it stands, it seems like short-lived apps and processes would experience "data loss", because their events get stuck and not attempted to be uploaded at the end of the short-lived app run.

maxgolov avatar Mar 30 '23 04:03 maxgolov