sentry-java

Android transactions lasting for days, TTID halfway in, all spans at the very end.

Open maxkosty opened this issue 1 year ago • 7 comments

Integration

sentry-android

Build System

Gradle

AGP Version

unknown

Proguard

Enabled

Version

7.2.0

Steps to Reproduce

Customer is observing transactions lasting for multiple days (👉 see linked Jira issue for examples).

About 60-70% of all transactions over 1 hour (about 1% of total transactions) exhibit this pattern:

  • 2 days until TTID,
  • (sometimes) then another day of no spans
  • and finally normal spans at the end, culminating in the transaction being finished (screenshot: 3day-transaction)

(This is the main pattern, and the one we don't have an explanation for. There's also a second, less common pattern among extra-long transactions, where all spans happen early on except for one long file.read that delays TTID. That pattern is out of scope for this report; please ignore it.)

Below is customer's configuration:

Settings set during application oncreate via dagger/anvil DI

override fun configure(options: SentryAndroidOptions) {
        dsnValidator.ensureDsnIsValid(sentryOptions.dsn)
        options.dsn = sentryOptions.dsn
        options.isAttachScreenshot = false
        options.isAnrEnabled = sentryFeatureGates.get().isAnrEnabled()
        options.tracesSampleRate = sentryFeatureGates.get().getTracesSampleRate()
        options.isAttachAnrThreadDump = true
        options.environment = buildInfo.getBuildChannel().name.lowercase()
        options.setBeforeSend { event, _ ->
            tagDecorators.forEach {
                event.setTag(it.key, it.value.get())
            }
            event
        }

        options.addInAppInclude("com.redactedcompanyname")

        // Remove the Timber integration so that we can use our own
        options.integrations.removeIf { it is SentryTimberIntegration }

        if (sentryFeatureGates.get().isFragmentsEnabled()) {
            options.addIntegration(
                FragmentLifecycleIntegration(
                    context.applicationContext as Application,
                    enableFragmentLifecycleBreadcrumbs = true,
                    enableAutoFragmentLifecycleTracing = true,
                ),
            )
            options.isEnableUserInteractionTracing = true
            options.isEnableUserInteractionBreadcrumbs = true
        }
}

// MANIFEST
    <application>
        <provider
            android:name="io.sentry.android.core.SentryInitProvider"
            android:authorities="${applicationId}.SentryInitProvider"
            tools:node="remove" />

        <provider
            android:name="io.sentry.android.core.SentryPerformanceProvider"
            android:authorities="${applicationId}.SentryPerformanceProvider"
            tools:node="remove" />
    </application>

Gradle Plugin

    includeProguardMapping.set(true)
    includeDependenciesReport.set(true)
    includeNativeSources.set(true)
    includeSourceContext.set(true)

    tracingInstrumentation.enabled.set(true)
    autoInstallation.enabled.set(true)

Expected Result

Transaction start coincides with start timestamp of the first span.

Actual Result

Start timestamp of the first span lags days behind start of transaction.
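
The gap between the expected and actual results can be expressed as the lag between the transaction start and the earliest span start. Below is a minimal sketch of that check, assuming timestamps are available as epoch seconds. `SpanTiming`, `TransactionTiming`, `firstSpanLag`, `isSuspicious`, and the one-hour threshold are hypothetical names chosen for illustration; none of them are part of the Sentry SDK.

```kotlin
import java.time.Duration

// Hypothetical, simplified view of a span: only its operation and start time (epoch seconds).
data class SpanTiming(val op: String, val startEpochSeconds: Double)

// Hypothetical, simplified view of a transaction and its child spans.
data class TransactionTiming(
    val startEpochSeconds: Double,
    val endEpochSeconds: Double,
    val spans: List<SpanTiming>,
)

// Lag between the transaction start and the earliest span start, or null if there are no spans.
fun firstSpanLag(tx: TransactionTiming): Duration? {
    val firstStart = tx.spans.minOfOrNull { it.startEpochSeconds } ?: return null
    val lagMillis = ((firstStart - tx.startEpochSeconds) * 1000).toLong()
    return Duration.ofMillis(lagMillis)
}

// Flags the pattern from this report: the first span starts long after the transaction does.
fun isSuspicious(tx: TransactionTiming, threshold: Duration = Duration.ofHours(1)): Boolean {
    val lag = firstSpanLag(tx) ?: return false
    return lag > threshold
}

fun main() {
    val day = 24 * 60 * 60.0
    val healthy = TransactionTiming(0.0, 2.0, listOf(SpanTiming("ui.load", 0.1)))
    val stuck = TransactionTiming(0.0, 3 * day, listOf(SpanTiming("ui.load", 2 * day)))
    println(isSuspicious(healthy))
    println(isSuspicious(stuck))
}
```

A check like this could be run over exported transaction data to separate the main pattern (large first-span lag) from the out-of-scope one (spans early on, one long file.read).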

┆Issue is synchronized with this Jira Improvement by Unito

maxkosty, Apr 05 '24 17:04

FWIW, the same customer is also experiencing https://github.com/getsentry/sentry-java/issues/3145

maxkosty, Apr 06 '24 00:04

@stefanosiano theorized that this could be something related to content providers and also noticed that there are several starting points (and app start transactions) for the app in which it was reported (see linked Jira).

maxkosty, Apr 08 '24 17:04

@stefanosiano @markushi were you able to reproduce this? in any case, let's prio this next

kahest, Apr 29 '24 10:04

Please note, in case this is relevant: there are multiple HTTP requests in this transaction, all showing deadline_exceeded for the body, although the headers were OK.

I've added the conversation with the customer to the linked Jira issue.

maxkosty, Apr 30 '24 00:04

Maybe this can be combined with https://github.com/getsentry/sentry-java/issues/3084

markushi, May 29 '24 13:05

I looked a bit deeper into the activities that are most likely to manifest this bug, and I think it may be related to finishing the activity in the onCreate block, which results in onResume never getting called. Example:

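override fun onCreate(savedInstanceState: Bundle?) {
  super.onCreate(savedInstanceState)
  if (someCondition) {
    // Redirect to another activity and finish this one before it is ever resumed
    startActivity(intent)
    finish()
  }
  // other code
}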
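
To illustrate why this pattern could leave a transaction open, here is a minimal, self-contained sketch. It assumes a tracker that opens a transaction when an activity is created and closes it only when the activity is resumed. `FakeTransaction`, `LifecycleTracker`, and `simulateFinishInOnCreate` are hypothetical stand-ins for illustration, not the SDK's actual implementation.

```kotlin
// Hypothetical stand-in for a transaction; not the Sentry SDK's type.
class FakeTransaction(val name: String) {
    var finished = false
        private set

    fun finish() {
        finished = true
    }
}

// Hypothetical tracker that, by assumption, starts a transaction when an activity is
// created and finishes it only when the activity is resumed.
class LifecycleTracker(private val finishOnDestroy: Boolean) {
    private val open = mutableMapOf<String, FakeTransaction>()

    fun onCreated(activity: String): FakeTransaction =
        FakeTransaction(activity).also { open[activity] = it }

    fun onResumed(activity: String) {
        open.remove(activity)?.finish()
    }

    fun onDestroyed(activity: String) {
        // Optional safety net: close anything still open when the activity goes away.
        if (finishOnDestroy) open.remove(activity)?.finish()
    }
}

// Simulates an activity that calls finish() inside onCreate: Android then goes
// straight to onDestroy, so the resumed callback never reaches the tracker.
fun simulateFinishInOnCreate(tracker: LifecycleTracker): FakeTransaction {
    val tx = tracker.onCreated("RedirectActivity")
    tracker.onDestroyed("RedirectActivity")
    return tx
}

fun main() {
    println(simulateFinishInOnCreate(LifecycleTracker(finishOnDestroy = false)).finished)
    println(simulateFinishInOnCreate(LifecycleTracker(finishOnDestroy = true)).finished)
}
```

Under these assumptions, the tracker without the destroy-time safety net leaves the transaction open until something else finishes it, which would be consistent with the long-lived transactions described above.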

devPalacio, May 29 '24 13:05

@devPalacio thanks for this detail, highly appreciated - we'll look into this

kahest, May 29 '24 14:05

@realkosty @devPalacio The fix was released in SDK version 7.12.1. Please reopen this issue if the problem persists.

stefanosiano, Jul 25 '24 13:07