stream-video-android

Crash - SIGSEGV (This issue has 2,546 crash events affecting 989 users)

Open buddy-dev-1337 opened this issue 8 months ago • 21 comments

Describe the bug This happens intermittently, just try to place a call to someone and we see this.

One interesting thing we learned from Crashlytics is that for all these crashes the memory in RAM is around 100-200MB at the time of the crash, which is odd because these are mid-to-high-end devices with 4-6GB of RAM.

Gemini Gist: A SIGSEGV (Segmentation Violation) in libjingle_peerconnection_so.so indicates a memory access violation within the WebRTC library. This usually means your code is trying to read or write to a memory address it shouldn't access – possibly because the memory is unallocated, already freed, or doesn't have the correct permissions. This often happens due to:

SDK version

  • (io.getstream:stream-video-android-ui-compose:1.5.0)

To Reproduce Steps to reproduce the behavior:

  1. Try to call someone with ringing true
  2. Decline the call from the other side
  3. Observe the crash

Expected behavior Should not crash the app.

Device:

  • Vendor and model: Vivo, Oppo, Xiaomi
  • Android version: Android 14, 13, 11, 10

Screenshots If applicable, add screenshots to help explain your problem.

Image

Logs

stacktrace.txt

buddy-dev-1337 avatar May 03 '25 06:05 buddy-dev-1337

Along the same lines as above we are also seeing crashes with

  1. SIGILL - This issue has 1,274 crash events affecting 722 users
  2. SIGABRT - This issue has 682 crash events affecting 291 users

One thing that stands out with all these issues is that memory in RAM is around 100-200MB, sometimes even less than 50MB at the time of the crash

buddy-dev-1337 avatar May 03 '25 06:05 buddy-dev-1337

Hi there, Thanks for the report.

I suggest you update to 1.6+, where an OOM and some race conditions were addressed.

Regards, Alex

aleksandar-apostolov avatar May 13 '25 09:05 aleksandar-apostolov

@aleksandar-apostolov we released with 1.6.1 but the issue still persists.

Here is the flow of the app:

  1. there is a CallConnectActivity which initiates a call, taking a userID from the intent
  2. this userID is fed into the call object using call.create (with ring true) and a timeout of 10s
  3. since the other side does not respond, we get a CallRejectedEvent, on which we perform call.end()
  4. delay(500)
  5. start from step 1 with a different userID from the queue
  6. the crash happens

Initially we were not calling call.end() when the other side did not respond and simply proceeded to call the next person in the queue, but that crashed as well.

Question - what is the right way to keep calling members from a queue? Expectation - try the 1st person from the queue; if the call connects, well and good. If not, what are the correct steps to close the previous call and start a new call to the next person in the queue? This crash only happens in this continuous flow.

Your help is highly appreciated. Our crash-free rate currently stands at 90% and we want to improve it.

buddy-dev-1337 avatar May 13 '25 15:05 buddy-dev-1337

Hi @buddy-dev-1337

You also need to call call.leave() to clean up internal state. Without knowing your integration, I would suggest, if you are not doing so already, creating a new call object before each attempt via client.call(type, id).
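To illustrate the shape of that advice, here is a minimal plain-Kotlin sketch of a queue dialer that builds a new call object for every attempt and cleans up the failed one before moving on. RingCall, CallFactory and dialQueue are hypothetical stand-ins invented for this example, not SDK types; in a real integration the factory would presumably wrap client.call(type, id) and the methods would be suspending.

```kotlin
import java.util.UUID

// Hypothetical stand-ins for the SDK types; only the shape matters here.
interface RingCall {
    val id: String
    fun ring(memberId: String): Boolean // true if the callee answered
    fun leave()                         // local cleanup of this attempt
}

fun interface CallFactory {
    fun create(type: String, id: String): RingCall
}

/** Tries each member in order; returns the first member who answers, or null. */
fun dialQueue(factory: CallFactory, type: String, queue: List<String>): String? {
    for (member in queue) {
        // A new call object with a fresh id for every attempt; nothing is reused.
        val call = factory.create(type, UUID.randomUUID().toString())
        var answered = false
        try {
            answered = call.ring(member)
            if (answered) return member
        } finally {
            // Clean up before the next attempt, but keep an answered call alive.
            if (!answered) call.leave()
        }
    }
    return null
}
```

The try/finally is there so that an attempt that throws is still cleaned up before the exception propagates.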

Regards, Alex

aleksandar-apostolov avatar May 15 '25 07:05 aleksandar-apostolov

@aleksandar-apostolov call.end() internally calls call.leave() so I can still use call.end() right?

/** ends the call for yourself as well as other users */
suspend fun end(): Result<Unit> {
    // end the call for everyone
    val result = clientImpl.endCall(type, id)
    // cleanup
    leave()
    return result
}

buddy-dev-1337 avatar May 16 '25 00:05 buddy-dev-1337

Yes, but call.end() marks the call as ended on the backend. If you want to continue using the call, you can call call.leave() or call.reject() on the side that needs to leave. Technically you can rejoin an ended call depending on permissions, but this depends on the use case.

Would you like to share how you integrate this? I need to know how you are "calling the next member".

Regards, Alex

aleksandar-apostolov avatar May 16 '25 08:05 aleksandar-apostolov

Hello @aleksandar-apostolov, sharing the code for the Activity that creates the call. If someone rejects or does not pick up, we move on to picking the next random member from the list.

I did upgrade the SDK to 1.6.1 but the issue persists.

Core functions of the Activity

  • initiateCall (responsible for initiating a call given an ID, on CallRejectedEvent we move to the below step)
  • startRandomFlow (picks random member from the list and initiates call with them)

Gist - https://gist.github.com/buddy-dev-1337/930041dfec734ba44bb69d3a7838370f

buddy-dev-1337 avatar May 17 '25 15:05 buddy-dev-1337

Hello @aleksandar-apostolov did you get the chance to look at the code in the Gist? Happy to share additional info if needed

buddy-dev-1337 avatar May 19 '25 17:05 buddy-dev-1337

Not yet, I will try to integrate your activity in our sample app to debug it. I will keep you posted.

Regards, Alex

aleksandar-apostolov avatar May 20 '25 10:05 aleksandar-apostolov

Hello @aleksandar-apostolov did you get the chance to integrate the code in the sample app?

buddy-dev-1337 avatar May 22 '25 04:05 buddy-dev-1337

Hello @aleksandar-apostolov one more insight that we got now is that the phone has enough RAM before the crash happens. Initially my understanding was that memory leaks might be causing this issue but that is not the case.

Adding detailed log that contains RAM info before the crash

live.buddycall.chat_issue_1a87749c9ef5361ec81c907ac1054ac9_crash_session_683377cf021f00017ebeb46cb4476c69_DNE_0_v2.log

buddy-dev-1337 avatar May 26 '25 01:05 buddy-dev-1337

Hello @aleksandar-apostolov is there any update on this? were you guys able to reproduce this? happy to give more details if needed

buddy-dev-1337 avatar May 29 '25 14:05 buddy-dev-1337

Hi @buddy-dev-1337

Thanks for sharing the gist and reporting the issue.

To help us debug this more effectively, could you please:

  1. Clean up the code in the shared gist by removing parts that are not related to the Stream SDK. This will make it easier to isolate the problem.

  2. Point us to the exact line you believe is causing the crash. As a tip, you can add logs at each key step to narrow it down further.

  3. From what I can see currently, the issue might be related to stale Call objects not being cleared properly, or perhaps they are being reused unknowingly. In particular, the usage of currentCall = call inside your suspend fun initiateCall(...) looks suspicious and may be contributing to the problem.

  4. I'd also suggest using a dedicated coroutine scope to observe call-related events, and making sure to dispose of that scope when it's no longer needed. Right now, it seems like the event observers are active indefinitely, which might be leading to unintended side effects. (Though I may be mistaken — feel free to correct me.)

  5. I also couldn't see the logic where you're launching the activity or UI to render the video or audio call. Could you clarify how the call screen is being launched after initiating a call?

  6. Lastly, please share after how many call attempts the crash typically occurs, and whether you're reusing the same Call object across those attempts.
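
The idea in point 4 (observers that live exactly as long as one call attempt) can be sketched without any SDK or coroutine dependency. The version below is a plain-Kotlin analogue that uses AutoCloseable subscriptions instead of a coroutine scope; EventSource and AttemptScope are hypothetical names made up for this example. With coroutines, the equivalent would be collecting events in a scope that is cancelled when the attempt is over.

```kotlin
// Hypothetical event source standing in for a call's event stream.
class EventSource<E> {
    private val listeners = LinkedHashSet<(E) -> Unit>()

    /** Registers a listener and returns a handle that unregisters it. */
    fun subscribe(listener: (E) -> Unit): AutoCloseable {
        listeners.add(listener)
        return AutoCloseable { listeners.remove(listener) }
    }

    /** Delivers the event to a snapshot of the current listeners. */
    fun emit(event: E) {
        listeners.toList().forEach { it(event) }
    }
}

/** Owns every subscription made for one call attempt. */
class AttemptScope : AutoCloseable {
    private val handles = mutableListOf<AutoCloseable>()

    fun <E> observe(source: EventSource<E>, listener: (E) -> Unit) {
        handles.add(source.subscribe(listener))
    }

    /** Unregisters all listeners; after this, stale observers cannot fire. */
    override fun close() {
        handles.forEach { it.close() }
        handles.clear()
    }
}
```

Closing the scope when an attempt finishes means a late event from the previous call cannot trigger another initiateCall.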

Looking forward to your response so we can help resolve this faster.

Best regards,

rahul-lohra avatar Jun 10 '25 05:06 rahul-lohra

Hello @rahul-lohra here are the updates:

  1. modified the gist to only contain Stream's call flow - https://gist.github.com/buddy-dev-1337/930041dfec734ba44bb69d3a7838370f

  2. it always crashes when we receive call ended or rejected event, on basis of these events we try to initiate another call to a random person from the queue

  3. we always make sure to call call.leave() before a new call is placed

  4. yes, I can try making this change in the next release

  5. it is just an intent launch being performed from a Fragment which passes the hostId, callType & hostName as String args

  6. this is intermittent, there is no pattern to reproduce this, locally I'm able to reproduce this very rarely 1-2 times but in production this is the TOP crash for us.

buddy-dev-1337 avatar Jun 15 '25 09:06 buddy-dev-1337

Thank you for cleaning up the code in the gist.

I had a few observations and a request:

  1. Would it be possible for you to share the crash stacktrace? That would be very helpful for us in narrowing down the root cause.

  2. Could you also share the manifest flag values applied on CallActivity? Tip: It’s best to share this from the merged manifest, as it reflects the final values after build-time merges.

  3. I suspect the following lines might be contributing to the crash:

  • startActivity(CallActivity::class.java)
  • (event is CallEndedEvent || event is CallRejectedEvent)

callResult
    ?.onSuccess {
        startActivity(CallActivity::class.java) // T
    }
    ?.onError {
        finish()
    }

call?.events?.collect { event ->
    if (event is CallEndedEvent || event is CallRejectedEvent) {
        call?.leave()
        initiateCall(callType, newHostID, newHostName) // Try the next person from the queue
    }
}

Looking at the above code, it seems possible that multiple instances of CallActivity could be launched if the previous one hasn't finished before a new one is started asynchronously.

Suggestion: To prevent this, consider ensuring that CallActivity is fully destroyed before calling startActivity(CallActivity::class.java) inside callResult?.onSuccess. One way to track the activity's lifecycle could be using [Application.ActivityLifecycleCallbacks](https://developer.android.com/reference/android/app/Application.ActivityLifecycleCallbacks). This would help you monitor the current state of CallActivity and avoid premature launches.
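
As a rough sketch of that suggestion, the gate below allows one live call screen at a time and defers a launch that arrives too early. It is plain Kotlin with a hypothetical CallScreenGate class; the assumption is that onScreenDestroyed() would be called from onActivityDestroyed in an Application.ActivityLifecycleCallbacks implementation and that launch would wrap startActivity. It is not thread-safe and assumes all calls happen on the main thread.

```kotlin
/** Allows one live call screen at a time; a launch requested too early is deferred. */
class CallScreenGate(private val launch: (String) -> Unit) {
    private var screenAlive = false
    private var pendingCallId: String? = null

    /** Launches now if no call screen is alive, otherwise remembers the latest request. */
    fun requestLaunch(callId: String) {
        if (screenAlive) {
            pendingCallId = callId
        } else {
            screenAlive = true
            launch(callId)
        }
    }

    /** Call this once the previous call screen has been destroyed. */
    fun onScreenDestroyed() {
        screenAlive = false
        val next = pendingCallId ?: return
        pendingCallId = null
        requestLaunch(next)
    }
}
```

Only the most recent deferred request is kept, on the assumption that an older pending call is no longer worth showing.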

rahul-lohra avatar Jun 16 '25 08:06 rahul-lohra

hello @rahul-lohra

  1. attaching 3 unique stack traces of the crash:
     live.buddycall.chat_issue_1a87749c9ef5361ec81c907ac1054ac9_crash_session_68513d000345000140c5ab402af4d153_DNE_0_v2_stacktrace.txt
     live.buddycall.chat_issue_1a87749c9ef5361ec81c907ac1054ac9_crash_session_6851556702b60001104b28c6e347cd15_DNE_0_v2_stacktrace.txt
     live.buddycall.chat_issue_1a87749c9ef5361ec81c907ac1054ac9_crash_session_6851615b0144000133110a3ed249ec69_DNE_0_v2_stacktrace.txt

  2. added manifest entry for CallActivity Image

  3. we use a Throttler before starting the CallActivity, so I don't think this will be causing the crash

throttler.publish {
    CallActivity.start(this, call?.id.toString(), call?.type.toString())
    finish()
}

let me know if anything else is required.

Also, I'd like to know what is the ideal way of handling the following use-case.

  1. Pick an ID to call, create the call with ring=true and timeout of 10s
  2. If the other participant does not pick up, try another ID with the same timeout as above
  3. In my current implementation I'm making sure StreamVideo is always the same and make sure we call call.leave() before we try the next call

what else am I missing? happy to get on call to give more context

buddy-dev-1337 avatar Jun 17 '25 13:06 buddy-dev-1337

Hi, thanks for your patience. I have some updates for you. I built a small example in our demo app that covers all your requirements. Based on this, I can recommend some best practices.

1. How to set the timer

You can set the timer in your project dashboard. Here are screenshots so that the steps are easy to follow.

Step 1 : Go to stream dashboard and select your app

Image

Step 2 : Go to Video and Audio on the left column and select Call type

Image

Step 3 : Select the respective call_type which you are using in the SDK

Image

Step 4 : Go to advanced settings

Image

Step 5 : Set auto-cancel timeout

Image

Done

Best coding practice to keep calling users after the call is ended or rejected

  1. It is very IMPORTANT to ensure the previous call is cleaned up before starting a new activity. So use a delay mechanism of 3 seconds like I did (you can use any library you find convenient)

  2. Please use this logic to collect only the first occurrence of CallEndedEvent or CallRejectedEvent, as sometimes you can get multiple


//REMOVE THIS

call?.events?.collect { event ->
    if (event is CallEndedEvent || event is CallRejectedEvent) {
        call?.leave()
        initiateCall(callType, newHostID, newHostName) // Try the next person from the queue
    }
}

//USE THIS
val event = myCall?.events?.first { event -> event is CallEndedEvent || event is CallRejectedEvent }

  3. You don't have to invoke call.leave() in this scenario, as it is done internally after the call is rejected or ended

val coroutineScope = rememberCoroutineScope()
Button(onClick = {
    coroutineScope.launch {
        startNewRandomCall(context)
    }
}) {
    Text("Start RandomCall")
}

private var myCall: Call? = null

suspend fun startNewRandomCall(context: Context): Unit = supervisorScope {
    val users = arrayListOf<String>("rahula", "rahulb", "rahulc")
    myCall = StreamVideo.instance().call("audio_call", UUID.randomUUID().toString())
    val streamCallId = StreamCallId.fromCallCid(myCall!!.cid)
    val members = arrayListOf<String>(users.random())

    val intent = StreamCallActivity.callIntent(
        context = context,
        cid = streamCallId,
        members = members,
        action = NotificationHandler.ACTION_OUTGOING_CALL,
        clazz = CallActivity::class.java,
    )

    delay(3_000L) //IMPORTANT to ensure the previous call cleaned up, before starting new activity
    context.startActivity(intent)

    launch(Dispatchers.IO) {
        Log.d(TAG, "observing myCall: ${myCall?.cid}")
        val event = myCall?.events?.first { event -> event is CallEndedEvent || event is CallRejectedEvent }
        if (event != null) {
            startNewRandomCall(context)
        }
    }
}
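
Since duplicate CallEndedEvent or CallRejectedEvent deliveries are possible, the "react to the first one only" idea can also be enforced with a small one-shot guard. The sketch below is plain Kotlin with a hypothetical event model (CallEvent, Rejected, Ended, Other) and a hypothetical TerminalOnce class; it is an alternative illustration of the same idea, not what events.first { ... } does internally.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical event model; the real SDK events carry more data.
sealed interface CallEvent
object Rejected : CallEvent
object Ended : CallEvent
object Other : CallEvent

/** Runs [onTerminal] for the first terminal event only; later ones are ignored. */
class TerminalOnce(private val onTerminal: (CallEvent) -> Unit) {
    private val fired = AtomicBoolean(false)

    fun handle(event: CallEvent) {
        val terminal = event is Rejected || event is Ended
        // compareAndSet succeeds for exactly one caller, even across threads.
        if (terminal && fired.compareAndSet(false, true)) onTerminal(event)
    }
}
```

One guard instance would be created per call attempt, so that the next attempt starts with a fresh, unfired guard.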

Hope this helps 👍

rahul-lohra avatar Jun 27 '25 05:06 rahul-lohra

hello @rahul-lohra I've made the changes you've suggested and updated the gist

https://gist.github.com/buddy-dev-1337/930041dfec734ba44bb69d3a7838370f

Please let me know if this looks correct; based on that, we'll make another release.

buddy-dev-1337 avatar Jul 01 '25 16:07 buddy-dev-1337

@buddy-dev-1337 The code looks good to me. However, I have a few questions about the following code:

  • Is CallActivity from GetStream video sdk or yours ?
  • If it is, then why isn’t the 3-second delay applied before launching the activity?
  • Why is call?.leave() not invoked when consuming CustomVideoEvent? Is it not required?
  • Why is CustomVideoEvent triggered?

 when (event) {
      is CustomVideoEvent -> {
        val payload = event.custom["call_started"]
        if (payload != null) {
          throttler.publish {
            CallActivity.start(this, call?.id.toString(), call?.type.toString())
            finish()
          }
        }
      }

rahul-lohra avatar Jul 02 '25 05:07 rahul-lohra

Hello @rahul-lohra

  1. CallActivity is our custom activity where we run the call

  2. The 3s delay is not required when the user places the very first call. For subsequent calls, we add the delay in the startRandomFlow function so that the previous call is cleaned up before the new call is placed

  3. We rely on CustomVideoEvent to let the other participant know that they can now join the call ("call_started"). call.leave() is not required here, as we are joining the call

Let me know if this clarifies things or if there are any more questions

buddy-dev-1337 avatar Jul 02 '25 18:07 buddy-dev-1337

Hi @buddy-dev-1337 All right, looks good to me.

rahul-lohra avatar Jul 07 '25 09:07 rahul-lohra