react-native-skia icon indicating copy to clipboard operation
react-native-skia copied to clipboard

Segfault `RNSkia::RNSkOpenGLCanvasProvider::surfaceSizeChanged`

Open hannojg opened this issue 1 year ago • 9 comments

Description

We are seeing crashes in sentry from react-native-skia on android.

Screenshot 2024-04-22 at 15 14 32

We are also seeing another stack around, which is maybe related:

Screenshot 2024-04-22 at 15 33 52

Version

1.2.3

Steps to reproduce

Sorry, this is where it gets vague 🙇 We can't reproduce the crash on our phones.

All we know is that the crash happens when we navigate away from a screen where we render a rnskia surface.

Note: we see the crash happening also on earlier versions of skia. It was first reported to us in sentry on rnskia version: 1.0.4

Snack, code example, screenshot, or link to a repository

The stack is unfortunately not very long, I understand that a full reproduction is needed to help address that quickly. I just wanted to share the stack here, maybe you know from looking at the stack what could cause the crash?

hannojg avatar Apr 22 '24 13:04 hannojg

This is interesting. There might be a bug on our side. I am reviewing a couple of possible flow where this error could happen (surfaceDestroyed being invoked before surfaceSizeChanged, which may not be impossible due to some RN-Screens extra handling we're doing there). if we were to throw an error there instead of segfaulting that would already be useful right?

If cannot find anything obvious, would it be possible to review the app code privately? I would like to check which APIs are used and how they are being used. Do you have anything exotic there?

wcandillon avatar Apr 22 '24 13:04 wcandillon

would it be possible to review the app code privately?

Yes, we'd appreciate that! I send you a message on slack, thanks @wcandillon

We are not doing anything crazy, mostly just using the declarative react API.

hannojg avatar Apr 22 '24 14:04 hannojg

same here

"react-native": "0.72.5",
"@shopify/react-native-skia": "^1.2.3"

duhuajin avatar May 22 '24 09:05 duhuajin

StressTest.zip (cc @Nodonisko who's might interesting about this whole issue).

Above is a sample that I used stress test the Skia surfaces on Android but couldn't reproduce the error even under very harsh conditions. I'm sharing this because maybe you can play with it and find a way to reproduce the error.

That being said, the first error reported seems to be that surfaceSizeChanged can be invoked after surfaceDestroyed is invoked. The first thing we can do (and will do) is to simply check for it in surfaceSizeChanged. That should fix the first error.

How why does this error happens in the first place? I only have a vague sense of it. We have complexified the surface lifecycle logic in order to have a fast first time to frame and still supporting react-native-screens. We always had in mind to simplify the logic there and this bug report is probably a good call to do so. In newer API levels, we are able to support all of our use-cases without any complex logic (see #2128 and also surface control on API level 29).

I suggest the following:

  • [ ] Ship the nullptr check onsurfaceSizeChanged
  • [ ] @hannojg @Nodonisko Would it be possible to for you to ship a custom version of RN Skia (with a simplified android renderer) and find out of the error happens there? If yes we would know what do.
  • [ ] Investigated the second error that was mentioned in this issue (which does seem related).
  • [ ] There are a lot of opportunities to make our Android renderer leaner and better and now that I am seeing these issues, it looks like this is a good time to do it.

Let me know your thoughts on this.

wcandillon avatar Jun 10 '24 07:06 wcandillon

@wcandillon I will try to run that stress components you provided and see if I will be able to crash that.

We can probably ship special version of Skia to test it, only problem is that right now we are on min SDK 28. I guess we can go to 29 but we need to discuss it.

Nodonisko avatar Jun 10 '24 09:06 Nodonisko

min sdk 29 would only be for a specific use-case, we could discuss it, see what's the best plan of action is.

wcandillon avatar Jun 10 '24 09:06 wcandillon

Sure, let's do it!

Nodonisko avatar Jun 10 '24 10:06 Nodonisko

@wcandillon We've rolled out version 1.3.6 to our users and the issue still persists:

Screenshot 2024-06-24 at 10 12 30

hannojg avatar Jun 24 '24 08:06 hannojg

@hannojg @Nodonisko I believe to have found a path to a race condition where this error could happen. I will provide a test version for it. However, like for #2481, I am unable to reproduce the error myself (but we did fix the error there for instance).

@Nodonisko your finding that this is related to navigation screen was very helpful. I wrote the following stress test to try to reproduce the crash (unsuccessfully but maybe you have more insights into this):


export const IconsExample = () => {
  const assets = useLoadSVGs();

  const navigation = useNavigation();

  useEffect(() => {
    const interval = setInterval(() => {
      navigation.navigate("IconHome");
      setTimeout(() => {
        navigation.navigate("Settings");
      }, 10);
    }, 50);
    return () => clearInterval(interval);
  }, [navigation]);
  if (!assets) {
    return null;
  }
  return (
    <SVGContext.Provider value={assets}>
      <Tab.Navigator>
        <Tab.Screen name="IconHome" component={HomeScreen} />
        <Tab.Screen name="Settings" component={SettingsScreen} />
      </Tab.Navigator>
    </SVGContext.Provider>
  );
};

wcandillon avatar Jun 28 '24 06:06 wcandillon

@wcandillon That's very similar code I used, but I was very little successful too with reproducing crash.

Nodonisko avatar Jul 01 '24 08:07 Nodonisko

Since you're trying to tackle this problem, I would just like to add information I have for it:

  • Since we use react-native-skia we've had this issue in our app (reported by Sentry)
  • We never had a user reporting a crash for this page even though it is central to our app
  • Neither did we ever saw a crash after years of manual testing
  • Reported by sentry very unfrequently
  • Happens to plethora of android devices though
  • We don't use any setInterval shenanigans!

Hope this helps! Thank you for trying to fix this!

TwistedMinda avatar Jul 03 '24 15:07 TwistedMinda

This seems to be fixed for us in the 1.3.7 version 🎊 🎉 Thanks William!

hannojg avatar Jul 15 '24 07:07 hannojg