plugins
plugins copied to clipboard
[google_maps] delay mapView destruction by one frame to prevent NullpointerException
This PR fixes an issue where the google_maps_flutter packages throws a platform error, which causes the app to crash.
Most of the explanation below can also be read here, as well as the discussions around it: https://github.com/flutter/flutter/issues/105965
This is the error that is thrown:
E/AndroidRuntime( 5262): FATAL EXCEPTION: GLThread 7073
E/AndroidRuntime( 5262): Process: org.vinitor.app, PID: 5262
E/AndroidRuntime( 5262): java.lang.NullPointerException: Attempt to get length of null array
E/AndroidRuntime( 5262): at java.nio.ByteBufferAsIntBuffer.put(ByteBufferAsIntBuffer.java:122)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.gl.buffer.n.i(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):2)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.gl.buffer.n.d(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):3)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.gl.drawable.y.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):16)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.gl.drawable.ao.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):11)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.bz.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):29)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.bs.b(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):151)
E/AndroidRuntime( 5262): at com.google.maps.api.android.lib6.gmm6.vector.av.run(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (190800-0):48)
I/Process ( 5262): Sending signal. PID: 5262 SIG: 9
It seems that this problem is caused by disposing the GoogleMap Widget to quickly after creating it. This can be the case, if the widget is inside a tab and tabs are quickly switched or inside a route where the route is popped quickly after navigation.
I have created a repo to demonstrate and reproduce this issue here: https://github.com/lukaskurz/gmaps_crash_demo Using the example app, I noticed that the bug is related to the initalPosition, namely if the initialPosition has zoomLevel set, which (i assume) requires a camera animation after creating the widget. Tab 1 and 2 have a zoomLevel set, while 3 and 4 have not. One can observe that switching between 3-4 quickly does not cause any issues, while 1 and 2 will trigger the bug.
https://user-images.githubusercontent.com/22956519/178498796-64469b74-816f-424c-8e5c-571e6f789449.mp4
After doing some more debugging (i.e. putting hundreds of Log.d calls everywhere in the code), I figured out that removing the mapView.onDestroy() stops the issue from happening.
private void destroyMapViewIfNecessary() {
Log.d("ShortyDebug", "destroyMapViewIfNecessary");
if (mapView == null) {
return;
}
//mapView.onDestroy();
mapView = null;
Log.d("ShortyDebug", "destroyMapViewIfNecessary done");
}
Which makes sense, since the bug is likely triggered by the GL renderer accessing an array, which has already been garbage collected, so by not triggering the GC for the mapView, the bug does not happen. This is hardly a fix though, since we now probably have a memory leak on our hands.
A comment in the code, mentioned by @martin-braun, talks about using a 2 frame delay for some animation related stuff, which got me the idea of delaying the onDestroy by 1 frame, which as it turns out fixes the bug. Since the mapView.onDestroy() leads into the proprietary part of the code, which I cannot see or read, I can now only speculate why and how this fixes the problem:
The GL renderer is likely still accessing the view and its variable, not knowing that view is on it's way to be marked for GC (since this likely happens a bit further down the codepath). It triggers a camera animation (or something similar) for the initialPosition, which will not throw any error or be cancelled immediately, because the view is still not marked for GC (but will be a few steps later). But if we mark everything as disposed, while delaying the actual onDestroy of the view by 1 frame, then during the next frame calculation, the animation code in the rendered can see that it should cancel the animation, since the widget is being disposed and only after that cancelling do we actually destroy the view.
Atleast that's how I imagined/pieced this together in my head, with the little understanding about engine rendering loops I have from some old game engine experience. I don't actually know what happens here and the background, but the fix seems to work, so I guess there might be something to my theory :sweat_smile:
TLDR; Quickly pushing and popping pages containing a GoogleMaps Widget, can trigger a NullpointerException in it's rendering code causing the app to crash. This is likely caused by part of the disposal code eagerly destroying stuff, while part of the initialization is still running and using now GCed variables. The fix delays part of the disposal code by 1 frame, giving the initialization enough time to realize stuff is no longer available and it has to stop.
Fixes https://github.com/flutter/flutter/issues/105965 There also plenty of issues regarding this on the google maps issue tracker (mentioned in above issue). So there is likely a problem inside Google's code, which they have to fix themselves.
Pre-launch Checklist
- [X] I read the Contributor Guide and followed the process outlined there for submitting PRs.
- [X] I read the Tree Hygiene wiki page, which explains my responsibilities.
- [X] I read and followed the relevant style guides and ran the auto-formatter. (Unlike the flutter/flutter repo, the flutter/plugins repo does use
dart format.) - [X] I signed the CLA.
- [X] The title of the PR starts with the name of the plugin surrounded by square brackets, e.g.
[shared_preferences] - [X] I listed at least one issue that this PR fixes in the description above.
- [X] I updated
pubspec.yamlwith an appropriate new version according to the pub versioning philosophy, or this PR is exempt from version changes. - [X] I updated
CHANGELOG.mdto add a description of the change, following repository CHANGELOG style. - [X] I updated/added relevant documentation (doc comments with
///). - [X] I added new tests to check the change I am making, or this PR is test-exempt.
- [X] All existing and new tests are passing.
If you need help, consider asking for advice on the #hackers-new channel on Discord.
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).
If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?
Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.
@lukaskurz It wants a proper formatting of the file. This command should fix it:
patch -p1 <<DONE
diff --git a/packages/google_maps_flutter/google_maps_flutter/android/src/main/java/io/flutter/plugins/googlemaps/GoogleMapController.java b/packages/google_maps_flutter/google_maps_flutter/android/src/main/java/io/flutter/plugins/googlemaps/GoogleMapController.java
index 559fc1281..a126edd11 100644
--- a/packages/google_maps_flutter/google_maps_flutter/android/src/main/java/io/flutter/plugins/googlemaps/GoogleMapController.java
+++ b/packages/google_maps_flutter/google_maps_flutter/android/src/main/java/io/flutter/plugins/googlemaps/GoogleMapController.java
@@ -889,10 +889,11 @@ final class GoogleMapController
}
// This fixes an issue where the mapView is still being used by the render thread after disposal.
// Delaying the actual mapView disposal to the next frame avoids the issue.
- postFrameCallback(() -> {
- mapView.onDestroy();
- mapView = null;
- });
+ postFrameCallback(
+ () -> {
+ mapView.onDestroy();
+ mapView = null;
+ });
}
public void setIndoorEnabled(boolean indoorEnabled) {
DONE
maybe it will also correct itself properly when you format the file in Android Studio. I think the default formatting follows Google's standard.
Then, the CHANGELOG.md needs to be updated as well and we are good to go (see below).
Thank you so much for doing this. 👍🏻
Per the comment I just left in the issue, we should figure out if using Hybrid Composition would avoid the need for this hack.
@stuartmorgan Maybe even Hybrid Composition is not needed if this issue is fixed in the onDispose implementation of the map view. However, this is a critical P4 issue, so we really need a hotfix at this point.
My personal opinion is this: Let's push this "hack" and revert it as soon as it's fixed in the native implementation. If this thing takes longer than expected, one might fork and republish this wrapper on pub.dev potentially locking projects in a custom legacy version of google_maps_flutter.
My app uses hybrid composition and I still see this error: AndroidGoogleMapsFlutter.useAndroidViewSurface = true;
My app uses hybrid composition and I still see this error: AndroidGoogleMapsFlutter.useAndroidViewSurface = true;
In Flutter 3.0 the docs aren't correct any more; the behavior of Android platform views changed and the plugin docs haven't been updated accordingly yet.
Maybe even Hybrid Composition is not needed if this issue is fixed in the
onDisposeimplementation of the map view.
This bug isn't why we're planning on switching the plugin to Hybrid Composition at the moment.
However, this is a critical P4 issue, so we really need a hotfix at this point.
I don't know what "hotfix" means in this context. Our plugins don't have branches or a channel system, and all changes are released immediately under our current release model.
My personal opinion is this: Let's push this "hack"
I don't know why you are putting "hack" in quotes here given that the PR description clearly states that the theory about what is happening here is completely speculative. We have, currently, no idea what the specific race is here, or whether this code will actually work (i.e., whether the race timing is actually tied to frames) or just happens to work in the author's manual tests so far.
Notably, it turns out that the code referred to here as having been the inspiration for this PR:
A comment in the code, mentioned by @martin-braun, talks about using a 2 frame delay for some animation related stuff, which got me the idea of delaying the onDestroy by 1 frame
does not work correctly. That hack turns out to only work on faster devices, and will need to be removed. It wasn't an actual solution.
and revert it as soon as it's fixed in the native implementation.
In what native implementation? We don't know what the cause of the bug is yet, or where exactly it's coming from. There's a sentence in the PR description that suggests that this is a bug in the maps SDK itself, but according to comments in the issue this only happens in Flutter 3.0.
If this thing takes longer than expected, one might fork and republish this wrapper on pub.dev potentially locking projects in a custom legacy version of
google_maps_flutter.
It is always the case that someone might fork a plugin, and this argument can be applied to literally every PR that's ever submitted. It's not a reason for landing code that doesn't yet meet our standards.
It's not a reason for landing code that doesn't yet meet our standards.
Does this mean you don't see this PR being merged to address the issue ? I can understand your viewpoint, that this change is based on speculation and thus is not guaranteed to work 100% of the time. Even if it were a 100% fix, there's no way to know (atleast for us non-google engineers), since we can't further investigate than the SDK, right ?. Given the severity of this bug however, how would recommend to move forward here ?
If this really is a SDK issue (which they don't think it is), there's nothing we can really do, other than "bandaid" it until they fix it.
Does this mean you don't see this PR being merged to address the issue ?
Not without more investigation so that we more clearly understand the cause. Certainly not without a reasonable explanation of why this is specific to Flutter 3.0, for instance, since answering that question would likely give us a lot more information about what the problem actually is.
Also not without understanding how it will interact with HC, since as mentioned above we are currently planning on switching the plugin to HC shortly, which apparently completely changes the problem.
If this really is a SDK issue (which they don't think it is), there's nothing we can really do, other than "bandaid" it until they fix it.
If we actually determine that, then a workaround would be reasonable, but we haven't.
Also not without understanding how it will interact with HC, since as mentioned above we are currently planning on switching the plugin to HC shortly, which apparently completely changes the problem.
Well I could at least test it using the demo app and see what happens.
In Flutter 3.0 the docs aren't correct any more; the behavior of Android platform views changed and the plugin docs haven't been updated accordingly yet.
This means I probably used the Hybrid Composition wrong, when I mentioned it in the issue. Are the docs already updated or could you tell me how I can use HC in Flutter 3.0, so I can test it in the demo app ? Thanks :+1:
could you tell me how I can use HC in Flutter 3.0
This comment (which your reply there says you were doing in your test) does that. Specifically, the change to initExpensiveAndroidView is the only way to actually use HC in Flutter 3.0.
There's a sentence in the PR description that suggests that this is a bug in the maps SDK itself, but according to comments in the issue this only happens in Flutter 3.0.
This issue on Google is definitely the same issue as well, but we cannot know if Flutter was used by the person who reports. However, React Native is affected as well, so either both frameworks fall into the same pitfall, or the issue is indeed in the underlying SDK.
I don't know why you are putting "hack" in quotes here
I don't mean to imply that it definitely is something else than a hack, I do imply that I do not know if this can be considered a hack, so I put it in quotes to borrow your term. Getting hung up on terminology makes little sense from my point of view right now.
In regards of the rest of your statement, I'd like to accept that. These are fair and valid points, given the wide range of devices this code needs to work on.
I enabled this workaround and now get a new error. This happens in the background when exiting the app, so the user does not see it. However, it shows up in crashlytics:
Fatal Exception: java.lang.NullPointerException: Attempt to invoke virtual method 'void f2.d.c()' on a null object reference at io.flutter.plugins.googlemaps.GoogleMapController.lambda$destroyMapViewIfNecessary$0(GoogleMapController.java:2) at io.flutter.plugins.googlemaps.GoogleMapController.$r8$lambda$z5trzuhb4pJJGItyEPmCMiQOtYA(GoogleMapController.java) at io.flutter.plugins.googlemaps.GoogleMapController$$InternalSyntheticLambda$0$c5f87306e4672703bfe381ea0098f633d73403dda6cabbd160a60da2683611a8$0.run(GoogleMapController.java:2) at io.flutter.plugins.googlemaps.GoogleMapController$2.doFrame(GoogleMapController.java:2) at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1035) at android.view.Choreographer.doCallbacks(Choreographer.java:845) at android.view.Choreographer.doFrame(Choreographer.java:775) at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1022) at android.os.Handler.handleCallback(Handler.java:938) at android.os.Handler.dispatchMessage(Handler.java:99) at android.os.Looper.loopOnce(Looper.java:201) at android.os.Looper.loop(Looper.java:288) at android.app.ActivityThread.main(ActivityThread.java:7842) at java.lang.reflect.Method.invoke(Method.java) at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)
In what native implementation? We don't know what the cause of the bug is yet, or where exactly it's coming from. There's a sentence in the PR description that suggests that this is a bug in the maps SDK itself, but according to comments in the issue this only happens in Flutter 3.0.
It has been confirmed that were are dealing with a bug within the native implementation indeed.
It has been confirmed that were are dealing with a bug within the native implementation indeed.
I've followed up on this issue internally, and that's not actually the case; the verification is that it was reproducible, not that it's in Google Maps itself. It may well be an issue in the platform view implementation in Flutter (which would make sense given the timing).
@stuartmorgan are you sure they testing their component with Flutter? That would make no sense at all. Follow ups on the issue indicate they have over-worked the renderer and thus it is fixed in the trunk, but the new renderer logic is not production-ready.
It looks at this point like there are probably two issues which result in very similar stack traces (as both are attempting to use the map after it's been destroyed):
- An ordering bug with some async calls in the Android platform view implementation, which we're investigating; if that's the case, this would explain why there's a spike of issues in 3.0.
- An internal maps threading issue in the (non-new) renderer.
We can proceed with a delayed destruction workaround for the latter, along the lines of what this PR is doing. However, I don't think using postFrameCallback actually makes sense, because the Maps internal rendering thread and the Flutter rendering thread are unrelated, so delaying by a Flutter frame isn't meaningful. It's not that it works because delaying one frame solves the problem, but that it sometimes works because it's delaying some amount of time, and sometimes that's enough. (I.e., exactly the same problem that the code this is based on also has.) Instead we should explicitly delay some small amount of time that we pick, not whatever the Flutter frame time happens to be, and clearly document that it's a workaround.
This will also need a native unit test with a test-only accessor that validates that the maps destruction doesn't happen synchronously, but then does happen after a short delay.
@lukaskurz Are you planning on updating this PR per the comment above?
@lukaskurz Are you planning on updating this PR per the comment above?
I was (still am somewhat) held up by some other stuff, but I would like to see this fixed, since I can still see some related alerts (albeit less) in my sentry dashboard. So maybe I can make some time, to finally get rid of this issue.
So the changes needed are a fixed amount time-delay and that unit test right ? First time writing a native unit test for android, so I might have to read up on that, before I can write that.
So the changes needed are a fixed amount time-delay and that unit test right ?
We will need both of those, but also after looking into this a bit more we will also want to gate this on the renderer being used, so that as the new renderer rolls out we aren't applying it in cases where it's not needed.
It looks like we can query it by doing https://developers.google.com/maps/documentation/android-sdk/renderer#how_to_try_the_new_renderer with null as the preferredRender (per the API docs) so we get the callback about what mode is being used.
@stuartmorgan As you requested, I added a check to ensure this workaround is only used with Renderer LEGACY and not with LATEST. I changed the workaround to a use a set delay of 100ms. Kind of arbitrary, but I thought this length of time would give us 1-2 frames delay at a framerate of 20fps, so that even for the slower devices/apps this should work (unless framerate drops to 10fps or some jank happens).
I also added some unit tests to ensure the intended workaround behaviour. I had to make some changes and extract some variables so that they can be mocked, otherwise the unit tests would not work.
@lukaskurz thanks for your work on this. I am using your fix in my app and it worked until Flutter version 3.0.5. However, since Flutter version 3.3.0 my App crashed again. So maybe the fix doesn't work in this version or there is a new bug with a similar behaviour. I also tried the fix together with the gmaps_crash_demo you provided and version 3.3.0 of flutter and was able to trigger a crash there as well (randomly switching between the tabs).
This is the error that was thrown:
E/AndroidRuntime(6544): FATAL EXCEPTION: GLThread 40315
E/AndroidRuntime( 6544): Process: com.example.gmaps_crash_demo, PID: 6544
E/AndroidRuntime( 6544): java.lang.NullPointerException: Attempt to get length of null array
E/AndroidRuntime( 6544): at java.nio.ByteBufferAsIntBuffer.put(ByteBufferAsIntBuffer.java:122)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.gl.buffer.n.i(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):2)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.gl.buffer.n.d(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):3)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.gl.drawable.d.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):2)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.gl.drawable.ao.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):12)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.bz.s(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):29)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.bs.b(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):151)
E/AndroidRuntime( 6544): at com.google.maps.api.android.lib6.gmm6.vector.an.run(:com.google.android.gms.dynamite_mapsdynamite@[email protected] (040408-0):48)
As said, I don't know if this is due to changes in flutter or maybe a new bug with similar behavior. If there is anything more I can do to help, please let me know.
I have similar issue. Is there any workaround? After upgrading to Flutter 3 this issue appears now from time to time. We are blocked with release through this issue
@ldemyanenko I switched away from the old renderer as proposed in https://github.com/flutter/flutter/issues/105965#issuecomment-1224473127 and have not experienced a crash yet with Flutter 3.3
I am going to close the PR for now, since the old fix no longer works on the new flutter version and I don't have the time to finish developing this.