android icon indicating copy to clipboard operation
android copied to clipboard

[Marking Mode: None]: Crash: Attempt to use cleared object reference

Open daveware-nv opened this issue 7 years ago • 61 comments
trafficstars

Please, provide the details below:

Via crashlytics we have been getting reports of crashes of the type com.tns.NativeScriptException: Attempt to use cleared object reference id=48048(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048 com.tns.Runtime.getJavaObjectByID(Runtime.java:893)

Did you verify this is a real problem by searching Stack Overflow and the other open issues in this repo?

Other closed bugs reference the same or similar issues (latest was closed for 3.0.0).

Please provide the following version numbers that your issue occurs with:

  • CLI: 3.2.1
  • Cross-platform modules: 3.2.0
  • Runtime(s): 3.2.0
  • Plugin(s):
    • nativescript-angular: 4.4.1
    • nativescript-fabric: 0.3.1
    • nativescript-pro-ui: 3.2.0
    • nativescript-google-maps-sdk: 2.3.3

Built using webpack (npm run ns-bundle --android --build-app --snapshot --clean --release) Also using in app/package.json:

  "android": {
    "v8Flags": "--expose_gc",
    "markingMode": "none"
  }

Did the error happen while the app was being constructed? (buildtime error)

No

Did the error happen while the app was executing? (runtime error)

Yes, but I haven't reproduced it myself. Just numerous similar stack traces being sent via crashlytics/fabric.

e.g.

#0. Crashed: main: 0 0 0x0000000000000000
       at .Error: com.tns.NativeScriptException: Attempt to use cleared object reference id=48048(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048
    com.tns.Runtime.getJavaObjectByID(Runtime.java:893)
    com.tns.Runtime.callJSMethodNative(Native Method)
    com.tns.Runtime.dispatchCallJSMethodNative(Runtime.java:1084)
    com.tns.Runtime.callJSMethodImpl(Runtime.java:966)
    com.tns.Runtime.callJSMethod(Runtime.java:953)
    com.tns.Runtime.callJSMethod(Runtime.java:937)
    com.tns.Runtime.callJSMethod(Runtime.java:929)
    com.tns.gen.java.lang.Runnable.run(Runnable.java:10)
    android.os.Handler.handleCallback(Handler.java:751)
    android.os.Handler.dispatchMessage(Handler.java:95)
    android.os.Looper.loop(Looper.java:154)
    android.app.ActivityThread.main(ActivityThread.java:6780)
    java.lang.reflect.Method.invoke(Native Method)
    com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1496)
    com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1386):1)
       at . ... (repeats with slight differences)

full trace at https://pastebin.com/JEGTYrpx

Please tell us how to recreate the issue in as much detail as possible.

Unknown. Possibly random.

daveware-nv avatar Nov 28 '17 21:11 daveware-nv

Hi @daveware-nv, we're going to be able to better assist you if you provide steps to reproduce, with the minimal code required or a small repo for us to check out. Having said that, can you give us the content of your projects' package.json so we can check out the plugins you're using. The problem occurs when a javascript object with a native counterpart isn't taken care of by the developer, since marking mode=none doesn't check which js objects have a native counterpart and should survive GC and which don't.

Plamen5kov avatar Nov 29 '17 12:11 Plamen5kov

As mentioned I don't know when it occurs as I haven't observed it myself, nor as a result do I know what the minimal code required to reproduce would be.

snippets from the package.json:

  "dependencies": {
    "nativescript-angular": "~4.4.1",
    "nativescript-angular-snapshot": "3.0.0-5.5.372.32",
    "nativescript-couchbase": "custom-see-below",
    "nativescript-fabric": "^0.3.0",
    "nativescript-google-maps-sdk": "^2.3.2",
    "nativescript-pro-ui": "~3.2.0",
    "nativescript-svg": "^1.1.3",
    "nativescript-theme-core": "~1.0.2"
},
"devDependencies": {
    "nativescript-css-loader": "~0.26.0",
    "nativescript-dev-android-snapshot": "0.*.*",
    "nativescript-dev-typescript": "~0.4.0",
    "nativescript-dev-webpack": "^0.8.0",
    "nativescript-worker-loader": "~0.8.1",
}

notes:

  • Despite nativescript-svg being present I don't think we're actually actively using it.
  • For nativescript-couchbase we're using a custom version derived from 1.0.17
  • We also have a purely custom plugin for doing some bluetooth interaction
  • There's probably also direct references to android bits here and there in the application itself, including producing some uuid's using java.util.UUID.randomUUID().toString().

Is there some detailed documentation on what 'taken care of by the developer' means in you above post? The only reference I've seen to markingMode: none is the 3.2 announcement blog post and it wasn't detailed enough on what is required to prevent markingMode: none causing problems.

We have a lot of data going into and out of couchbase, so depending on what care needs to be taken with which kinds of java objects that may be the first place for me to look once I know what to look for.

daveware-nv avatar Nov 29 '17 20:11 daveware-nv

@daveware-nv

...once I know what to look for

The simplest way for me to explain what to look for. Consider this scenario in your code or in some native plugin you're using (there's a big chance the problem is in the plugin, as the currently available plugins aren't meant to be used in conjunction with markingMode: none.

problem

var implementor = new android.native.Implementor()
var callback = new android.native.NCallback({
	getMessage: function() {
		implementor.getMessage()
	}
})
android.native.Executor.printWithDelay(callback, 3s)

pseudo code

  • android.native.Implementor : native class
  • android.native.NCallback : native interface

The implementor is enclosed by the callback implementation, but with markingMode: none enabled we no longer take care of finding that connection out. So when GC happens implementor instance is GC`ed.

solution to fix the previous example you must keep implementor from being GC'ed, by attaching it to the global object which is not being GC'ed ever, or some other long-lasting object, exported module or wherever you see fit. (implementor is attached to global is just for the sake of the example).

var implementor = new native.Implementor()
global.implementor = implementor
var callback = new native.NCallback({
	getMessage: function() {
              global.implementor.getMessage()
	}
})
native.Executor.printWithDelay(callback, 3s)

pseudo code

summary Every native object instantiated in JS: var implementor = new android.native.Implementor() which is enclosed by a function of an extended class, or enclosed up it's prototype chain:

...
	getMessage: function() {
		implementor.getMessage()
	}
...

needs to be managed by the developer. You should decide when it should be not used anymore and can be let go for GC.

Plamen5kov avatar Nov 30 '17 11:11 Plamen5kov

We haven't got a blog post about it because it's an experimental flag, and we're still contemplating on trying it out. We wanted to check out what kinds of problems developers would encounter, so we get a better understanding of the user scenarios.

Plamen5kov avatar Nov 30 '17 11:11 Plamen5kov

Thanks. If that is the extent of it, then it should be less of a problem to track things down in my code than I expected. Less easy are nativescript-pro-ui since it does not have source available. And nativescript-angular since my app is an angular one this will be used heavily by every screen (and a quick look though the stacktrace I posted above reveals EmulatedRenderer.NativeScriptRenderer.insertBefore, which I believe corresponds to this line https://github.com/NativeScript/nativescript-angular/blob/42e438ce4b206ff96fb2fe31e9968fcfc555f5a6/nativescript-angular/renderer.ts#L101)

Unfortunately markingMode: none makes a huge difference. Otherwise I would have never deployed with it in the first place, and could simply turn it off now. The random multi-second (and sometimes minutes) long UI freezes did not make for a good user experience.

daveware-nv avatar Nov 30 '17 20:11 daveware-nv

@daveware-nv, did you manage to fix your issue or are you still trying to figure out where the problem is?

Plamen5kov avatar Dec 04 '17 16:12 Plamen5kov

Unfortunately I haven't had time to complete my sweep yet. And I probably wont know for sure whether any changes made fixed everything until we make another release, since the crashing hasn't really shown up during test usage.

I have established some candidates to clean up however:

  • There's one that looks especially fishy in the couchbase plugin.
  • I've got some nativescript -> java interface implementations in my bluetooth plugin. But it doesn't reference java objects that aren't passed as parameters (are JS objects fine here? I'd assume at he very least the error would be different).
  • There's also some java objects being returned from the bluetooth plugin that I'll have to trace the path of more thoroughly.
  • I still have yet to look at the google maps plugin.

And of course then there is nativescript-angular and nativescript-pro-ui, which I wouldn't know where to start looking, do you know if these have been vetted similar to tns-core-modules?

daveware-nv avatar Dec 04 '17 20:12 daveware-nv

And of course then there is nativescript-angular and nativescript-pro-ui, which I wouldn't know where to start looking,

Both plugins are usable with the current version of the android runtime, but they are not compatible with markingMode: none at least not yet.

I've got some nativescript -> java interface implementations in my bluetooth plugin. But it doesn't reference java objects that aren't passed as parameters (are JS objects fine here? I'd assume at he very least the error would be different).

It doesn't really matter if the javascript objects with the native counterparts are passed as a parameter or not, except for their life duration. Unfortunately, the error you saw, will be the same no matter if the parameters are passed as arguments to a function or not.

Plamen5kov avatar Dec 05 '17 16:12 Plamen5kov

Both plugins are usable with the current version of the android runtime, but they are not compatible with markingMode: none at least not yet.

That is unfortunate, that probably makes my efforts (and yours) here moot for the time being. So I'll probably have to advise the project owner that markingMode: none isn't supported for our applications dependencies, and see if they want to persist here or not. Do you know what the timeframe (if any) for supporting markingMode: none for those plugins is?

It doesn't really matter if the javascript objects with the native counterparts are passed as a parameter or not, except for their life duration. Unfortunately, the error you saw, will be the same no matter if the parameters are passed as arguments to a function or not.

I was mostly implying that since they are passed as parameters there must be a reference to them somewhere else (not just the method where I'm defining my interface implementation) on the stack, and hence where responsibility for managing the references to the native object are. The other thing I was asking was whether pure javascript objects can also get GC'd early across this boundary?

daveware-nv avatar Dec 05 '17 21:12 daveware-nv

Do you know what the timeframe (if any) for supporting markingMode: none for those plugins is?

We've not settled on using markingMode:none by default yet, because we're still trying out GC algorythms that wouldn't ask the developer to take care of the objects' lifetime and still be more performant than what we currently have. Unfortunately I can't give you a timeframe, because it's a very fundamental matter and needs a measure of trial and error.

The other thing I was asking was whether pure javascript objects can also get GC'd early across this boundary?

Pure js objects are dealt with entirely by V8's GC. If V8 decides it's time to GC an object, it does usually because it can't reach it in the object graph. I haven't seen cases where the pure js object is being collected early, thus causing errors. This happens only for js objects that have native counterparts like var nativeButtonInstance = new android.widget.Button() because, in the ideal scenario, the runtime needs to take care of their garbage collection.

Plamen5kov avatar Dec 06 '17 07:12 Plamen5kov

I am getting a similar error on random occasions.

Setup:

"dependencies": { "nativescript": "^3.4.0", "nativescript-angular": "^5.0.0", "nativescript-drop-down": "^3.2.0", "nativescript-local-notifications": "^2.0.2", "nativescript-plugin-firebase": "^5.0.5", "nativescript-pro-ui": "^3.2.0", "nativescript-theme-core": "~1.0.2", "nativescript-toast": "^1.4.6", "reflect-metadata": "~0.1.8", "rxjs": "^5.5.6", "tns-core-modules": "^3.4.0", "zone.js": "^0.8.18" }, "devDependencies": { "babel-traverse": "^6.26.0", "babel-types": "^6.26.0", "babylon": "^6.18.0", "lazy": "^1.0.11", "nativescript-dev-typescript": "^0.6.0", "typescript": "^2.6.2" }

Error:

com.tns.NativeScriptException: Attempt to use cleared object reference id=2263 at com.tns.Runtime.getJavaObjectByID(runtime.java:897) at com.tns.Runtime.callJSMethodNative(Native Method) at com.tns.Runtime.dispatchCallJSMethodNative(runtime.java:1088) at com.tns.Runtime.callJSMethodImpl(runtime.java:970) at com.tns.Runtime.callJSMethod(runtime.java:957) at com.tns.Runtime.callJSMethod(runtime.java:941) at com.tns.Runtime.callJSMethod(runtime.java:933) at com.tns.gen.com.telerik.widget.list.ListViewAdapter_frnal_ts_helpers_l58_c38__ListViewAdapter.onCreateViewHolder(com.telerik.widget.list.listviewadapter.java) at com.telerik.widget.list.ListViewWrapperAdapter.onCreateViewHolder(listviewwrapperadapter.java:459) at com.telerik.widget.list.ListViewWrapperAdapter.onCreateViewHolder(listviewwrapperadapter.java:20) ... at org.nativescript.widgets.CommonLayoutParams.measureChild(commonlayoutparams.java:262) at org.nativescript.widgets.MeasureHelper.measureChildFixedColumnsAndRows(gridlayout.java:1055) at org.nativescript.widgets.MeasureHelper.measure(gridlayout.java:865) at org.nativescript.widgets.GridLayout.onMeasure(gridlayout.java:279) at android.view.View.measure(view.java:21121) at org.nativescript.widgets.CommonLayoutParams.measureChild(commonlayoutparams.java:262) at org.nativescript.widgets.ContentLayout.onMeasure(contentlayout.java:32)

yyankov avatar Dec 29 '17 23:12 yyankov

Hi @yyankov thank you for your feedback. As I previously explained this error is caused by the fact your business logic encloses a native object, that's already collected. To put it more simply, you need to find the js object that's cleared and persist it in javascript in some of the ways I mentioned previously.

When I look at the error you posted, I can point you in the right direction: ListViewAdapter.onCreate it's probably an object enclosed by this function, which has a javascript counterpart.

Plamen5kov avatar Jan 04 '18 11:01 Plamen5kov

Closing due to lack of activity.

Plamen5kov avatar Mar 05 '18 13:03 Plamen5kov

@Plamen5kov - I'm not sure this should be closed; as the issue still exists.

And by the way, is in the onCreateViewHolder of the ListViewWrapperAdapter, and that actually is in the Nativescript-Pro-UI --- not his code... So in this case the error is 100% a Telerik/Progress issue; not something he can easily fix...

NathanaelA avatar Mar 05 '18 16:03 NathanaelA

@NathanaelA It is also (most likely) due to "markingMode": "none", which is not really a supported feature. So I would say whether or not this should be left open should be related to what the future plans are regarding "markingMode": "none"

daveware-nv avatar Mar 05 '18 19:03 daveware-nv

@daveware-nv -- Well, I tend to think it should stay open as long as the bug exists. Just because markingmode:"none" causes it to happen more often doesn't mean that the normal GC doesn't cause it to happen in other timing. I still get reports of random GC caused crashes; it isn't happening as much as it was; but once a month for a low use app is still too much.

In addition since markingmode:none hasn't been depreciated (but still in consideration(?) for a solid release); then this bug has should at a minimum stay open until it is fixed or markingmode:none is removed from the engine.

NathanaelA avatar Mar 05 '18 20:03 NathanaelA

@NathanaelA Just because markingmode:"none" causes it to happen more often doesn't mean that the normal GC doesn't cause it to happen in other timing.

That's incorrect. When marking mode is set, it changes the way GC works, so any related GC issues will be completely different.


And by the way, is in the onCreateViewHolder of the ListViewWrapperAdapter, and that actually is in the Nativescript-Pro-UI --- not his code... So in this case the error is 100% a Telerik/Progress issue; not something he can easily fix...

It's really unfortunate that you found the problem in nativescript-pro-ui, if that's really where the problem resides, but we haven't planned to make the plugins compatible with markingMode: "none", at least not yet since it's only an experimental feature, and it should be respected as such as @daveware-nv already mentioned.


In addition since markingmode:none hasn't been depreciated (but still in consideration(?) for a solid release); then this bug has should at a minimum stay open until it is fixed or markingmode:none is removed from the engine.

I closed the issue, just as I said "because of the lack in activity". We'd be happy to reopen it, should the need occur. For now it hasn't, because there's no new development on our side, nor on the current issue.


@NathanaelA If you figured out, something we haven't please share your knowledge or better yet do a PR, as you're knowledgeable enough to provide assistance to the plugin authors.

Plamen5kov avatar Mar 06 '18 09:03 Plamen5kov

@NathanaelA

Just because markingmode:"none" causes it to happen more often doesn't mean that the normal GC doesn't cause it to happen in other timing.

That's incorrect. When marking mode is set, it changes the way GC works, so any related GC issues will be completely different.

Actually I would tend to disagree with you; I do realize the M:N totally changes the way the GC works; but if something is being collected too soon under M:N; that doesn't mean under the normal GC the issue retaining the value may also still exists and the normal GC might collect it too soon also under different memory loads. If their isn't a proper ref to hold it; then it can be GC'd under both engines. This was my only point; without knowing why M:N collected it too soon, doesn't mean that the normal GC won't collect it too soon also. I still see reports weekly on the Attempt to use cleared object reference id in the normal GC engine; so unfortunately there are still premature GC collection issues in the normal GC engine also...


I closed the issue, just as I said "because of the lack in activity".

Closed to me says nothing is going to happen with it or invalid report.

See this is where I totally disagree with this stance, you are closing a VALID bug report w/o anything done on it. I know it looks good to management; but it is imho totally wrong stance. This issue isn't magically resolved by you closing it; and the bug is 100% valid issue UNTIL M:N is either depreciated and removed or the bug itself is fixed. Since M:N is still a valid experimental feature; the valid bugs that go with it should also be retained ;-)

NathanaelA avatar Mar 06 '18 21:03 NathanaelA

I'm reopening the issue as the problem is real and is not resolved. Our plans for after the 4.0 release are to focus on making plugins maintained by NativeScript compatible with markingMode: none. Cases like this one will be useful for us to validate whether we have covered all cases.

etabakov avatar Mar 19 '18 12:03 etabakov

Now that it's post 4.0, it would be really great news if you were working on the markingMode changes for the plug-ins. NS is an excellent system, well designed and with a great developer experience. Unfortunately the current choice between slowness (markingMode not set) and crashes (markingMode set to none) makes it difficult to ship on Android.

gsmedley avatar Apr 04 '18 10:04 gsmedley

Indeed, this is one of the things we will focus on for 4.1.

etabakov avatar Apr 04 '18 11:04 etabakov

Hi everybody, @gsmedley @etabakov @NathanaelA

There is a class of "Attempt to use cleared object reference" errors that is thrown out there, here is some insight on the design of markingMode: "none".

  1. JavaScript instances as long as they are alive (e. g. can not be GCed) will keep strong references to the Java instances.
  2. When JavaScript instance becomes eligible for collection, it will change the strong reference to a weak reference to the Java instance. From that point on the JavaScript instance will be "revived" as long as the Java instance is not collected.
  3. Only after the Java instance is collected, the JavaScript instance will be let to be collected.

This means that when markingMode: "none" is enabled the JavaScript instances will always live longer than the Java instances. Also as long as something holds the JavaScript instances alive (on stack, on global, on the module exports, or a function that is exported captures stuff in its closure etc.) the Java instances will survive.

So any error that happen, related with the Java instance being prematurely collected, should be due to JavaScript trying to call to Java. For example if you call method on an object, and that object was already collected, you should get a JavaScript callstack, and fixing errors having this callstack should be easy. The error should be similar to the following screenshot: image So once you get that, you as developer should pretty much figure out of the JavaScript callstack which instance was collected too early, go up the stuck (as much as possible) and retain somehow any listeners, or java objects that doesn't seem to have their JavaScript instances retained long enough. (we need proper article on this one)

On the other hand the error report for failed methods seem fine (you have JavaScript stack) but the:

#0. Crashed: main: 0 0 0x0000000000000000
       at .Error: com.tns.NativeScriptException: Attempt to use cleared object reference id=48048(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048
    com.tns.Runtime.getJavaObjectByID(Runtime.java:893)

kind of errors are total no-go.

The {N} Android team should probably figure out where these Attempt to use cleared object reference are thrown: https://github.com/NativeScript/android-runtime/blob/f88bfe069fd07a7ae561978ab4d97737cfc01603/test-app/runtime/src/main/java/com/tns/Runtime.java#L897

Figure out this getJavaObjectByID is only used from C++: https://github.com/NativeScript/android-runtime/blob/7897e2dc5a4b0a819ce1ce549bb86d6d6e2c40e3/test-app/runtime/src/main/cpp/ObjectManager.cpp#L30

And that's why the Java callstack above seems lacking enough information. Following the "used by" chain you end up in NewWeakGlobalRefCallback then in:

https://github.com/NativeScript/android-runtime/blob/7897e2dc5a4b0a819ce1ce549bb86d6d6e2c40e3/test-app/runtime/src/main/cpp/ObjectManager.cpp#L25

The cache is initialized with this, and this cache on the other hand is used only in the ObjectManager GetJavaObjectByID:

https://github.com/NativeScript/android-runtime/blob/7897e2dc5a4b0a819ce1ce549bb86d6d6e2c40e3/test-app/runtime/src/main/cpp/ObjectManager.cpp#L134

Which is then used by GetJavaObjectByJsObject: https://github.com/NativeScript/android-runtime/blob/7897e2dc5a4b0a819ce1ce549bb86d6d6e2c40e3/test-app/runtime/src/main/cpp/ObjectManager.cpp#L79

GetJavaObjectByJsObject is called on places where the runtime is marshalling arguments for transition from JavaScript to Java. The errors for figuring out the "this" in a method call is well handled (like the screenshot above) but then, GetJavaObjectByJsObject is also used in:

https://github.com/NativeScript/android-runtime/search?q=GetJavaObjectByJsObject&type=Code

  • FieldAccessor.cpp
  • ArrayBufferHelper.cpp
  • ArrayElementAccessor.cpp
  • JsArgConverter.cpp
  • etc.

Every single place where it is used, we have isolate that is available and a JavaScript callstack should be available somewhere there, but the generated NativeScriptException seems to hide it.

With that said (if you choose to proceed with markingMode: none), the android runtime will have to do some advancements in the error reporting here. Since the errors will be random and hard to reproduce the error messages should be enough to locate and fix potential problems. These errors should be collectible by error reporting services (android crash reports? firebase?). Only when this is done we will be able to provide the "how to fix".

pana-cc avatar Apr 05 '18 10:04 pana-cc

The full log has javascript traces: StackLayout.ViewBase._addViewCore (<embedded>:57782:18) at Label.ViewBase._setupUI (<embedded>:57858:49)(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048 at ViewUtil.insertToLayout (<embedded>:54934:20)(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048

pana-cc avatar Apr 05 '18 13:04 pana-cc

There is a pattern: at . com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1386)(com.tns.NativeScriptException: Attempt to use cleared object reference id=48048 If you collect the messages between at . and (com.tns.NativeScriptException ... it looks like:

Error: com.tns.NativeScriptException: Attempt to use cleared object reference id=48048
    com.tns.Runtime.getJavaObjectByID(Runtime.java:893)
    com.tns.Runtime.callJSMethodNative(Native Method)
    com.tns.Runtime.dispatchCallJSMethodNative(Runtime.java:1084)
    com.tns.Runtime.callJSMethodImpl(Runtime.java:966)
    com.tns.Runtime.callJSMethod(Runtime.java:953)
    com.tns.Runtime.callJSMethod(Runtime.java:937)
    com.tns.Runtime.callJSMethod(Runtime.java:929)
    com.tns.gen.java.lang.Runnable.run(Runnable.java:10)
    android.os.Handler.handleCallback(Handler.java:751)
    android.os.Handler.dispatchMessage(Handler.java:95)
    android.os.Looper.loop(Looper.java:154)
    android.app.ActivityThread.main(ActivityThread.java:6780)
    java.lang.reflect.Method.invoke(Native Method)
    com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1496)
    com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1386)

    at Label.ViewBase._setupUI (<embedded>:57858:49)
    at StackLayout.ViewBase._addViewCore (<embedded>:57782:18)
    at StackLayout.ViewBase._addView (<embedded>:57760:14)
    at StackLayout.LayoutBaseCommon.addChild (<embedded>:65654:14)
    at ViewUtil.insertToLayout (<embedded>:54934:20)
    at ViewUtil.addToVisualTree (<embedded>:54915:18)
    at ViewUtil.insertChild (<embedded>:54883:18)
    at EmulatedRenderer.NativeScriptRenderer.insertBefore (file:///data/data/org.nativescript.redacted/files/app/bundle.js:45438:23)
    at execRenderNodeAction (<embedded>:9407:22)
    at visitRenderNode (<embedded>:9378:13)

pana-cc avatar Apr 05 '18 16:04 pana-cc

Longer stack traces?

pana-cc avatar Apr 06 '18 06:04 pana-cc

This is relevant, from the comments in the https://panayotcankov.github.io/nativescript-3.2.0-memory-management/

@gsmedley Thanks for this clear and detailed description of NS memory management. Do you have any advice for tracking a "can't find object with id" issue?

@pana-cc Are you using markingMode: "none" in app/package.json?

@gsmedley Yes. I don't think there is any choice really, due to the page navigation delays without it. I'm in a couple of discussions about this on github and I've seen that there is an effort to make plugins compatible with markingMode: "none" in 4.1. In the meantime, I would like to find and fix as many of these issues as possible myself, but the stack traces I get are not really useful. So, I was looking for advice on locating the sources of the problem. The issue I'm using to report this is: https://github.com/NativeScript/android-runtime/issues/986 Thanks

@pana-cc It is complicated, if we proceed with the execution, all errors should include piece of JavaScript callstack. The markingMode: none will keep all Java instances strong, as long as the JavaScript instances are not collectable by GC. After a JavaScript instance is collectable though, it will "revive" the JavaScript instance until the Java instance is collected. This ensures that transitions from Java to JavaScript are always possible, and crashes due to prematurely collected Java instances will always happen by executing piece of JavaScript. How do you collect your error traces? They don't have any JavaScript callstacks. I have attached an image with an example error that happens due to Java instance prematurely collected, you can se the JS trace. It is possible that there are paths that throw these "Attempt to use cleared object reference" that are not handled well, we will have to work on better error reporting for them before we can execute the markingMode: none. For example if there is any JavaScript callstack associated we should be able to display it too along the Java stack.

@gsmedley Thanks for this. I am getting better stack traces now, and I've posted one to the issue above. I haven't seen that particular stack trace yet. The ones I get are "com.tns.NativeScriptException: Attempt to use cleared object reference id" and "com.tns.NativeScriptException: No weak reference found. Attempt to use cleared object reference id=".

@gsmedley Just so I'm clear about the risks of using "markingMode: none" I have a plug-in for scanning an id using the camera. I allocate a java object and pass a callback into it:

var scl = new com.kanayo.idscanner.ScannerListener(scanListener)
let scanner = new com.kanayo.idscanner.IdScanner()
scanner.startScan( application.android.foregroundActivity, view, scl );
return scanner;

Is it possible that the com.kanayo.idscanner.IdScanner object will be collected prematurely? How would I prevent it?

@pana-cc Yes, absolutely! The JavaScript object holding ScannerListener is no longer referenced from JavaScript as long as you return the scanner, so the JavaScript instance may be collected by V8, and then anything kept in scanListener's properties or closures will be eligible for collection. There are pretty much 3 cases I've met so far:

  1. Global listeners that are assigned and expected to be alive for the lifetime of the application. (Like listening for push notifications) These listeners are best kept in module variables, and these variables should be accessed by exported member of the modules. (this has room for improvement, it would have been better to keep these on the module., but currently the runtimes hold only the exports object alive).
  2. {N} View classes - any native views or listeners should be kept in private properties on your View class so the lifetime of its native objects will be tied to the JavaScript View instance.
  3. Short living listeners (like the one in your case). When you create such listeners you have to store them in a Map or Set that is kept alive in a variable in the module and this variable is accessed in exported module member. Then when the listeners fire (like the setTimeout implementation), or if they are no longer observed (you can create observable and override add/remove listener, so when javascript event listener is added you can create the native listener and when all JavaScript listeners are remove - remove the native listener form the Map/Set), or are explicitly stopped (like the implementation of setTimeout/clearTimeout) - unsubscribe the native listeners and clear them from the map. I can give you further directions if you share the public API of this scanner module. Does it export an observable or perhaps a start() stop() pair?

@gsmedley Thank you - this is really great information! The scanner's Java API, it's Android-only, looks like this:

public interface ScannerListener {
void success( String scannedId );
void error( String errorMessage );
}
public class IdScanner {
public void startScan(Activity activity, View view, 
ScannerListener listener,
String candidateExpression, String verifyExpression, 
String checkSumClassName,
String direction, String flashMode );
public void stopScan();
public String changeCameraDirection();
public String toggleFlash();
}

You call startScan() - it shows the camera in the view, then calls back when it scans an id. From there you call stopScan(). We have a couple of buttons on the page to toggle the flash and switch the camera and cancel the scan. To protect the ScannerListener JS objects we could put them in an array that is exported from the module, then return the array index to the caller. When the caller calls stopScan() it should have the index as a parameter so we know which listener to free up. Does that sound right? Would it work to return the scanner and the listener together, like: return { scanner: scanner, listener: scl } Thanks again for all your help with this.

@pana-cc Great, pretty much that's it. One last thing, simply returning the scanner and listener will move the responsibility for the lifetime of these from the module developer to the application developer. Plugin developers must avoid this. In your particular case it would be nice to create and return a good object model when "start" is called, it can have the scanned and listener in private fields, expose stop, changeCameraDirection, toggleFlash etc. but also keep that returned object in array in the module until its "stop" is called, where the listener and scanner are stopped and safely disposed. That way app developers don't have to worry about object lifetimes and will continue writing JavaScript care free the way they were used to, limiting the impact of markingMode: none to plugin developers only.

@gsmedley Thanks so much for the advice! I've converted our plugins to follow this model and have not seen a "can't find object with id" crash yet. I've been running monkey tests pretty much non-stop, but haven't gotten past 2 days continuous running due to a couple of crashes. There is one very odd crash we are seeing in the RadChartView and I'm wondering if it is related to your point #2 "({N} View classes)" above. After running thousands of the same graphs there is suddenly a ClassCastException in code that for which that looks impossible. It's this issue: https://disq.us/url?url=https%3A%2F%2Fgithub.com%2Ftelerik%2Fnativescript-ui-feedback%2Fissues%2F618%3AekR4W2g5TsgRCwqDs6cc3sNsvw8&cuid=5193611

@pana-cc I don't have access to the pro-ui code at the moment, I have to reset some two way authentication tokens to get it. But the error seem to be thrown in pure Java code, something like wrong type cast. I've talked with some people internally to rise that issue's priority.

@gsmedley Thanks so much for doing that! At the moment this is our only blocker. With the graphs excluded we've been running monkey tests for 4 continuous days without a crash.

pana-cc avatar Apr 17 '18 06:04 pana-cc

After 5.1 all plugins owned by the NativeScript Team should be compatible with the markingMode: none option in their latest versions. I see that the app in question is using the nativescript-pro-ui, so upgrading to the latest packages of the nativescript-ui-* might help partially for this issue as well.

etabakov avatar Dec 14 '18 15:12 etabakov

We're still seeing Attempt to use cleared object reference errors, quite often. Has any work been done to improve the error message? It's hard helping you guys when we have no pointers to the cause.

This should be a priority IMHO.

larssn avatar Jan 16 '19 10:01 larssn

@larssn Which plugins are you using?

gsmedley avatar Jan 16 '19 11:01 gsmedley

These:

nativescript-angular
nativescript-feedback
nativescript-iqkeyboardmanager
nativescript-orientation
nativescript-phone
nativescript-plugin-firebase
nativescript-theme-core
nativescript-ui-listview
nativescript-ui-sidedrawer
nativescript-webview-interface

Just had one when we rotated the screen. Generally they all seem periodic in nature.

larssn avatar Jan 16 '19 12:01 larssn