kotlinx.coroutines icon indicating copy to clipboard operation
kotlinx.coroutines copied to clipboard

NoSuchFieldException with AtomicReferenceFieldUpdater on Samsung Android 5.0.x devices

Open asfdfdfd opened this issue 6 years ago • 52 comments

This is known error.

Some libraries have not done anything (https://github.com/google/gson/issues/924) other ones have implemented workaround (https://github.com/ReactiveX/RxJava/issues/3459).

Couple of stacktraces from my project: com.mirrorai.app_issue_crash_5D5E4CFA0353000119B5FCFF34F93B11_DNE_0_v2.txt com.mirrorai.app_issue_crash_5D61DBF600A40001134DD9BDD9BD8B29_DNE_0_v2.txt

I'm not convincing you to implement workaround like RxJava did but i think that at least it will be convenient to keep this issue to prevent questions from other users of coroutines library.

Proguard is turned off.

Coroutines version is 0.24.0. Kotlin version is 1.2.60.

asfdfdfd avatar Aug 13 '18 16:08 asfdfdfd

Let's discuss how we might approach working around this problem. We don't directly use Atomic*FieldUpdater classes. We rely on https://github.com/kotlin/kotlinx.atomicfu to "include" them via bytecode transformation. It is relatively straightforward for us to create a new transformation variant that changes atomic fields to a regular Atomic* classes. But how do we deliver the corresponding binaries? I see two options:

  1. Deliver a single multi-release jar that have "Samsung-friendly" classes (using only Atomic*) in the main code and VarHandle version in META-INF/version/9 for server-side folks running on JDK 9+.

Pro: Everything works for everybody by default. Con: Anyone running on JDK 8 or anyone targeting non-affected Android >5.0 will get slightly worse performance than they have now(sic!), because of additional dereference in Atomic* classes versus optimized Atomic*FieldUpdater code that we have now.

  1. Deliver a single multi-release jar that have Atomic*FieldUpdater classes in the main code and VarHandle version in META-INF/version/9 and a separate "Samsung-fiendly" jar file with a different classifier.

Pro: By default if you depend on kotlinx-coroutines-core:<version> you get the best possible performance for your platform. Con: If you target Android <= 5.0 with affected devices, you'll have to specify kotlinx-coroutines-core:<version>:samsung-workaround in your dependencies (might need a better name for a classifier).

What do you think?

elizarov avatar Aug 13 '18 16:08 elizarov

Additional question: Anyone knows more details on the conditions under which Samsung devices crash with Atomic*FieldUpdater? What if we make those fields public, for example? Would it help? Are there any details on what exactly is broken on those Samsung Android 5.x devices?

elizarov avatar Aug 13 '18 17:08 elizarov

It seems it happens on Samsung Android 5.0.x, not 5.x. If I'm not mistaken, the issue title should be edited accordingly.

Regardless, I think @digitalbuddha may have some additional information (or not) about this issue, he said he reached out to Samsung.

LouisCAD avatar Aug 13 '18 20:08 LouisCAD

I was never able to get any info from Samsung as to why this issue occurred. Once rxjava applied their fix we never saw issue again

digitalbuddha avatar Aug 13 '18 20:08 digitalbuddha

I strongly dislike the idea of making kotlinx.coroutines performance worse on all devices and JDK < 9 projects just because Samsung f*****d up.

Android app bundles may, probably with Google's collaboration, support different compiled code for different android versions. If this can be done in some way, then apps published with app bundles may use a version free of problematic code for API 21 devices, and use the regular version for all other versions (API 20 and lower, and API 22 and up).

This would work with the second option @elizarov mentioned.

I know this is more complex than the RxJava approach, but it's less compromising.

Another solution is to compile a version of the app with the samsung workaround with a minSdk and a maxSdk of 21, and publish two other versions, one for maxSdk 19 (20 being Android Wear…), and one for minSdk 22. But here again, I think it should be automated using app bundles (which can probably be extended since they use protobuf internally), so we don't have to manage three different version codes.

Where I'm starting to wonder how a samsung api 21 workaround version would work though, is when it comes to libraries. I don't want to have to publish multiple versions of my android libraries that use kotlinx.coroutines because of Samsung, and I would even less want to do it on a non Android library. Would there be a way to avoid transitive dependencies hell when using the samsung workaround?

LouisCAD avatar Aug 13 '18 20:08 LouisCAD

In the case 2, the kotlinx-coroutines-core:<version>:samsung-workaroundis the only viable option for an Android developer

fvasco avatar Aug 13 '18 20:08 fvasco

I continue to vote for doing nothing. This is not a library's problem.

If someone wants to fix this, write an Android Gradle plugin transform that bytecode-rewrites these into the slower format such that you fix all libraries at once and can opt-in and out based on things like minSdkVersion, splits, and bundles.

JakeWharton avatar Aug 13 '18 20:08 JakeWharton

@JakeWharton I think you're right. I just submitted an issue on Android's issue tracker for this fix to be integrated into Android Gradle plugin: https://issuetracker.google.com/issues/112526256

LouisCAD avatar Aug 13 '18 20:08 LouisCAD

That's not quite what I said... AGP already has a pluggable bytecode transformation model so this doesn't need to (and likely won't) be implemented by Google. Anyone can do this who is willing. Samsung should do it, actually.

JakeWharton avatar Aug 13 '18 20:08 JakeWharton

Google certified these devices, and Samsung didn't even respond to Mike when he was working for New York Times.

So "anyone" willing can do it (as long as you are fluent in bytcode among other things), yes. However, I think at the moment, only Google can integrate such a solution targeting only certain devices with app bundles, so other devices (or other API levels at least) don't pay the performance hit.

LouisCAD avatar Aug 13 '18 21:08 LouisCAD

Transforms have access to variant information so again I disagree and hope the Android Gradle team doesn't do it. Variants have the API information which is the best you'll get in terms of targeting. It has nothing to do with app bundles other than that makes it easier to deal with the API splits.

JakeWharton avatar Aug 13 '18 21:08 JakeWharton

hope the Android Gradle team doesn't do it

Why so?

LouisCAD avatar Aug 13 '18 21:08 LouisCAD

How would you make such a bytecode transform targeting only API 21 Samsung devices work in practice without app bundles? I mean, without the burden of having to setup the build manually to produce multiple apks.

And if using an app bundle to separate the dex files having the fix for Samsung API 21 and the ones not needing it, how would it work in practice? I know zero documentation that explains this, and I don't even know if current app bundle spec supports it.

LouisCAD avatar Aug 13 '18 21:08 LouisCAD

It doesn't, which is why I said it wasn't related to bundles. If the variant has a minSdk lower than 23 you fix the bug. That's it. All of this can be done with variants and transforms already. the AGP team has far more impactful work to do. This would have already been fixed in AGP years ago if the problem was pervasive enough. It isn't, and I'm actually helping your case by not fixing libraries which further proves the rarity of occurrence and justifies their ignorance of the problem.

JakeWharton avatar Aug 13 '18 21:08 JakeWharton

It's minSdk lower than 22 actually, this issue affects only Android 5.0.x (API 21) Samsung devices. I added a comment to the issue on Android's issue tracker to request ability to split per API levels for bundle built apps, so anyone can make a gradle plugin fixing this, without making the performance worse for non affected API levels.

LouisCAD avatar Aug 13 '18 21:08 LouisCAD

The problem is that it is hard (and in general impossible) to write a universal bytecode transformer that takes code using Atomic*FieldUpdater classes and rewrites it into Atomic* classes. However, it is not that hard for us as a part of atomicfu project, because we control what kind of code patterns are allowed around atomics, so we can define a new transformation variant and produce a separate version of the jar for those users of kotlinx.coroutines that need to target buggy Samsung 5.0.x devices. Doing nothing is not a great option for us, since we don't want to let our Android users down, so reading this feedback I'm inclining to go with option two.

We can even call the alternative classifier android and setup up dependencies in such a way that when you depend on kotlinx-coroutines-android:<version>, then it transitively brings kotlinx-coroutines-core:<version>:android with it.

Yes, yes, I know that Android did nothing wrong and it is all Samsung's fault, but I don't think that a bit of performance degradation due to replacement of Atomic*FieldUpdater with a separate Atomic* instances would be really noticeable in a typical Android coroutine usage scenarios.

elizarov avatar Aug 14 '18 09:08 elizarov

We can even call the alternative classifier android and setup up dependencies in such a way that when you depend on kotlinx-coroutines-android:<version>, then it transitively brings kotlinx-coroutines-core:<version>:android with it.

I have trouble understanding that part.

Do you mean that we'll be able to choose whether we use the version with the fix or not (depending on whether we support the devices in the build), and it'll automatically edit the transitive dependencies of libraries relying on kotlinx.coroutines?

LouisCAD avatar Aug 14 '18 09:08 LouisCAD

@LouisCAD Editing dependencies of other libraries is tricky. You'll have to manually exclude the dependency this library has and include the dependency you want. That is the downside of "alternative classifier" version. Maybe we can automate this part by writing a small custom Gradle plugin that does this replacement or maybe we can use Gradle metadata feature to do it. We'll need to study it better.

elizarov avatar Aug 14 '18 10:08 elizarov

IIRC, there's something called dependency resolution strategy in gradle that can make this easy enough.

On Tue, Aug 14, 2018, 12:14 PM Roman Elizarov [email protected] wrote:

@LouisCAD https://github.com/LouisCAD Editing dependencies of other libraries is tricky. You'll have to manually exclude the dependency this library has and include the dependency you want. That is the downside of "alternative classifier" version. Maybe we can automate this part by writing a small custom Gradle plugin that does this replacement or maybe we can use Gradle metadata feature to do it. We'll need to study it better.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/Kotlin/kotlinx.coroutines/issues/490#issuecomment-412823498, or mute the thread https://github.com/notifications/unsubscribe-auth/AGpvBWZYwiQXLyf1GvRP956eWMnRaOkdks5uQqLtgaJpZM4V61LM .

LouisCAD avatar Aug 14 '18 10:08 LouisCAD

So, I just got someone trying CameraCoroutines which uses kotlinx.coroutines 0.24.0 on a Samsung Galaxy S5 (SM-G900F) running Android 5.0, and it didn't crash with the error from this report, so it seems it doesn't affect all Samsung devices running Android 5.0.x.

Therefore, we need additional info to know which device models have been seen affected by this bug. @asfdfdfd and @digitalbuddha, could you give affected device models?

LouisCAD avatar Aug 16 '18 11:08 LouisCAD

Same story here. We've found an in house Samsung Android 5.0.x device and we had not managed to make it crash so far.

elizarov avatar Aug 16 '18 11:08 elizarov

This was 3+ years ago. Only one I have reference to is SM-T805Y

digitalbuddha avatar Aug 16 '18 11:08 digitalbuddha

I have no device that i could use to reproduce this crash but i've received it via Crashlytics from these devices: Galaxy S4 Galaxy Note3 Galaxy S5 Galaxy Alpha Galaxy Tab4 10.0

asfdfdfd avatar Aug 16 '18 13:08 asfdfdfd

@asfdfdfd You said you received it on Galaxy S5 devices, but we couldn't reproduce on one running Android 5.0 (model SM-G900F). Since Samsung makes multiple variants of their devices, could you share the exact model numbers?

Also, if you have any clue about which calls to kotlinx.coroutines cause the issue, please tell us.

LouisCAD avatar Aug 16 '18 13:08 LouisCAD

So, I got someone else test the same CameraCoroutines project using kotlinx.coroutines 0.24.0 on a Samsung Galaxy S4 (GT-I9505), running Android 5.0.1, and there was no issue on it, so I'm starting to wonder if all of kotlinx.coroutines is affected, or if only some part of it is, or if it's that Samsung was not consistent in its bug distribution.

If anyone has a clue about what call leads to the failure (whether on the Atomic*FieldUpdater or kotlinx.coroutines library), please tell so a reproducer can be setup.

LouisCAD avatar Aug 16 '18 15:08 LouisCAD

@LouisCAD

could you share the exact model numbers

Crashlytics has not any additional info about devices, but i was able to find something in Google Play Console:

Samsung Galaxy S4 (ks01ltelgt), Android 5.0 Samsung Galaxy S5 (klte), Android 5.0 Samsung Galaxy Note3 (ha3g), Android 5.0 Samsung Galaxy S4 (jflte), Android 5.0

if you have any clue about which calls to kotlinx.coroutines cause the issue

I have absolutely no idea. I've looked at all these places one more time and found nothing suspicious. One of crashes comes from something like:

class MyClass {
    init {
        launch { loadData() }
    }

   private fun loadData {
      // Non-coroutine code here.
   }
}

asfdfdfd avatar Aug 16 '18 17:08 asfdfdfd

I do really think build variants are the way to go as JakeWharton said. There are other issues related to Samsung that I am already using build variants to fix. It's so common in Android to have per API variants, that it is in the docs.

It may be solved using some plugin to do transformation or maybe just multiple jars, excluding transitive dependency with Gradle and choosing one jar for each variant, but the thing is, Android already has tools to deal with it.

App bundles don't look like it has anything to do with transforming and compiling itself, also it's quite new and most people I know don't use it yet.

Also, anyone has any statistics on, probably, how many people is affected right now? Maybe, I hope, it isn't worth the effort anymore.

Hazer avatar Aug 19 '18 21:08 Hazer

@asfdfdfd Could you reply to comment #3 here on Android's issue tracker?

Also, @elizarov, you can list the workarounds also requested in this comment on Android's issue tracker

LouisCAD avatar Aug 30 '18 12:08 LouisCAD

@LouisCAD done

asfdfdfd avatar Aug 30 '18 14:08 asfdfdfd

Okay, now i've got crash on Samsung Galaxy S5 with Android 6.0.1

http://crashes.to/s/9bcf5942ccd

asfdfdfd avatar Nov 22 '18 10:11 asfdfdfd