binary-compatibility-validator icon indicating copy to clipboard operation
binary-compatibility-validator copied to clipboard

Validate KLIB ABI compatibility on Multiplatform

Open fzhinkin opened this issue 2 years ago • 13 comments

There's an API to verify KLIB's binary compatibility that'll be released soon: https://youtrack.jetbrains.com/Issue/KT-54402 https://github.com/JetBrains/kotlin/blob/master/compiler/util-klib-abi/ReadMe.md

BCV should use it to support KLIBs validation in addition to JVM bytecode validation.

fzhinkin avatar Sep 01 '23 14:09 fzhinkin

Things that are worth keeping in mind (or maybe even split into a few subtasks):

Dependency on the Kotlin compiler

  • We can hardcode the dependency for the first version, but then we'll have to update it each Kotlin release
  • We can later provide a plugin extension that takes the dependency of the same version as KGP applied to the project

Default behaviour and migration path

  • When we should enable Klib dump by default
  • What is the naming scheme/migration path for non-JVM dumps, and how are they different (or the same?) across platform-specific, JS IR and WASM source-sets

qwwdfsad avatar Sep 01 '23 15:09 qwwdfsad

I would recommend to use v2 of binary signatures (with mangled names included into the rendered signature text) vs v1 (with only hashes).

ddolovov avatar Sep 01 '23 15:09 ddolovov

What is the naming scheme/migration path for non-JVM dumps, and how are they different (or the same?) across platform-specific, JS IR and WASM source-sets

I see a few options here:

  1. use unified naming for both jvm and non-jvm target: api/<target>/<projectName>.api
  • pros: we already use such naming if a project contains multiple jvm-targets;
  • cons: for an existing multiplatform project it means that an old api-file for a jvm-target need to be moved to jvm subdirectory.
  1. for jvm targets don't do anything, for native targets use naming scheme mentioned above
  • pros: much pleasant migration path as there's no need to update jvm-specific dumps
  • cons: there will be a dependency between the jvm'a apiDump task and native tasks, need to look carefully how to deal with it
  1. dump klibs to their own abi-directory: abi/<target>/<projectName>.abi
  • pros: easier to implement (compared to (2))
  • cons: adds some complexity by introducing an additional directory for dumps: its unclear why KLIB ABI dumps deserved a different top-level directory

It's also unclear if there's anything we should do about identical KLIB ABI dumps across different native targets. For example, for kotlinx-io-core, dumps for linuxX64, linuxArm64, mingwX64 and all androidNative*-targets are identical, as well as all dumps for apple-targets are identical. I don't really like the idea of storing identical files within the project and I don't like the idea of updating all these dumps once an API changed even more. But at the same time it's unclear if having identical ABI dumps for almost all targets is a common case (I would guess that it is) and how to group these dumps (assuming that they can diverge in the future). It's tempting to group dumps by the source sets instead of targets, but a target may use multiple source sets. For me, it's an open question and I can't name a good solution at the moment.

fzhinkin avatar Oct 02 '23 17:10 fzhinkin

My vote is for option #1, hidden under the flag initially (because this functionality is going to be wildly unstable). Cons are not that serious -- this will be a one-time change.

It's also unclear if there's anything we should do about identical KLIB ABI dumps across different native targets.

Probably we'll need to dump ABI for configured source-sets, not targets, implying that different targets in the same source-sets are identical

qwwdfsad avatar Oct 03 '23 09:10 qwwdfsad

Probably we'll need to dump ABI for configured source-sets, not targets, implying that different targets in the same source-sets are identical

A downside of such an approach is that a logic of searching for a source set dominating multiple targets tends to be fragile (it could work fine for default hierarchies, but I'm afraid that for non-default hierarchies we may end up with a bunch of unique source sets tied to targets). And it also doesn't match how we generate dumps for JVM.

fzhinkin avatar Oct 10 '23 09:10 fzhinkin

So far, here's how it going to work:

  • KLib validation will be disabled by default, to enable it one has to set klibValidationEnabled flag in apiValidation settings:
apiValidation {
    klibValidationEnabled = true
}
  • when enabled, KLib validation will use v2-signatures in dumps, it could be changed using klibSignatureVersion option:
apiValidation {
    klibValidationEnabled = true
    klibSignatureVersion = 1 // if validated klib doesn't contain appropriate signatures, an exception will be thrown during the validation
}
  • KLib validation supports ignoredPackages, ignoredClasses and nonPublicMarkers options the same way classfiles validation does, but all public* options are unsupported and will be ignored.

Right now, dumps will be generated (and verified) for each configured target that is supported by the host.

fzhinkin avatar Oct 18 '23 09:10 fzhinkin

In the future we might implement v3 of signatures. Do I get it right that no changes will be required in the compatibility validator, it will support new version of signatures just out of box and all that will be actually needed is that the user should specify klibSignatureVersion = 1 in Gradle build file?

ddolovov avatar Oct 18 '23 15:10 ddolovov

@ddolovov You got it right: once v3 of signatures is available, users need only specify klibSignatureVersion = 3 to switch from a version they used previously.

fzhinkin avatar Oct 19 '23 08:10 fzhinkin

After trying klib validation on several projects I came up with a solution for the problem with multiple identical files generated for each native target (described at the end of https://github.com/Kotlin/binary-compatibility-validator/issues/149#issuecomment-1743439609).

Instead of simply saving all text dumps for native targets, BCV will merge them into a single file. If a declaration was presented in all original dumps, it will be written to the merged dump without any changes. If a declaration is specific to a subset of native targets, in the merged dump it will be annotated with all the targets it is specific to.

Here's how the merged dump will look like: https://raw.githubusercontent.com/Kotlin/binary-compatibility-validator/merge-klib-dumps/src/functionalTest/resources/examples/classes/AnotherBuildConfigLinuxArm64Extra.klib.dump The extra comment with list of all targets whose dumps were merged is added at the very beginning of the file and then the last declaration, specific to linux-arm64, is annotated with its target.

Such format amplifies the problem existing before, but that was not discussed yet: at the moment, only Apple hosts (well, only macOS) supports compilation for all target platforms supported by Kotlin/Native, if a developer is using Klib validation on the Linux or Windows workstation, they won't be able to neither dump, not verify ABI for all the targets. To overcome this issue, following solution is proposed:

  • by default, both dump and validation will fail if host does not support at least one of required targets;
  • this behavior could be optionally relaxed to printing a warning (using special configuration options);
  • to ease the development two additional tasks will be added: one will use heuristics to dump ABI for all targets, even if targets are unsupported by the host compiler and another task that could validate the ABI against the golden file using same heuristics while generating the dump.

It's crucial that latter two tasks will not be executed along with apiDump and apiCheck tasks, a developer have to explicitly call them. These tasks aimed only to ease the development process on non-macOS hosts and may generate invalid dumps, so these tasks should be used with care and are not considered as a replacement for regular dump and check tasks. For the beginning, heuristics to guess the dump will be simple:

  • if there is a target supported by the host compiler and using exactly the same source sets as the unsupported target (for example, both use only the common source set) - take the dump for that target and use it as a replacement for unsupported target's dump;
  • if all supported targets use slightly different source sets than a unsupported target, just pick a one sharing the most of source sets (yeah, not really intelligent way to define a similarity between two targets) with unsupported and use it's dump as a replacement. The second heuristic will be disabled by default and will cause generate-a-fake-dump task's failure if the first heuristic won't pass.

Please let me know if you see any issues with the update described above or share you suggestions if you know how to tackle aforementioned issues better.

fzhinkin avatar Dec 19 '23 17:12 fzhinkin

if a developer is using Klib validation on the Linux or Windows workstation, they won't be able to neither dump, not verify ABI for all the targets

I'm such a developer, and I very much need to generate ABI dumps when committing source changes (my CI will check all targets). So those heuristic-based tasks would indeed be very welcome.

But something bugs me here. Why can't we generate actual ABI dumps for Apple targets on Windows (not heuristic-based)? My IDE can compile fine the Apple-specific code thanks to the CompleteKotlin Gradle plugin (which downloads Kotlin bindings for the Foundation framework and other platform-specific stuff). The current KGP doesn't download those bindings by default, but it could, and it seems compilation could work properly too.

I'm not expecting to run any Apple target tests on my windows machine, but the CompleteKotlin plugin sort of proves that there is no technical reason for this limitation (the Kotlin bindings are provided by JetBrains and don't require a mac).

joffrey-bion avatar Mar 04 '24 18:03 joffrey-bion

Why can't we generate actual ABI dumps for Apple targets on Windows (not heuristic-based)?

At this moment, Apple targets are not supported by the Kotlin compiler on host platforms other than MacOs, so there's no way to emit klibs for these targets (thus there are no klibs to extract dumps from).

I didn't had a chance to check the CompleteKotlin plugin yet, but it seems like it only solves an issue with Idea autocompletion, as one still has to compile final binaries on a proper host. And BCV needs a klib.

fzhinkin avatar Mar 05 '24 14:03 fzhinkin

Thanks for your prompt answer, that clarifies things a bit. Indeed CompleteKotlin is about IDE completion, but I thought this implied we were able to run the embedded compiler from the Kotlin IDE plugin so it can highlight things in the editor. I'm probably missing a bit too much context for this. But in any case thanks for this issue.

joffrey-bion avatar Mar 05 '24 17:03 joffrey-bion

but I thought this implied we were able to run the embedded compiler from the Kotlin IDE plugin so it can highlight things in the editor.

TBH, I don't have an answer right now. Need to investigate how the things work there.

fzhinkin avatar Mar 11 '24 12:03 fzhinkin