junit5 icon indicating copy to clipboard operation
junit5 copied to clipboard

Handle Kotlin Function Names Correctly

Open jlink opened this issue 4 years ago • 29 comments

In "normal" cases Kotlin function names map directly onto Java method names. However, there are a few cases where the Kotlin source name of a function will be different than the generated method name in byte code:

  1. in internal classes
  2. with parameters of specific types present - e.g. UInt, UShort etc.
  3. There might be more...

As a test engine writer I stumble upon this problem in a few classes:

  • org.junit.platform.engine.discovery.MethodSelector which - when called from IntelliJ's test runner - provides the correct Kotlin function name, but cannot resolve the underlying Java method.
  • org.junit.platform.engine.support.descriptor.MethodSource which cannot differentiate between methodName and getJavaMethod()

Deliverables

  • Implement MethodSelector so that it works correctly with all Kotlin function names
  • Implement MethodSource so that it will correctly produce methodName (Kotlin name) and getJavaMethod().

Alternatively, allow a test engine to change the behaviour of MethodSelector and MethodSource.

Addendum

A similar problem may exist with Kotlin class names, e.g. internal or inner classes, but I haven't checked.

jlink avatar Oct 17 '21 10:10 jlink

@jlink Thanks for raising the issue! Could you please provide concrete examples of such cases or a link to the specification or docs for this behavior?

marcphilipp avatar Oct 17 '21 16:10 marcphilipp

class KotlinExperiments {

    internal fun internalFunction() {
    }

    fun functionWithUShortParameter(aShort: UShort) {
    }

    @Test
    fun showNames() {
        println(this::internalFunction.javaMethod?.name)
        println(this::functionWithUShortParameter.javaMethod?.name)
    }
}

As for internal functions "name mangling" is mentioned in https://kotlinlang.org/docs/java-to-kotlin-interop.html#visibility It looks like internal functions get a "$..." postfix. The exact postfix has changed through Kotlin versions

As for functions with unsigned int parameter types I did not find any specification. What I empirically found out is that names of those functions have a "-abcdefg" appendix, ie dash followed by 7 characters.

The problem with both is that you can create the same names using backticks, e.g.

    fun `internalFunction$kotlin`() {}

which will even lead to a runtime error (not a compiler error in Kotlin 1.5.30) since there are two methdos with same name :-(

jlink avatar Oct 17 '21 17:10 jlink

As a temporary workaround test engine maintainers can handle the MethodSelector issues in each engine. MethodSource however cannot be subtyped due to missing non-private constructor. That's why a MethodSource factory method that can differentiate between methodName and javaMethodName would also be welcome.

jlink avatar Oct 18 '21 06:10 jlink

Are there any Kotlin APIs we could call to determine the source method name? Does Kotlin add anything to the bytecode to reverse mangling, e.g. an annotation?

marcphilipp avatar Oct 29 '21 10:10 marcphilipp

The Spring Framework uses APIs in kotlin-reflect to introspect Kotlin code from Java.

sbrannen avatar Oct 29 '21 11:10 sbrannen

If you’re on the Kotlin side you can find out the name through Kotlin‘s own reflection mechanism. The bytecode , however, does not reflect the manipulation, as far as I could find out.

jlink avatar Oct 29 '21 11:10 jlink

Not sure if I got this clear: Do you execute your engine (and hence the JUnit platform) in a Kotlin project or in a Java project? Would it even matter?

mmerdes avatar Oct 29 '21 12:10 mmerdes

It only matters in a Kotlin project.

jlink avatar Oct 29 '21 12:10 jlink

And if some kotlin code under test was compiled elsewhere and is now tested in a Java project?

mmerdes avatar Oct 29 '21 12:10 mmerdes

Then the test classes are Java classes that don’t have this issue.

jlink avatar Oct 29 '21 12:10 jlink

If you’re on the Kotlin side you can find out the name through Kotlin‘s own reflection mechanism.

Is the use case here to look up a Kotlin method (from a MethodSelector) given its bytecode name?

marcphilipp avatar Oct 29 '21 18:10 marcphilipp

Is the use case here to look up a Kotlin method (from a MethodSelector) given its bytecode name?

For MethodSelector it’s the other way round: It contains the Kotlin method name but should provide the corresponding byte code method. For MethodSource, the engine only has a bytecode name, if it could find one, but the interpreting party (usually the IDE) needs the Kotlin function name to jump to the right place in the source code.

jlink avatar Oct 29 '21 20:10 jlink

Any news on this one?

Here's how jqwik currently tries to determine the "real" name of a Kotlin function:

public static String javaOrKotlinName(Method targetMethod) {
	String name = targetMethod.getName();
	if (isKotlinClass(targetMethod.getDeclaringClass())) {
		if (isKotlinInternal(targetMethod)) {
			name = nameWithoutInternalPart(name);
		}
		if (isKotlinSpecial(targetMethod)) {
			name = nameWithoutSpecialPart(name);
		}
	}
	return name;
}

private static boolean isKotlinInternal(Method method) {
	if ((method.getModifiers() & Modifier.PUBLIC) == 0) {
		return false;
	}
	// Kotlin appends "$<module-name>" to internal method names in Java bytecode
	// However, it's not necessarily the module name available from Class.getModule().getName() in Java
	int lastDollarIndex = method.getName().lastIndexOf('$');
	return lastDollarIndex > 0 && lastDollarIndex < (method.getName().length() - 1);
}

private static boolean isKotlinSpecial(Method method) {
	// Kotlin appends a 7-char-extension separated by a hyphen to method names in special cases
	// e.g. when method has UInt parameter, the original method is appended with '-WZ4Q5Ns'
	// TODO: Find out what's really happening here
	String name = isKotlinInternal(method) ? nameWithoutInternalPart(method.getName()): method.getName();
	int lastIndexOfHyphen = name.lastIndexOf('-');
	return lastIndexOfHyphen >= 0 && lastIndexOfHyphen == (name.length() - 8);
}

// Learned mechanism to detect Kotlin class in https://stackoverflow.com/a/39806722
private static boolean isKotlinClass(Class<?> aClass) {
	for (Annotation annotation : aClass.getDeclaredAnnotations()) {
		if (annotation.annotationType().getTypeName().equals("kotlin.Metadata")) {
			return true;
		}
	}
	return false;
}

private static String nameWithoutInternalPart(String name) {
	int lastDollarPosition = name.lastIndexOf('$');
	return name.substring(0, lastDollarPosition);
}

private static String nameWithoutSpecialPart(String name) {
	int lastDollarPosition = name.lastIndexOf('-');
	return name.substring(0, lastDollarPosition);
}

jlink avatar Dec 02 '21 11:12 jlink

That looks like a good start! Would you be interested in submitting a PR that adds support for this to MethodSelector and MethodSource?

marcphilipp avatar Dec 02 '21 13:12 marcphilipp

To be frank, a full blown PR with all the formatting and style rules of JUnit is a lot of hassle. I can offer a solution sketch (tests included of course), if someone else is willing to take care of the enforced policies.

jlink avatar Dec 02 '21 15:12 jlink

Sure, we'll take what we can get! 🙂

marcphilipp avatar Dec 03 '21 15:12 marcphilipp

What is the desired solution architecture at the moment?

Is the idea to implement a separate module based on kotlin and maybe kotlin-reflect that complements the platform module? Or is the current solution architecture to mangle and unmangle names?

mpkorstanje avatar Dec 12 '21 00:12 mpkorstanje

@mpkorstanje What I had in mind is to treat function name handling in the existing platform module differently if the target class is a Kotlin class. This is indeed a somewhat mangled design, but every test engine would profit automatically.

A more modularized approach would be - as you suggest - to differentiate different target languages in different platform modules. My guess is that this will require cutting the monolithic platform-launcher module into several parts. This is nothing that should be done in a single PR, moreover, a thorough discussion should precede such a step with the core team having the lead in this discussion. IMO, there is no need to delay a simple but working solution for this goal. On the contrary, having a working solution and showing how this would look in a modularized platform could even be beneficial for the discussion.

What do the @junit-team members think?

jlink avatar Dec 12 '21 11:12 jlink

What is the desired solution architecture at the moment?

I think it depends on what we need. If we need to add a dependency a non compile-only dependency on Kotlin, we should definitely put it into a separate module. If we can get a way with a few checks using Java reflection, I don't see a need for another module.

marcphilipp avatar Dec 18 '21 17:12 marcphilipp

If you want the perfect solution, a dependency on Kotlin is required because only Kotlin really knows how Kotlin names are being handled. The translation to bytecode methods is not specified - at least I could not find it. If a best-guess solution suffices, then one can get away without a dependency and use code along the lines I sketched above.

jlink avatar Dec 20 '21 07:12 jlink

Could I try this one ? @jlink

fll02020 avatar Apr 23 '22 09:04 fll02020

Could I try this one ? @jlink

Sure. I was waiting for more input from the JUnit Team about which kind of solution they’d prefer. Apart from that I’m happy about anyone doing the implementation.

jlink avatar Apr 23 '22 09:04 jlink

Could I try this one ? @jlink

Sure. I was waiting for more input from the JUnit Team about which kind of solution they’d prefer. Apart from that I’m happy about anyone doing the implementation.

Thank you very much. I just read your code. To be frank, I think you done most of the work. Maybe I just need to continue to complete every special type(e.g. Uint, Ushort)and change the code style? Or is there something here that I don't understand what you mean?

fll02020 avatar Apr 23 '22 10:04 fll02020

@fll02020 Give it a try. Maybe you can get hold of the Kotlin source to check what’s really happening with those special types UInt etc.

jlink avatar Apr 23 '22 13:04 jlink

@jlink For special type, I found some rules of it. The 7 characters seems like a hashcode. But the information before hash() may contain too much , it doesn't seem like we can just use methodParameterTypes. To be honestly, I haven't even been able to replicate from UInt to WZ4Q5Ns via hash. Could you give me some advice? For internal function, I found your rules maybe not so correct. In my experiments, kotlin is actually added and module name. it is not always $kotlin this is the link for code: https://github.com/JetBrains/kotlin/blob/92d200e093c693b3c06e53a39e0b0973b84c7ec5/compiler/backend/src/org/jetbrains/kotlin/codegen/state/inlineClassManglingUtils.kt this is the link for document https://github.com/Kotlin/KEEP/blob/master/proposals/inline-classes.md#mangling-rules

fll02020 avatar Apr 24 '22 07:04 fll02020

@fll02020 I'm afraid I don't know more than you do. My code is just handling the examples I stumbled upon by accident. I'd personally try to cover the cases that can be derived from the mangling rules you digged out.

The problem is, of course, that it's not clear to me if those mangling rules are part of the spec or just an implementation detail. If the latter, the only real option is to use Kotlin itself. This would, however, require to somehow modularize the platform launcher, which is supposedly a major effort.

jlink avatar Apr 26 '22 06:04 jlink

Just stumbled upon this article: https://ncorti.com/blog/name-mangling-in-kotlin

The problem remains that the name mangling result for internal method can also be created by a back-ticked function name, e.g.:

fun `looks to java$like internal method`() {
}

Since the module name used by Kotlin is not necessarily the same as Java's Class.getModule().getName() you cannot be sure that a mangled name really is a mangled name :-( Not without using Kotlin's own reflection, anyway.

jlink avatar May 06 '22 12:05 jlink

That is unfortunate but it means we can decide on solution architecture.

  • We could introduce a KotlinMethodSelector variation on the MethodSelector and have API users use this when dealing with Kotlin (presumably they know).

  • We could make the code that sits currently MethodSelector.lazyLoadJavaMethod smarter. Where it tries every "known language" using a reflection module provided by SPI specifically for that language until it finds a method (what if we find more then one method?).

  • If we're using a module, ideally it does something like Spring Autoconfiguration where it only activates if/when Kotlin is available. Otherwise end users still have to add an extra dependency.

https://github.com/junit-team/junit5/blob/bd39c9d8329289d57044e36e897fe5b2a5ecc474/junit-platform-engine/src/main/java/org/junit/platform/engine/discovery/MethodSelector.java#L162-L180

mpkorstanje avatar May 06 '22 13:05 mpkorstanje

  • We could introduce a KotlinMethodSelector variation on the MethodSelector and have API users use this when dealing with Kotlin (presumably they know).

I'd advise against that. Having seen the slow uptake of the current platform API by many tools it would probably take years.

  • We could make the code that sits currently MethodSelector.lazyLoadJavaMethod smarter. Where it tries every "known language" using a reflection module provided by SPI specifically for that language until it finds a method (what if we find more then one method?).

Don't forget that it's not just MethodSelector.lazyLoadJavaMethod but also MethodSource.getMethodName() and maybe a few other places. There are at least two ways how the problem reveals itself in the first place:

  1. The caller (e.g. IDE) uses the JVM bytecode method (MyTest.internalTest$moduleName). This is how IntelliJ does it and it requires the engine to produce a TestDescriptor with either an explicit name or a MethodSource instance that can produce the correct name. This is actually a problem that requires deeper investigation b/c the two engines I know best (Jupiter and jqwik) determine the actual display name of a test descriptor through method.getName(). No amount of code in the platform can change that :-(

  2. The caller uses the Kotlin name (MyTest.internalTest) which will then lead to lazyLoadJavaMethod failing to produce a method at all. Finding the correct JVM method is then only part of the solution since the display name problem from above will also strike.

  • If we're using a module, ideally it does something like Spring Autoconfiguration where it only activates if/when Kotlin is available. Otherwise end users still have to add an extra dependency.

Any idea how that could look work without special Gradle and Maven support?

There's also a third design option: Just go with best guess as I currently do in jqwik. People using strange method names would see strange effects, but that may be the lesser evil.

jlink avatar May 07 '22 07:05 jlink