groovy GROOVY-10099: Resolve ambiguous varargs behaviour in dynamic mode by recourse to compiler-stage information

This is a replacement for PR #1591 which was accidentally closed on the occasion of my attempting to move the work into a branch of my fork, rather than master, so I can potentially work on other issues. There is discussion there.

Summary: It attempts to resolve the question of what happens when a varargs method is invoked with a single argument to the varargs parameter, which is null. When running dynamic code, Groovy has no way to distinguish whether a single null value is a null object reference (and should thus be wrapped in an array for a varargs) or a null array reference (and thus should not). At present it always does the latter, which leads to surprising behaviour when that argument, for example, comes from a method call with a known non-array return type, or even an explicit cast to a non-array type, and would be expected to go into a varargs as an array containing that single null.

At compiletime we have this information... at least some of the time. This PR attempts to make use of that information, in the case of the single argument, to examine the expression to determine if it's an array type or not, or dynamic and thus still impossible to resolve. And, when it's confident it's not dynamic nor an array type, to write the array-wrapping into the generated code then, but still leave it as a dynamic method call, so it could still be intercepted by an equivalent replacement of the method known to exist at compiletime.

While the tests pass, it's clearly a worry to be making such a deep change, especially as someone's first Groovy PR! On the other hand this change does make varargs methods in this situation behave consistently between @CompileStatic and @CompileDynamic code. @CompileStatic code is not altered in any way by this PR.

That said, there is an argument that @CompileStatic's current behaviour may not be correct anyway - but only really with respect to when the null in question is a null constant. The more useful reason for the change is to try to make it behave better when the null happens to be the value of a typed variable or method return type, or some other kind of expression with a knowable type, including and especially in fact, cast expressions, which may reflect the coder's attempts to assert precisely what should happen here.

More problematic, perhaps, is the way this PR breaks the ideal dynamic behaviour of only acting upon what information is available at runtime, and rather opportunistically borrowing information available only at compiletime (but also potentially made incorrect by runtime metaprogramming). There is a question as to whether that kind of change is desirable at all. On the other side, the fact remains that the question answered here cannot be answered at runtime. A null value retains no clue as to what type of reference it's a null of.

But there's a case that such behaviour shouldn't change now, regardless of whether it's right or wrong, for fear it would just break too much. I have every sympathy with that, and personally doubt this will ever get merged now, but the discussion wasn't definitively over, so recreating the PR for that sake.

The code changes here are identical to the original. In addition the branch has been caught up to the present state of apache/groovy:master. Nothing there appears to clash.

Jun 19 '21 12:06 StrangeNoises

Would it be possible to summarize the different scenarios surrounding variadic method calling and show where dynamic, STC and SC are giving expected or unexpected results? There is so much discussion under the issue and the previous pull request. It is difficult to understand the scope of the problem and also the solution does not seem to fit the size of the problem IMO.

We have been using compile-time type information as part of static compilation to resolve issues such as these. With that in mind, it can be expected that dynamic runtime behavior will not always match static compilation behaviors. I cite GROOVY-5438 as one example. We would need to apply the static compilation methodology to the dynamic case to make them work the same; there simply are not enough facilities available at runtime to provide total consistency.

Groovy adds one additional complication to variadic method calling. It treats foo(String[]) the same as foo(String...) in most cases. This means we cannot count on the varargs modifier at runtime as a signal. And I had to file a bug related to the unreliability of the varargs modifier: GROOVY-10140

Jun 21 '21 20:06 eric-milles

bit of churn from still trying to get the hang of rebasing onto upstream/master. Nothing really new so indifferent to the workflows running. tests still pass fwiw. shortened the commit message anyway... (did note feedback from eric in the week, but not yet ready to answer)

Jun 26 '21 13:06 StrangeNoises

I've tried to make it less scary. 😀

The scope of the change is reduced to only apply in @TypeChecked code, and brings that into line with @CompileStatic behaviour relating to single varargs arguments that might be null. The code has been considerably simplified as a result, as it is no longer trying to identify the relevant methods and property types itself (which was the part I was least confident about), but using the metadata values for these left behind by a @TypeChecked compile.

In fact this change does not explicitly limit itself to @TypeChecked but is simply taking advantage of the metadata if present, however it got there earlier in the compile, but that does seem primarily to be from StaticTypeCheckingVisitor at present. Should in future dynamic compilation also helpfully leave that metadata lying around, it will take advantage then too.

The VarargsMethodTest test has been expanded (though reduced from earlier versions of this PR) to show it working as expected for @TypeChecked and also illustrating with "shouldFails" where the @CompileDynamic behaviour differs. Essentially it treats any single null as an array, and only the tests that are really giving it an array expression pass, but only by coincidence. The @CompileStatic tests have been removed as its behaviour has not changed, but it is the same (now) as @TypeChecked. No existing tests have had to be altered to make them pass with this code change.

So this keeps the "purity" of the fully dynamic mode, only depending on runtime information, and defers to the future any more considered discussion of whether that's desirable in this situation. But now the "middle-ground" of @TypeChecked can be used to correct the behaviour as well as @CompileStatic.

The code here still explicitly makes null constants behave as non-arrays, to be consistent with @CompileStatic, but I recall earlier in the discussion @paulk-asert queried whether that was the correct behaviour. If minds should change, it's a simple change from true to false here to make it follow suit. As I've said before, I'm pretty indifferent to the fate of null constants, and only really care about whether explicit types are being respected, where given.

Jul 03 '21 13:07 StrangeNoises

The change is definitely more palatable in size. For expression type determination during classgen, I think the TypeChooser should always be used. The sequence of checking expression class types and getting type or node metadata should already be provided through TypeChooser.

Typically TypeChecked will provide extra checks but no changes to behavior. In this case, you are using STC info to alter the call semantics. Is there another example of this? StaticInvocationWriter exists specifically to provide changes for CompileStatic that are not applied for TypeChecked or regular compilation.

I have not read all the treatment for this issue. Has the idea of a compiler warning been considered when you have STC on and you can detect that the argument is one of the edge cases that will not be wrapped by an array at runtime? Such a warning could be implemented as a type checking extension.

Jul 06 '21 13:07 eric-milles

I'll look at the TypeChooser thing, next time I get back to it (probably this weekend). The code there that digs for the type of method calls and properties is a partial hangover from trying to get it to work in a completely dynamic context, where it certainly wasn't working if I just left it to TypeChooser. I'll check what they do in typechecked. (I did already remove the special code for variables as I found that was unnecessary.)

I still think casting to Object should be caught and interpreted as "not an array", in this context, as otherwise OBJECT_TYPE = DYNAMIC_TYPE (in ClassHelper) which gets us nowhere if the coder's weeping at his keyboard crying "No, you moron, I'm telling you it's not an array!"

As for just making it emit compiler warnings, no, it wasn't discussed. To discuss it now I would say:

What would the coder be expected to do as a result of seeing such a warning? Maybe to put in a cast or coerce, or make sure a property/variable etc. was typed so they didn't need to... But this change is the change needed so that doing that would even work. At the moment, casting to a non-array type is completely ignored. Declared types are completely ignored. A null is always a null array no matter what you do to try to indicate otherwise. Your only recourse is to put it in an array yourself, like varargs was never invented. And of course if you, writing your code, knew for an absolute fact that a given expression would never return a null, and therefore you'd never hit this problem, those compiler warnings would get annoying real fast.

The expressions in question are known non-arrays. If they happen to evaluate to non-null values they will currently certainly be wrapped in an array at runtime (ParameterTypes::fitToVargs which is also where the "null is always an array" decision is enshrined). They'd have to or you'd be getting missing method exceptions when it tries to invoke the method with the bare non-array type. How is it not a bug that something else happens if those expressions happen to evaluate to null? I can't seem to stress or repeat enough that we're only doing anything to expressions whose types are known to be non-arrays to a high degree of confidence, where it simply cannot be right that they ever get passed straight to an array-typed parameter. Dynamically typed expressions, where it's not obvious whether it should be wrapped or not, are left well alone.

If anything there should be compiler warnings if they try to apply a dynamic-typed expression to a varargs method. But again, what are they going to do about it? Put a cast in to say "it's an Object" or "It's an array". Again: This is the change that means that doing that actually accomplishes anything at all.

(And I still think this should be fixed for dynamic as well 😉 - It's still a crime that you can't push it the right way with an explicit cast at least. It's just harder to be sure you're even in a varargs context there.)

Jul 06 '21 15:07 StrangeNoises

Checked: Using the TypeChooser doesn't work for MethodCallExpression or PropertyExpression. Therefore left my version in for now.

Reason appears to be that our TypeChooser is a StatementMetaTypeChooser and not, as you're presumably expecting because it looks like it might work, a StaticTypesTypeChooser.

In turn, our WriterController is a plain base-class WriterController and not, again as you were presumably expecting, a StaticTypesWriterController.

(Both logged with getClass().toString(), not instanceof seeing as in each case the StaticTypes variant is a subtype of the other.)

This is when compiling the method testVarargsOnlyTypeChecked in VarargsMethodTest.groovy. That method is annotated @groovy.transform.TypeChecked, and the metadata is in place from earlier in the compilation process (that's how my code is working).

I'll poke around and see if I can grok what's happening there, if it looks like a bug. Idly wondering if it's because only the method is annotated @TypeChecked and not the whole class... I wonder what else is quietly not working and not enough people are using it to notice. 😉

Also as I understand it the StaticInvocationWriter is for @CompileStatic only. not @TypeChecked - could it be simply that the others are only used for that too? Deliberately or not...

New push is just a forced rebase against upstream master, not that there's much new in the last week. I'm afraid I set that off before I actually identified the above.

Jul 10 '21 11:07 StrangeNoises

It looks like you'd need the StaticTypesTransformation::visit method to take after its subclass StaticCompileTransformation::visit and actually set the WriterControllerFactory on the ClassNode, as well as a flag to say that this class/method is @TypeChecked (as per the subclass setting one indicating it's @CompileStatic).

Of course you don't have this at the moment because, once the type-checking is done, you want the code to be actually generated in the same dynamic fashion as normal. But that's why we don't have the StaticTypesTypeChooser here. 😀

The hypothetical StaticTypeCheckedWriterController probably wants to be a superclass of the current StaticTypesWriterController (which maybe ought to be called StaticCompileWriterController, these StaticTypes* classes actually pertaining to static compile confused me for a while) so that the latter can still defer upwards through the former, so you can still have classes with some methods @CompileStatic and some methods @TypeChecked.

This StaticTypeCheckedWriterController wouldn't actually have much, just the StaticTypesTypeChooser and a StaticTypeCheckedInvocationWriter... which is where my varargs fix would actually go. And anything else we can think of where information known because we're in a @TypeChecked context actually wants to influence how code is generated.

... if only we could think of one other thing to make all this refactoring worthwhile... 😉 My issue, generalised, is that we'd want to cover situations when the type of an argument expression is significant even if the expression yields a null value. But the only other use-case I've so far come up with for that is matching to the correct overloaded method with a null argument, and that would still need a static invocation to resolve... unless we can send the list of argument types into the callsite... 😅

Jul 12 '21 11:07 StrangeNoises

-1 for me

I will look over your tests and see about adding them to the build at least to demonstrate GROOVY-10099.

InvocationWriter#loadArguments already has significant checking/handling for variadic situations. And null is supposed to represent the array, not an array with one null element. I think the safe advice is to use @CompileStatic if you have a scenario that passes a single nullable argument for the array parameter and you don't want to wrap in your code.

Aug 23 '22 16:08 eric-milles

groovy groovy copied to clipboard

GROOVY-10099: Resolve ambiguous varargs behaviour in dynamic mode by recourse to compiler-stage information

groovy
groovy copied to clipboard