ceylon icon indicating copy to clipboard operation
ceylon copied to clipboard

infer anonymous fun parameter type from usage

Open gavinking opened this issue 7 years ago • 12 comments

As suggested by @jensli in #7351, we could enhance anonymous function parameter type inference to take usages into account. For example, in:

(n) => Integer.format(n, 16)

The parameter n would be inferred to have type Integer.

Quoting myself:

If the parameter is only used as a function argument, and never:

  • as the receiver of a method invocation, nor
  • in an operator expression, nor
  • as the argument of an overloaded (Java) method,

then we could reasonably infer its type from how it's used.

gavinking avatar Apr 15 '18 10:04 gavinking

I've pushed in initial implementation of this idea to the branch 7353 and it appears to work well. The actual implementation was pretty straightforward.

The question I have is whether stuff like the following should be accepted or rejected:

(n) => Integer.format(n+100, 16) + Integer.format(n, 16) //currently rejected
(n) => Integer.format(n, 16) + Integer.format(n+100, 16) //currently accepted

gavinking avatar Apr 15 '18 15:04 gavinking

Also, I'm concerned about the error message for something like this:

"hello".map((c)=>Integer.format(c, 16))

Which now works out as:

argument must be assignable to parameter 'collecting' of 'map' in '{Character*}': 'String(Nothing)' is not assignable to 'String(Character)'

whereas previously it would have been a much more meaningful error saying that Character is not assignable to Integer.

I should maybe turn this new feature off for the cases where the parameter type can be inferred by other means. Or maybe refuse to infer Nothing. Or something, I'm not sure.

gavinking avatar Apr 15 '18 15:04 gavinking

I realized that we can also do some good stuff with some operator expressions, for example:

Float(Float) f = (x) => x/2.0;
Boolean(Float) g = (x) => x>2.0;
Boolean(Boolean,Boolean) h = (p,q) => !p || q;

I've implemented this sort of inference for the operators:

+ - * / % ! || && ** .. < > <= >= <=> == != === = += -= *= /= %= ||= &&=

gavinking avatar Apr 15 '18 18:04 gavinking

And now it's also aware of use in boolean conditions, for example:

Integer(Boolean) toInt = (b) => if (b) then 1 else -1;

gavinking avatar Apr 15 '18 18:04 gavinking

I now also take into account specifier statements and value initializers.

gavinking avatar Apr 15 '18 19:04 gavinking

@gavinking

The question I have is whether stuff like the following should be accepted or rejected:

Why shouldn't both be accepted? format(n,...) implies n should be Integer, and format(n+100... implies n should be Summable (or Numeric, not sure...) In any case, the union for both restrictions (Integer & Summable) should generate the same type (Integer), that is the right type for n, isn't it? As long as union is commutative, the order of restrictions should not matter, IMHO.

someth2say avatar Apr 16 '18 08:04 someth2say

Amazing work, as usual, but I've bumped into a problem while trying out this branch.

Integer(String) howLongIs = (s) => s.size;

Does not compile:

method or attribute is not defined: 'size' in type 'Anything' might be misspelled

My workaround is...

Integer(String) howLongIs = (s) => Object.string(s).size;

fwgreen avatar Apr 17 '18 01:04 fwgreen

@someth2say

Why shouldn't both be accepted?

Well it's now a bad example, since I implemented inference from usages in operators like n+100, meaning that both of the examples are now acepted.

What I was trying to get at is that there's sort of a problem with the way I've implemented this whole feature. If we're going to do type inference in one pass, which is one of the main principles of how we type check things in Ceylon, then at later points in the body of the function, we have more information about the type of the parameter. Consider:

(seq) {
    print(seq.size); //error
    String[] strings = seq;
    print(seq.size); //ok
}

It seems to me that there's no particularly great way to fudge it so that both occurrences of seq.size are treated similarly. I definitely don't want to go down the path of doing multiple passes of the function body, or of setting up some sort of constraint-solving algorithm, since, ultimately, this is an OO language, and we'll never be able to do full HM-style type inference here.

One thing I could do would be to mark parameters like seq as "dirty" and disallow all usages (such as seq.size) where the usage itself does not determine the type. Then both occurrences in the example above would be rejected. On the other hand, that seems a little heavy-handed and confusing to the user.

gavinking avatar Apr 17 '18 09:04 gavinking

@fwgreen

Amazing work, as usual, but I've bumped into a problem while trying out this branch.

Integer(String) howLongIs = (s) => s.size;

Does not compile.

Right, so right there you've into the most basic limitation of type inference in an OO language. There is simply no way to infer what foo.bar() means, without first having information about the type of foo. The difference with languages like Haskell and ML is that we always know what bar(foo) means because one must explicitly import bar at the top of the file. In OO, a foo implicitly carries around a bunch of functions with it, with no need to import them explicitly, and with pervasive overloading where foo.bar() and baz.bar() refer to completely different bars.

Now, in your specific example, it's true that we could easily enough realize that string is always Object.string, since with string being declared by Object, nobody else can give it an alternative overloaded definition. But that doesn't work for any symbol other than string, equals, and hash.

gavinking avatar Apr 17 '18 10:04 gavinking

So, look, I'm questioning whether it's really a good idea to push this into the language:

  • if we were to implement #7058, and finish the implementation of #6615 (i.e. add parameter type inference for immediately-invoked anonymous functions), and given that #7251 has already been solved, we would already have eliminated most of the cases where we need to write in parameter types of anonymous functions today,
  • but this proposal, while very cool to see in action, complicates the language somewhat, without handling all interesting remaining cases, and
  • results in much worse error messages in a nontrivial set of cases where it fails.

So unless you guys can find me a really compelling example or set of examples which this proposal improves, I guess I'm going to leave it unmerged for now.

Thoughts?

gavinking avatar Apr 17 '18 10:04 gavinking

Sadness when you implemented such a great thing in such a short time :)

I strongly feel that having complex error messages that can be made more readable just by temporarily adding explicit type parameters is of great annoyance. I had to do this a lot in Java when I was programming asynchronous code and it just feels like you have to babysit the compiler. In the end you typically don't want to keep those explit type declarations there since your brain can understand.. and then of course the IDE editor indicates that you are being sub-optimal by being explicit..

Otoh it would be a little less annoying in Ceylon I suppose since you don't end up having to change s -> s.foo() to (Foo s) -> s.foo() but instead just (s) => s.foo() to (Foo s) => s.foo() i.e. you don't have to fiddle back and forth with the parenthesis :)

argument must be assignable to parameter 'collecting' of 'map' in '{Character*}': 'String(Nothing)' is not assignable to 'String(Character)'

In this particular case I don't directly see why it could not at least deduce String(Integer) instead of String(Nothing) since I guess that's what the type inferrence is supposed to actually work out for you. If that could be improved the error message would already be aclearer. Maybe not clear enough. :man_shrugging:

xkr47 avatar Apr 17 '18 22:04 xkr47

Given the work on #7058 and #6615, I feel like the pressure for this feature is significantly alleviated.

gavinking avatar Apr 23 '18 08:04 gavinking