encore
encore copied to clipboard
RFC: Syntactic sugar for convenient operators
Taking a leaf out of Python's book, I propose (as we have discussed recently at some weekly meeting), that we support a mapping from some standard operators to methods with special names.
-
expr1[expr2]
desugars intoexpr1.load(expr2)
-
expr1[expr2] = expr3
desugars intoexpr1.store(expr2, expr3)
-
expr1 < expr2
desugars intoexpr1.lt(expr2)
-
expr1 > expr2
desugars intoexpr1.gt(expr2)
-
expr1 <= expr2
desugars intoexpr1.le(expr2)
-
expr1 >= expr2
desugars intoexpr1.ge(expr2)
-
expr1 == expr2
desugars intoexpr1.eq(expr2)
-
|expr|
desugars intoexpr.size()
For []
, I propose that we allow any type signature. That will
allow a user-space implementation of maps with familiar syntax.
For <
etc. operators, I propose that we require that the type of
both operands are the same and require the return type to be bool
,
i.e., in a trait T
, the only permitted signature of ge
is:
def ge(other : T) : bool
Similarly, we could allow expr1(expr2)
to desugar to
expr1.apply(expr2)
when expr1
returns an object, although I am
not convinced we gain more than we add confusion.
op type signature desugars to
---------------------------------------
[] : t1 -> t2 -> t3 | load
[]= : t1 -> t2 -> t3 -> t3 | store
< : t -> t -> bool | lt
> : t -> t -> bool | gt
<= : t -> t -> bool | le
>= : t -> t -> bool | ge
== : t -> t -> bool | eq
|| : t -> int | size
---------------------------------------
UPDATE The return type of []=
will be unit
if assignments are statements. If they are not statements, we should add something (at some point) to protect against misspelled comparisons that accidentally turn into assignments.
Update, the []
and []=
forms should be variadic, i.e., we should support
foo[x, y] = z
etc.
- The form
expr1[expr2]
could clash with type parameter syntax: Ifx
is an array of closures that takeint
:x[0](123)
- The store expression should probably return
void
Thanks for the feedback! Re your second point, that's what I first thought too. But it seems strange that loops etc. have a return value and stores don't.
With respect to your second point, I completely agree. Here is what I wrote about this in the Slack channel (now obscured by subsequent messages) yesterday:
One interesting aspect of changing the syntax of type parameters to square brackets instead of angle brackets is that
foo[i](j)
can now, from the parser's perspective, be both a function call to a function namedfoo
with type parameteri
and argumentj
, and an function call on some unnamed closure in elementi
in the arrayfoo
with argumentj
.Once we make this change, we will have to use types to disambiguate this, or find alternatives. Changing the syntax of passing type parameters to a function, for example. But I don't know what a good alternative is.
An alternative would be to use call syntax: x(0)
. It makes an array look like a function that accepts an index.
The reason for []
is of course that it's more popular. But if it leads to ambiguities...
Re your second point, that's what I first thought too. But it seems strange that loops etc. have a return value and stores don't.
I don't mean to sound stubborn: but it really doesn't seem strange to me at all. A loop produces something (including side effects), an assignment consumes something.
The reason for [] is of course that it's more popular. But if it leads to ambiguities...
It does not lead to any real ambiguities. However, parsing will become more convoluted.
Since types start with a Capital letter, and variables/methods won't, it will be pretty easy to see when the operand of []
is a type or not. The exception is type variables, but now we are talking very rare code IMO. Also, it is very likely that type parameters will not be needed in a majority of cases because of type inference.
For me, the reason to go with an expression-based syntax is that it is enabling. Is it so bad to let assignments return a value? (I don't feel very strongly about this, but I'm curious.)
The 'canonical' justification for making assignments void is that: if (x = true) then ... else ... end
should not typecheck.
This typo is much more common in languages with weird semantics of booleans (like C), but the fact that it would be extra unlikely to happen in encore would make it extra hard to spot (you're off guard).
That hole seems easily pluggable by not allowing assignments in guards. (At least not in strict mode.) It would be better IMO to reject the program because of the probability that this is not what the programmer means, rather than a type error.
Also, if ==
is is
...
Also, if == is
is
...
That'd make it better. I still think it's strange that assignments return things -- but that's a weak argument.
Does that mean that the table above (which contains ==
) will change again? In a separate issue, I suppose?
I don't know if it will change. That's something I want to discuss soon (maybe this Thursday). And yes, separate issue.
On one hand, I like keywords over ASCII symbols. And it is nice if the eq
operator desugars to a .eq()
call. On the other hand, if x == y then ...
feels more readable than if x eq y then ...
because of how symbols are visually distinct from characters. Now that we've abandoned {
and }
, this becomes more important IMO.
I generally agree that keywords are nice. But liking keywords does not mean that we have to use them everywhere, always. I'll say that again whenever we discuss it for real.
+1 the question is just where (not)
The 'canonical' justification for making assignments void is that:
if (x = true) then ... else ... end
should not typecheck.
I'm not sure whether this is an argument for making assignments expressions (with void or unit type) or against it. Assignments won't have boolean type.
BTW: is there a difference between void type and unit type in Encore (or future Encore)?
I'm not sure whether this is an argument for making assignments expressions
I don't think Stephan cares -- his main point is that stuff like that should not work. Originally, we were discussing the convenience syntax x[y] = z
which desugars to x.store(y, z)
. The latter is clearly an expression, and the question is whether is should return unit
or something else (like return z).
Assignments won't have boolean type.
We are discussing the pros and cons of making assignments expressions, especially in the context of them being desugared into different syntactic forms. If, as I proposed above, x[y] = z
is desugared into the method call x.store(y, z)
, then (if we allow it) the type may well be boolean (or a Maybe[t]
, which'd suffer the same problem). It won't be a statement.
is there a difference between void and unit
I don't think so, modulo syntax ofc. I think you wanted unit because of Haskell. For me it has been a search and replace.
I suggest we move the whether assignments are expressions or statements to a different thread because this really orthogonal to the RFC above. I am updating it with a note saying that []=
will have a type signature that makes x[y] = z
work like an assignment (i.e., if assignments are statements this should return unit
, etc.).
Regarding the clash between [x]
for type parameters and for array access, is it possible that a type T[x]
and a term e[x]
can appear in the same place, syntactically?
I don't think so. But T[x](e)
and e[x](e)
could (as already pointed out).
What does expression T[x](e)
correspond to, where T[x](e)
is a type?
Since we are dropping new, List[T]()
is creating a list of Ts, and list[t]()
is getting the teeth element of list, which is a closure, and calling it.
And the issue is that this syntactic ambiguity cannot be resolved until type checking, right? And to avoid unnecessarily complicating type checking, we'd rather not do such resolution in type checker?
And the issue is that this syntactic ambiguity cannot be resolved until type checking, right?
Yes.
And to avoid unnecessarily complicating type checking, we'd rather not do such resolution in type checker?
I honestly don't mind that part. It makes things a tad more complicated, but not much.
There is a little of that sort of type-based resolution in the compiler already (at least, in the modules branch). It is relatively simple to do so long as you can locally work out the types of the various components (sub-expressions) involved. Type inference might make that more complicated in some cases (though probably not here), because one doesn't locally know the solutions to the typing constraints.
That sounds great!
Will these operators be bundled together in one or more traits? These traits are likely to need F-bounded polymorphism to be useful. Will we support that?
The operators are like the methods for the for comprehensions. If store()
is present in class C (by way of one of its traits, no matter which one) -- then you can write let c = C() in c[42] = bar
. No requirement to implement all operators or specific subsets.
Unless someone has a better idea.
The same one I had for Stephan's comprehensions, namely, to use traits.
Without traits, you will need some kind of ad hoc structural polymorphism, or no polymorphism.
If we end up having polymorphism that's powerful enough, such traits might make sense here. But I doubt that we'll have that polymorphism. What Tobias proposed seems like a good tradeoff between effort and convenience. This is more like operator overloading.
For one, and AFAIK, we don't plan to have traits that have access to a Self
type. This type would be needed here.
Or Traits with F-bounded polymorphism.