move icon indicating copy to clipboard operation
move copied to clipboard

Macros

Open wrwg opened this issue 2 years ago • 3 comments

(This is a port of pending Diem PR #96 from the diem/move repo; see latest discussion there.)

This PR realizes the 1st step towards an alternative, high fidelity implementation of user level macros, in contrast to compiler macros suggested in Diem PR #9930.

In this model, macros can be declared using a new style of function declaration:

    public macro fun foreach<T>(v: &vector<T>, action: |&T|) {
        let i = 0;
        while (i < Vector::length(v)) {
            action(Vector::borrow(v, i));
            i = i + 1;
        }
    }

This PR implements parsing and type checking of this style of macros up to the typing AST. Macro expansion is not yet implemented and intended to be added in a followup.

The biggest change is enabling function types, lambda expressions, and calls of function values for naming and typing. Those had been supported before for the spec language, but weren't supported in move-lang beyond expansion. In the design, we represent function types by a new BuiltinTypeName_::Fun which allows to minimize changes to the code, as no new type variant needs to be introduced, and existing functionality for inference et. al can be adopted with minimal changes. This consistent e.g. with the treatment of function types in Rust, where they are just type constructors with some syntactic sugar.

A new test case has been added at lambda.move which shows usage and intends to cover all type check failure scenarios.

After this PR, to complete this style of macros, we still need to:

  • Implement actual expansion of macros. This is intended to happen in a new phase after typing.
  • Implement representation of macros in compiled, persisted modules. We need to be able to fetch the macro definition from an imported, precompiled module.
  • Implement techniques (in the prover) to refer to specifications of passed lambda arguments to macros. (See https://github.com/diem/move/blob/main/language/move-prover/tests/sources/functional/macro_verification.move for what we want to do here.)

Motivation

Macros

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Added new tests

wrwg avatar Apr 12 '22 03:04 wrwg

@bmwill Awesome, CI seems to work.

wrwg avatar Apr 12 '22 03:04 wrwg

@wrwg unfortunately it does not yet

bmwill avatar Apr 12 '22 04:04 bmwill

Overall, I like where this is going. Big fan of the |T, T, T|T syntax! I like syntax where the type is matches the expression like this! (I was, and still am, always bothered by the syntax soup around lambdas in Rust)

I have some concerns about the usability of the macros, just in how it lines up with the rest of the system. Maybe, @wrwg, if you could walk me through you you see this working out that would be helpful :)

To be clear none of these are blocking landing. We don't have to solve all these today. I just want to make sure we are set up for being able to solve these in the future (which might influence this PR)

1. Type parameters don't normally take references or "tuples"

At the bytecode level, references cannot instantiate type parameters. And at the bytecode level, these "tuples" don't even exist. So in the source language, every type parameter has to be instantiated with a single non-reference type. This will be unnecessarily restrictive, and kind of annoying for some of these macros. For example I imagine it would be nice for Vector::reduce to have the accumulator be a "tuple", e.g.

let v: vector<u64> = ...;
let (num_even, num_odd) = Vector::reduce(
    v, 
    (0, 0), 
    |x, (even, odd)| if (x % 2 == 0) even = even + 1 else odd = odd + 1,
});

But this gets a bit tricky if we just remove the single type constraint from macro type parameters in that we need to be sure it is still applied in the appropriate spots (which I think it should happen automatically, even in cases like macro fun foo<T>(v: vector<T>, f: ||T) and foo!<(u64, bool)>(...))

My bigger concern really is, do you think this will be confusing to programmers? (That the rules for type parameters are different for macro vs non-macro funs)

I think this will be ok if our error messages are good enough.

I wonder whether we already considered a different (not uncommon) approach to the tuple problem. Let them be just some up to a given N predefined value structs (say N=12). The type and pattern notation is then just syntactic sugar. The return value of a function, if it is a tuple, may then have some optimized compilation to exploit direct VM support.

2. Substitution and shadowing

There is a whole can of worms once we try to expand the macro. The two that I'm most concerned about are A. shadowing macro parameters B. "captured" local variables in lambdas

For (A), I see two possibilities, we could either have a check that says "You cannot shadow or assign macro fun parameters" or we could put $ in front of macro parameters so they are in a different name space from other variables

For (B), I think we should probably just not allow capturing of locals in these lambdas in the short term? I think it will just make the macro expansion a lot cleaner and we can always come back to later

I actually think this is not that problematic, because after all, we do not have Rust macros but just regular higher-order functions, only that we expand them on bytecode level.

A. Where the lambda body is substituted, all locals in the body need to be renamed away from all the locals in the context. On Move level, regular static scoping will apply for the lambda.

B. Capturing is fine, because the lambda is only substituted downwards of the invocation point of m!(|x|e), inheriting the context of exactly this expression. The lambda always appears as a parameter. One cannot say let f = |x|e; m!(f). Thus basically, capturing comes for free via substitution (A).

There is a problem C: Borrow errors if a captured value is consumed in a loop defined by the macro. Perhaps at some point we need to have better support for errors stemming from macro body expansion.

3. Receiver syntax inside lambdas

I think this is a non-issue as long as we restrict macro lambda usage heavily (which I think we have to anyway), but I'll write it up anyway.

Normally, when type checking the body of lambdas you have that you don't really want to type check the body until the lambda is "used". This is so you can be more permissive in not requiring annotations. For example,

I think for fun macro we should assume they are fully independently and eagerly type checked. Basically just higher order functions which we happen to expand.

We may want to add other kind of macros down the road where this is not the case.

let v: vector<MyStruct> = ...;
let f = |x| x.foo();
some_cool_macro!(x, f)

If you type check |x| x.foo() when it is first seen, you have to error since you don't know what x is. But if you wait until after it is "used" (like in the call some_cool_macro), you can give a better programming experience.

I think this shouldn't be an issue because we likely won't support anything other than using the lambda in the macro call, e.g. let v: vector<MyStruct> = ...; some_cool_macro!(x, f)

4. Receiver syntax inside macro funs

Where did we land on type checking the body of macro funs? I remember discussing this particular issue but don't remember where we landed.

Fully type checked, currently.

It could be annoying if we don't (since you might see an error over and over again like you do in Rust macro expansions) But conversely it would be more expressive if we don't! That way you could write macros such as

macro fun call_foo<T>!(v: &vector<T>) {
    Vector::foreach!(v, |x| x.foo())
}

As I write this out, I think you will want to expand the macro during type checking, which will I think mean you cannot type check the macro fun before it is used everywhere.

But let me know your thoughts

5. Errors in macro expansion

I think a managable concern but just something I thought of:

I think we want to make sure we add a note or label of some sort to any diagnostic that happens inside of the code of an expanded macro. We could maybe accomplish this by adding a node UnannotatedExp_::ExandedMacro(Box<Exp>, /* inside of */ (ModuleIdent, FunctionName)) Or something similar, but definitely doesn't have to be done right now :)

Absolutely!

wrwg avatar Apr 18 '22 03:04 wrwg

@tnowacki @sblackshear @meng-xu-cs @rahxephon89 @junkil-park @runtian-zhou

Folks, this resurrects the macro PR from April of this year. Macros are really, really important for the Move prover, so I'd like to push this forward before the end of the year.

I think most of the design we discussed back in April is still valid. I would probably suggest to discuss whether we really call those guys macros and instead preserve this name for something more on a lexical level. I personally still like the name 'template' for what is implemented here.

wrwg avatar Nov 23 '22 04:11 wrwg

PS. Also want to add that after discussion also with @runtian-zhou I do not see an overlap with the interface proposal #449. Interfaces serve a different equally valid purpose. Macros + Interfaces together make traits unnecessary.

wrwg avatar Nov 23 '22 04:11 wrwg