thorin2
Ast
This PR adds a proper AST that the parser constructs before emitting actual Thorin code.
While this adds some boilerplate, it also removes a lot of weird hacks/workarounds in the parser: we can now do one thing at a time and no longer have to do everything from left to right. We can traverse the AST as many times as we want and in any order we want. Finally, in a future PR I want to add a proper Thorin -> Thorin-AST decompilation for proper output.
Right now, the pipeline looks like this:
- lex/parse to AST
- bind: associates ast::IdExpr with its ast::Decl and does some other minor semantic checks
- emit: AST -> Thorin graph
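As a rough illustration of the three passes above, here is a minimal Python sketch. It is not the actual C++ implementation: the toy `let` syntax, the classes `Decl`, `IdExpr`, `Module`, and the functions `parse`, `bind`, `emit`, `compile_source` are all hypothetical stand-ins, and `emit` returns a plain integer instead of a Thorin graph. The point is only that each pass does one thing and that `bind` can record an error instead of aborting.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decl:
    """Toy stand-in for ast::Decl: a named constant."""
    name: str
    value: int


@dataclass
class IdExpr:
    """Toy stand-in for ast::IdExpr; `decl` stays None until bind runs."""
    name: str
    decl: Optional[Decl] = None


@dataclass
class Module:
    decls: List[Decl]
    body: IdExpr


def parse(src: str) -> Module:
    """Pass 1: build the AST only. Accepts e.g. 'let x = 1; let y = 2; y'."""
    *decl_srcs, body_src = [part.strip() for part in src.split(";")]
    decls = []
    for decl_src in decl_srcs:
        _let, name, _eq, value = decl_src.split()
        decls.append(Decl(name, int(value)))
    return Module(decls, IdExpr(body_src))


def bind(mod: Module, errors: List[str]) -> None:
    """Pass 2: attach the identifier to its declaration, recording errors."""
    scope = {decl.name: decl for decl in mod.decls}
    mod.body.decl = scope.get(mod.body.name)
    if mod.body.decl is None:
        errors.append(f"'{mod.body.name}' not found")


def emit(mod: Module) -> int:
    """Pass 3: produce the output (here just an int, not a Thorin graph)."""
    assert mod.body.decl is not None, "emit requires a successfully bound AST"
    return mod.body.decl.value


def compile_source(src: str):
    """Run all passes; skip emit if binding reported errors."""
    errors: List[str] = []
    mod = parse(src)
    bind(mod, errors)
    result = emit(mod) if not errors else None
    return result, errors
```

Because the AST exists as a separate data structure, further passes (for example an AST printer) could be slotted in between `parse` and `emit` without touching the parser.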
Changes in the Frontend
- Parentheses for filters are not mandatory anymore:
  fn (a: A)@(.tt): B -> fn (a: A)@.tt: B
- Removed ! for specifying filters. There were only a handful of !s in our code base that we actually needed. Changing those few instances to @.tt isn't a big deal, and having two ways of specifying filters just adds confusion and code complexity on our side.
- Declaration expression: d* e
  This makes block expressions superfluous as you can simply use a (possibly parenthesized) declaration expression instead. For this reason, the parser emits a warning that block expressions are deprecated. On top of that, you can use declaration expressions in other useful places like this:
      .fun .extern f(T: *, U: .let F = %foo.F T; F, V: F )(W: F): F = /*...*/;
- Where expression: e .where d* .end
  This is a reverse declaration expression: the declarations are bound first (in reverse order), then the expression is bound. Example:
      .fun .extern main(mem: %mem.M, argc: .I32, argv: %mem.Ptr0 (%mem.Ptr0 .I8)): [%mem.M, .I32] =
          loop (mem, 0I32, 0I32)
          .where
              .con loop(mem: %mem.M, i: .I32, acc: .I32) =
                  .let cond = %core.icmp.ul (i, argc);
                  (exit, body)#cond mem
                  .where
                      .con body m: %mem.M =
                          .let inc = %core.wrap.add 0 (1I32, i);
                          .let `acc = %core.wrap.add 0 (i, acc);
                          loop (m, inc, acc);
                      .con exit(m: %mem.M) = return (m, acc);
                  .end
          .end
- .Pi F: T = Π /*...*/ -> .rec F: T = Π /*...*/
- .Sigma S: T = [/*...*/] -> .rec S: T = [/*...*/]
- In addition: .rec f = .lm /*...*/ is the same as .lam f /*...*/. Same for:
  - .cn / .con
  - .fn / .fun
- .and clause for mutual recursion:
      .lam is_even(i: .Nat): .Bool = (is_odd (%core.nat.sub (i, 1)), .tt)#(%core.ncmp.e (i, 0))
      .and .lam is_odd (i: .Nat): .Bool = (is_even(%core.nat.sub (i, 1)), .ff)#(%core.ncmp.e (i, 0));
  Or for mutually recursive types:
      .rec T = ... U ...;
      .and U = ... T ...;
- No more forward decls for .lm/.lam, .cn/.con, .fn/.fun or types. Use .let, .where, .rec, .and instead.
- For declaring C functions that you want to link later on, use the new .ccon/.cfun decl:
      .cfun print_str[%mem.M, %mem.Ptr0 «⊤:.Nat; .I8»]: %mem.M;
  Or more low-level:
      .ccon print_str[[%mem.M, %mem.Ptr0 «⊤:.Nat; .I8»], .Cn %mem.M];
- New output:
      ./thorin --output-ast -
  This directly emits the AST representation. There is still work to do here, but I will address this in future PRs.
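To illustrate why a construct like the .and clause can replace forward declarations, here is a minimal Python sketch of two-phase binding for a group of mutually recursive declarations. This shows the general technique only and is an assumption, not a description of the actual binder in thorin::ast. `LamDecl`, `bind_sequential`, and `bind_and_group` are hypothetical names, and the rule that a sequentially bound body sees only earlier declarations plus itself is made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class LamDecl:
    """Toy declaration: a name plus the names its body refers to."""
    name: str
    uses: List[str]
    resolved: Dict[str, "LamDecl"] = field(default_factory=dict)


def _bind_body(decl: LamDecl, scope: Dict[str, LamDecl], errors: List[str]) -> None:
    """Resolve every name used in the body against the given scope."""
    for name in decl.uses:
        target = scope.get(name)
        if target is None:
            errors.append(f"{decl.name}: '{name}' not found")
        else:
            decl.resolved[name] = target


def bind_sequential(decls: List[LamDecl], scope: Dict[str, LamDecl]) -> List[str]:
    """Hypothetical one-pass rule: a body sees earlier declarations and itself."""
    errors: List[str] = []
    scope = dict(scope)
    for decl in decls:
        scope[decl.name] = decl
        _bind_body(decl, scope, errors)
    return errors


def bind_and_group(decls: List[LamDecl], scope: Dict[str, LamDecl]) -> List[str]:
    """Two phases: announce all names of the group, then bind all bodies."""
    errors: List[str] = []
    scope = dict(scope)
    for decl in decls:  # phase 1: every name in the group becomes visible
        scope[decl.name] = decl
    for decl in decls:  # phase 2: bodies may now refer to any group member
        _bind_body(decl, scope, errors)
    return errors
```

With the is_even/is_odd pair from the example, the one-pass variant reports that is_odd is unknown while binding is_even, whereas the grouped variant resolves both references. Having the whole AST available before binding is what makes the second strategy straightforward.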
Other Changes
- thorin::fe -> thorin::ast
- improved error messages
- colored output
- compilation proceeds on lexing, parsing, and bind (see above) errors
- more streamlined way to set the Loc of Thorin Defs in emit
- annex/bootstrapping magic moved to AST infrastructure
- bug fixes + other minor refactoring here and there
@NeuralCoder3, @fodinabor: bump
> For declaring C-Functions that you want to link later on use new .ccon/.cfun decl:
Why is this necessary? If we no longer have/need forward declarations, why is not every function without a body an external declaration?
This way, we can better restrict and control what an external declaration is. E.g., we can't have a curried function or a function taking type variables as arguments as an external declaration. Note that the sole purpose of .ccon/.cfun is to establish an FFI to C/LLVM. I agree that this is a bit ugly at the moment, but right now it makes the front end simpler, and we can still migrate to a fancier solution once we have figured out how exactly we want to design our module system.