This PR adds a proper AST that the parser constructs before emitting actual Thorin code.

While this adds some boilerplate, it also removes a lot of weird hacks/workarounds in the parser: we can now do one thing at a time and no longer have to do everything from left to right. Instead, we can traverse the AST as many times as we want and in any order we want. Finally, in a future PR I want to add a proper Thorin -> Thorin-AST decompilation for proper output.

Right now, the pipeline looks like this:

lex/parse to AST
bind: associates ast::IdExpr with its ast::Decl and does some other minor semantic checks
emit: AST -> Thorin graph
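
To make the three stages concrete, here is a toy model in Python. It is not the actual C++ implementation: the `Module`, `Decl`, and `IdExpr` classes, the `parse`/`bind`/`emit` functions, and the tiny `name = int; ...; name` grammar are all made up for illustration. The only point is that `bind` runs as a separate pass that links each identifier to its declaration before `emit` touches anything.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decl:
    name: str
    value: int


@dataclass
class IdExpr:
    name: str
    decl: Optional[Decl] = None  # filled in by bind()


@dataclass
class Module:
    decls: List[Decl]
    body: IdExpr


def parse(src: str) -> Module:
    """Stage 1: text -> AST. Toy grammar: `name = int; ...; name`."""
    *decl_srcs, body = [part.strip() for part in src.split(";")]
    decls = []
    for decl_src in decl_srcs:
        name, value = decl_src.split("=")
        decls.append(Decl(name.strip(), int(value)))
    return Module(decls, IdExpr(body))


def bind(module: Module) -> List[str]:
    """Stage 2: associate the IdExpr with its Decl; return error messages."""
    scope = {decl.name: decl for decl in module.decls}
    module.body.decl = scope.get(module.body.name)
    if module.body.decl is None:
        return [f"unknown identifier '{module.body.name}'"]
    return []


def emit(module: Module) -> int:
    """Stage 3: AST -> result (a plain int stands in for the Thorin graph)."""
    assert module.body.decl is not None, "run bind() first"
    return module.body.decl.value


module = parse("x = 1; y = 2; y")
errors = bind(module)
result = emit(module)
```

Because the stages are separate functions over the same AST, each one can be rerun or reordered independently of the parser.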

Changes in the Frontend

Parentheses for filters are not mandatory anymore: fn (a: A)@(.tt): B -> fn (a: A)@.tt: B
Removed ! for specifying filters. There were only a handful of !s in our code base that we actually needed. Changing those few instances to @.tt isn't a big deal and having two ways for specifying filters just adds confusion and code complexity on our side.
Declaration expression: d* e. This makes block expressions superfluous, as you can simply use a (possibly parenthesized) declaration expression instead. For this reason, the parser emits a warning that block expressions are deprecated. On top of that, declaration expressions are useful in other places as well, like this:
```
 .fun .extern f(T: *,
                U: .let F = %foo.F T; F,
                V: F
              )(W: F): F = /*...*/;
```

Where expression: e .where d* .end. This is a reverse declaration expression: the declarations are bound first (in reverse order), then the expression is bound. Example:

```
.fun .extern main(mem: %mem.M, argc: .I32, argv: %mem.Ptr0 (%mem.Ptr0 .I8)): [%mem.M, .I32] =
  loop (mem, 0I32, 0I32)
  .where
      .con loop(mem: %mem.M, i: .I32, acc: .I32) =
          .let cond = %core.icmp.ul (i, argc);
          (exit, body)#cond mem
          .where
              .con body m: %mem.M =
                  .let inc  = %core.wrap.add 0 (1I32, i);
                  .let `acc = %core.wrap.add 0 (i, acc);
                  loop (m, inc, acc);
              .con exit(m: %mem.M) = return (m, acc);
          .end
  .end
```
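
The binding order of the two forms can be modeled roughly as follows. This is a Python toy, not the real binder: scopes are plain dicts, the function names are hypothetical, and "in reverse order" is taken literally, since the description does not spell out more than that.

```python
def bind_decl_expr(decls, uses, scope):
    """`d* e`: bind the declarations front to back, then the expression."""
    inner, order = dict(scope), []  # copy: nothing leaks into the outer scope
    for name, value in decls:
        inner[name] = value
        order.append(name)
    # The expression is resolved last, with all declarations visible.
    return order, [inner[name] for name in uses]


def bind_where_expr(uses, decls, scope):
    """`e .where d* .end`: bind the declarations back to front, then the expression."""
    return bind_decl_expr(list(reversed(decls)), uses, scope)


outer = {"mem": "mem-param"}
decls = [("body", "body-con"), ("exit", "exit-con")]
uses = ["exit", "body", "mem"]

order_decl, resolved_decl = bind_decl_expr(decls, uses, outer)
order_where, resolved_where = bind_where_expr(uses, decls, outer)
```

In both forms the expression sees every declaration plus the enclosing scope; only the order in which the declarations are visited differs.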

.Pi F: T = Π /*...*/ -> .rec F: T = Π /*...*/
.Sigma S: T = [/*...*/] -> .rec S: T = [/*...*/]
In addition, .rec f = .lm /*...*/ is equivalent to .lam f /*...*/. The same holds for
- .cn/.con
- .fn/.fun

.and clause for mutual recursion:

```
.lam is_even(i: .Nat): .Bool = (is_odd (%core.nat.sub (i, 1)), .tt)#(%core.ncmp.e (i, 0))
.and
.lam is_odd (i: .Nat): .Bool = (is_even(%core.nat.sub (i, 1)), .ff)#(%core.ncmp.e (i, 0));
```

Or for mutually recursive types:

```
.rec T = ... U ...;
.and U = ... T ...;
```
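
One way to see why a group joined by .and can replace forward declarations: a binder can first register every name in the group and only then look at the bodies. The Python toy below contrasts that with strictly sequential binding. Whether Thorin's binder is structured in exactly these two passes is an assumption; `bind_sequential` and `bind_group` are hypothetical names, and each declaration is reduced to a name plus the identifiers its body uses.

```python
def bind_sequential(decls, scope):
    """Each body only sees names declared before it (plus its own name)."""
    known, errors = set(scope), []
    for name, uses in decls:
        known.add(name)
        errors += [f"{name}: unknown identifier '{u}'" for u in uses if u not in known]
    return errors


def bind_group(decls, scope):
    """Pass 1 registers all names of the group; pass 2 checks the bodies."""
    known = set(scope) | {name for name, _ in decls}
    return [
        f"{name}: unknown identifier '{u}'"
        for name, uses in decls
        for u in uses
        if u not in known
    ]


# is_even's body mentions is_odd and vice versa.
group = [("is_even", ["is_odd"]), ("is_odd", ["is_even"])]

sequential_errors = bind_sequential(group, set())
group_errors = bind_group(group, set())
```

Sequential binding rejects the forward use of `is_odd`, while the group form accepts both bodies.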

No more forward decls for .lm/.lam, .cn/.con, .fn/.fun or types. Use .let, .where, .rec, .and instead.

For declaring C-Functions that you want to link later on use new .ccon/.cfun decl:

```
.cfun print_str[%mem.M, %mem.Ptr0 «⊤:.Nat; .I8»]: %mem.M;
```

Or more low-level:

```
.ccon print_str[[%mem.M, %mem.Ptr0 «⊤:.Nat; .I8»], .Cn %mem.M];
```

New output: ./thorin --output-ast - directly emits the AST representation. There is still work to do here, but I will address this in future PRs.

Other Changes

thorin::fe -> thorin::ast
improved error messages
- colored output
- compilation proceeds on lexing, parsing, and bind (see above) errors
more streamlined way to set Loc of thorin Defs in emit
annex/bootstrapping magic moved to AST infrastructure
bug fixes + other minor refactoring here and there
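
As a rough picture of "compilation proceeds on errors": each front-end stage records its errors in a shared collector and returns normally, so later stages still run and can report more problems. The Python toy below uses hypothetical names (`Diagnostics`, `compile_source`) and made-up locations and messages. Skipping emit once any error was recorded is an assumption of this sketch, not something stated above.

```python
class Diagnostics:
    """Collects errors instead of aborting at the first one."""

    def __init__(self):
        self.errors = []

    def error(self, loc, msg):
        self.errors.append(f"{loc}: error: {msg}")


def compile_source(stages, emit, diag):
    # Run every front-end stage, even if an earlier one reported errors.
    for stage in stages:
        stage(diag)
    # Assumption of this sketch: only emit when the front end was clean.
    if diag.errors:
        return None
    return emit()


def lex(diag):
    diag.error("demo:1:3", "invalid character")


def parse(diag):
    pass  # nothing to report in this toy run


def bind(diag):
    diag.error("demo:2:1", "unknown identifier 'x'")


diag = Diagnostics()
result = compile_source([lex, parse, bind], lambda: "graph", diag)
```

After this run, `diag.errors` holds both the lexing and the binding error, so the user sees them in one go.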

Mar 30 '24 15:03 leissa

@NeuralCoder3, @fodinabor: bump

May 07 '24 18:05 leissa

For declaring C-Functions that you want to link later on use new .ccon/.cfun decl:

Why is this necessary? If we no longer have/need forward declarations, why is not every function without a body an external declaration?

May 15 '24 14:05 NeuralCoder3

For declaring C-Functions that you want to link later on use new .ccon/.cfun decl:

Why is this necessary? If we no longer have/need forward declarations, why is not every function without a body an external declaration?

This way, we can better restrict and control what an external declaration is. E.g., we can't have a curried function or a function taking type variables as arguments as an external declaration. Note that the sole purpose of .ccon/.cfun is to establish an FFI to C/LLVM. I agree that this is a bit ugly at the moment, but right now it makes the front end simpler, and we can still migrate to a fancier solution once we have figured out how exactly we want to design our module system.

May 15 '24 20:05 leissa
