mgmt icon indicating copy to clipboard operation
mgmt copied to clipboard

line numbers in the AST re #732

Open ffrank opened this issue 1 year ago • 20 comments

I wrote a suggestion of how to store the mcl code coordinates of notable AST nodes in the parser. (It's not done for each last node.)

The approach is possibly flawed; the amount of copy/paste makes me sad.

This was still the easier part though. Next I was trying to figure out how to generate an error message during type unification, that indicates the == operator in the following snippet:

$a = 1
$b = "one"

if $a == $b {
        $c = 3
}

But the unification approach that collects all invariants, and then loops over them, makes this very hard, maybe impossible.

Will we need to keep references to the AST nodes in the invariants? So that we can trace where problematic invariants originate?

ffrank avatar Mar 23 '24 19:03 ffrank

the amount of copy/paste makes me sad.

I expect some copy-paste, that's to be expected.

during type unification

Don't worry about this at the moment. I have some structural changes which are coming and I don't know if this will break anything that happens there yet, so TBD.

It's not done for each last node.)

Just a POC is fine!

purpleidea avatar Mar 23 '24 19:03 purpleidea

PS: Note there is also the %error stuff in parser-- that works well enough. Line numbers in the code is the goal =D

purpleidea avatar Mar 23 '24 19:03 purpleidea

In the latest commit 913c100 the code essentially works. It will store start and end positions in nodes types that have received the TextArea embed.

This does not survive interpolation though, unless Interpolate is adapted as demonstrated for StmtBind and StmtIf here.

I even have an idea for a type unifier that could allow us to drill down to the offensive expressions.

EDIT: after some shy attempts to implement a prototype, I want to say this idea did not pan out. Refactored unification sounds much more promising 👇🏻

ffrank avatar Mar 24 '24 21:03 ffrank

FYI: I'm currently refactoring the type unification stuff, so after that's done it will be great to revisit this and get it merged in some form. Sneak peak: The new unification has a pointer to the Expr in question so as long as the Expr has the ROW/COL information, then we'll be able to show that without doing any additional plumbing.

purpleidea avatar Apr 01 '24 17:04 purpleidea

I'm not moving this in any direction until that change lands then.

ffrank avatar Apr 13 '24 14:04 ffrank

Yeah, sorry for the delay. The start of that is here: https://github.com/purpleidea/mgmt/tree/feat/unification in case you're knowledgeable in this area and want to help. I'm kind of a terrible algorithmist and a few things are up in the air, but if you or anyone is an expert in this area, please chime in =D

purpleidea avatar Apr 13 '24 14:04 purpleidea

Boy has time flown by! I guess you know all the things I was busy with. In any case, unification has merged, so if you'd like to rebase this, that would be awesome!

It might also be very easy for you to plug things into unification error messages too (or I can help with that)

LMK thanks!

purpleidea avatar Jul 01 '24 23:07 purpleidea

Hey, nice work on the resolver, feels good so far.

I can build this branch and ./mgmt run lang error.mcl where error.mcl has

# this shall fail
$value = 1 + "a"
test "t1" { anotherstr => $value, }

The error message will be this:

cli parse error: could not unify types: unify error with: topLevel(func() { <built-in:_operator> }): type error: int != str

The trouble is that the error is reported at the definition of the + operator rather than the invocation.

Note that I did change the String function for FuncCalls to give the location. But it is not used at the moment. I have a commit in here that makes mgmt dump a string representation of the AST and it includes

call:_operator(str("+"), int(1), str("a")) @ (1 1)

(Never mind that this position is also wrong, but right now it seems more important that the resolver can report it at all.)

ffrank avatar Sep 08 '24 20:09 ffrank

(Wrong text position now also fixed)

ffrank avatar Sep 08 '24 20:09 ffrank

resolver

What's a resolver?

The trouble is that the error is reported at the definition of the + operator rather than the invocation.

All type unification errors that are found during unification are reported at the same place... We can likely improve errors a bit though.

You asked me to comment and look at this here, I need a bit more info what you're asking, sorry!

purpleidea avatar Sep 09 '24 20:09 purpleidea

Okay trying to clear it up. I meant solver of course, not resolver.

The current issue is that we get this error:

topLevel(func() { <built-in:_operator> }): type error: int != str

That is correct, but more helpful to the user will be the information that the call of this operator is 1 + "a" in line 2 column 7.

In essence, I see a need to show a "call stack". Of course, this is at type checking time, so nothing is called (is it?)

Maybe what's rather needed is a few levels of AST parents. Rather than just saying "this unnamed internal operator was given bad operands", we should include that "its parent is this expression 1 + "a" in this bind statement $value = ...". (The immediate parent will be enough in this example, but not sure if more complex code will bring a need for more.)

ffrank avatar Sep 10 '24 17:09 ffrank

Okay trying to clear it up. I meant solver of course, not resolver.

Duh now! I don't know why my mind missed that, lol

That is correct, but more helpful to the user will be the information that the call of this operator is 1 + "a" in line 2 column 7.

Yeah I agree, we probably want to hide the "ExprTopLevel" and "ExprSingleton" expressions. To do so, you'd want to wrap this x.Expr call:

https://github.com/purpleidea/mgmt/blob/3d0660559e5cc076c1c5ab438bd5543368cd4583/lang/unification/fastsolver/fastsolver.go#L138

With trueCallee from here:

https://github.com/purpleidea/mgmt/blob/3d0660559e5cc076c1c5ab438bd5543368cd4583/lang/ast/util.go#L330

I expect this will cause an import cycle, so I'd have to ponder what the best move is... I'd like to avoid needing to add a new method on Expr to get the "TrueCallee" for everyone.

Does that help?

purpleidea avatar Sep 10 '24 18:09 purpleidea

Yes the cycle prevented doing it from the solver. Spot on.

I pushed a silly little hack to do this recursion from the GAPI scope, but it does not work. This is the output now:

cli parse error: could not unify types [[ in expression: func() { <built-in:_operator> } ]]: unify error with: topLevel(func() { <built-in:_operator> }): type error: int != str

So the result from trueCallee still does not show information.

This makes sense to me. As said, it's true that it's the + operator that ultimately is affected. But we don't care. We are interested in its AST parent(s).

ffrank avatar Sep 10 '24 21:09 ffrank

You want to unwrap the x.Expr part...

If you see this as your error:

topLevel(func() { built-in:_operator }): type error: int != str

Then TrueCallee(_) would return:

func() { built-in:_operator }: type error: int != str

If that's not what you want, I am lost. But I don't see how this error is not correct. What do you expect it to print?

purpleidea avatar Sep 10 '24 23:09 purpleidea

I may be barking up the wrong tree, going on about the AST.

But the issue (in terms of AST nodes) is that this type error is reported on the ExprFunc (i.e. the declaration) of the + operator, which will never be helfpul. I need it reported on the ExprCall representing (in this case) 1 + "a".

Remember, ultimately we want to report the line and column number of the code that caused the unification issue.

ffrank avatar Sep 11 '24 10:09 ffrank

the issue (in terms of AST nodes) is that this type error is reported on the ExprFunc (i.e. the declaration) of the + operator, which will never be helfpul. I need it reported on the ExprCall representing

I agree. I don't know the correct way to achieve this at the moment, but I will dig into it if I can.

Remember, ultimately we want to report the line and column number of the code that caused the unification issue.

If it reports the definition site instead of the call site for now, that's okay =D

purpleidea avatar Sep 11 '24 16:09 purpleidea

If it reports the definition site instead of the call site for now, that's okay =D

I don't believe I agree, at least until I see a typing issue that is not reported on this kind of builtin element. Why have line numbers when we cannot report errors that use them?

I had the idea to include the pointer to the entire AST node in the error, rather than just the Expr within. Then, we could run a graph search through the AST, so that we can basically "zoom in" on the problematic expression(s).

The problem is that I cannot very easily run a search through the AST. That's why I created #773 in order to have some discussion around that. (It might be good to talk about this in person a little.)

ffrank avatar Sep 12 '24 12:09 ffrank

ping @purpleidea

ffrank avatar Sep 22 '24 21:09 ffrank

Just had a look. TBQH, I'm not the expert here, so I don't know how to proceed, sorry.

purpleidea avatar Sep 23 '24 17:09 purpleidea

Maybe if we did a pair programming we might be able to come up with some ideas? LMK

purpleidea avatar Sep 28 '24 15:09 purpleidea

I'm traveling atm but maybe we can make it happen in November.

ffrank avatar Oct 21 '24 12:10 ffrank

Sure LMK when ready.

purpleidea avatar Oct 21 '24 14:10 purpleidea

First complete PoC implementation.

$ ./mgmt run lang examples/lang-errors/simple_types.mcl 
This is: mgmt, version: 0.0.26-206-g1b0b92f2
Copyright (C) 2013-2024+ James Shubin and the project contributors
Written by James Shubin <[email protected]> and the project contributors
14:53:14 main: start: 1735221194038676643
14:53:14 cli: lang: lexing/parsing...
14:53:14 cli: lang: import: /home/felix/mgmt/examples/lang-errors/
14:53:14 cli: lang: running type unification...
14:53:14 cli: lang: possible type issue found at line 4 column 3
14:53:14 main: goodbye!
cli parse error: could not unify types: unify error with: topLevel(func() { <built-in:_operator> }): type error: str != int

(note the final log message before the goodbye)

This breaks some very brittle unit tests. I did not touch those yet, it might be worth refactoring them rather than doing the trivial fix.

One big missing piece is that we cannot easily determine file names (i.e., from which file was a given parser token read), so in a complex code base with multiple mcl files, the message will still be quite ambiguous.

ffrank avatar Dec 26 '24 13:12 ffrank

I will clean it up in the next few days. This was mainly a proof of concept implementation. Since you agree with the approach, I will rework it into passable code.

Thanks for the hints and pointers.

ffrank avatar Jan 05 '25 20:01 ffrank

Do you have any feedback on the issue of linking AST nodes to source file names?

The data structures that you pointed me to appear to be too unspecific. This is the best I can produce right now:

01:08:54 cli: lang: possible type issue found at line 4 column 30
01:08:54 cli: lang: it's in one of these files: [/home/felix/mgmt/examples/lang-errors/includable.mcl /home/felix/mgmt/examples/lang-errors/including.mcl]
01:08:54 main: goodbye!

I suspect we need to do more work in order to make this information available.

ffrank avatar Jan 19 '25 10:01 ffrank

Thanks Felix!

I did some WIP hacking (not finished) here:

https://github.com/purpleidea/mgmt/tree/feat/line-numbers

purpleidea avatar Feb 05 '25 14:02 purpleidea

Oh shit I just remembered why the locate calls were often at the bottom of the parser blocks as opposed to the top, where the posLast calls used to happen. I put some of them at the bottom in order to make sure that the code that creates an actual AST node has actually run before I try to save location data in the node. D'oh 🤦🏻‍♀️ Running the locate first thing will usually fail because the node interface connected to the parser symbol is still nil.

Moving all of them to the bottom.

ffrank avatar Feb 06 '25 12:02 ffrank

I have adopted most of your suggested changes, and feel mostly happy with the code.

Leaving this in Draft status while we still reason about this. Will squash when we are fully happy.

When I tried to add a test, I realized that this will work if the mcl code is in my local directory, but it does not work when running from a Deploy. So I am looking into this. (Also I believe there is an existing test that segfaults now, so that I will check as well.)

We are breaking the lexparse test (tests?) with a lot of errors like this:

        lexparse_test.go:2243: test #71: diff:
             {
              Body: [
               {
            +   TextArea: {
            +   },
                Ident: "fn",
                Value: {
            +    TextArea: {
            +    },
                 Args: [
                 ],
                 Body: {
            +     TextArea: {
            +     },
                  V: 42,
                 },
                },
               },
              ],
             }

How to deal with it?

ffrank avatar Feb 06 '25 18:02 ffrank

I realized that this will work if the mcl code is in my local directory, but it does not work when running from a Deploy

Are you asking how to test?

Look in lang/interpret_test/TestAstFunc2/

There are .txtar files and some of them have a magic "#err: XXX" prefix... You can put something there to match on the error.

We are breaking the lexparse test (tests?) with a lot of errors like this:

That's OK, leave them broken for now, we can fix them up when the patch is ready for merging.

purpleidea avatar Feb 06 '25 23:02 purpleidea

Getting there, some more small issues to discuss:

  • what to do about leading spaces in error messages
$ ./test/test-mclfmt.sh 
running ./test/test-mclfmt.sh
FAIL: The following mcl files are not properly formatted: lang/interpret_test/TestAstFunc2/fortytwoerror.txtar
  • i would like to keep an example with wild spacing (mixing tabs and spaces) to show that the error reporting deals with it correctly (it shows the error marker in the right vertical alignment)
$ ./test/test-misc.sh 
running ./test/test-misc.sh
examples/lang-errors/multiline.mcl:6: space before tab in indent.
+	  	 41 + 1
failed with: block (util.go:397) failed: line 14 is not reflowed properly

No clue how to deal with that issue, but also this function is not super helpful and not used now, so I'd be okay to just drop it.

ffrank avatar Feb 12 '25 23:02 ffrank