code.pyret.org Problems with guessing error locations

Arity mismatches and a few other errors use a heuristic of "the top user-defined stack frame" to guess the location to report. If the stack doesn't have any frames (for whatever reason), or if the stack consists only of built-in frames, undefineds can result and break other parts of the rendering. This needs a more principled solution for what to show.

Sep 01 '14 22:09 jpolitz

I never much like the heuristic approach, since I could never remember when it picked the caller's location.

So... What if we added a new kind of srcloc called "caller-loc", which can be used when throwing errors and which would inform the stack renderer to pop the topmost frame when showing the stack?

What else besides arity mismatch still needs to report the caller's frame? On Sep 1, 2014 6:27 PM, "Joe Politz" [email protected] wrote:

Arity mismatches and a few other errors use a heuristic of "the top user-defined stack frame" to guess the location to report. If the stack doesn't have any frames (for whatever reason), or if the stack consists only of built-in frames, undefineds can result and break other parts of the rendering. This needs a more principled solution for what to show.

— Reply to this email directly or view it on GitHub https://github.com/brownplt/code.pyret.org/issues/20.

Sep 01 '14 22:09 blerner

I don't understand what a caller-loc is and where it gets stored, etc. What does it mean to use it when throwing errors?

I'm not really concerned with the stack trace showing an extra frame or not. I'm mostly concerned with thinking through the cases of what the caller could be (can it be nothing? can it be built-in code and what should we show in that case?)

Other than arity mismatch, things like array operations and reference initialization also don't take explicit source locations themselves, but (usually) have their caller as the most relevant code to blame before going into internals of the runtime.

Sep 01 '14 22:09 jpolitz

Error messages contain a source location that says where they occurred. The existing heuristic says "ignore that location, the fault is one frame lower down". I'm suggesting getting rid of the heuristic, and having the errors be raised with a source location field of "caller-loc", which will inform the renderer when to ducats a frame, target than use a heuristic. It means the error was raised from central, reusable code (the calle, or a runtime function, etc) and doesn't know where it was called from, but that the error should be blamed on its caller.

If there are no useful frames on the stack, I think all we can show is either "" or "". On Sep 1, 2014 6:37 PM, "Joe Politz" [email protected] wrote:

I don't understand what a caller-loc is and where it gets stored, etc. What does it mean to use it when throwing errors?

I'm not really concerned with the stack trace showing an extra frame or not. I'm mostly concerned with thinking through the cases of what the caller could be (can it be nothing? can it be built-in code and what should we show in that case?)

Other than arity mismatch, things like array operations and reference initialization also don't take explicit source locations themselves, but (usually) have their caller as the most relevant code to blame before going into internals of the runtime.

— Reply to this email directly or view it on GitHub https://github.com/brownplt/code.pyret.org/issues/20#issuecomment-54095271 .

Sep 01 '14 22:09 blerner

Error messages contain a source location that says where they occurred.

Errors don't always contain a location (or the ideal location). arity-mismatch only knows the location of the callee, and ref-init and invalid-array-index don't have a location at all, because the contexts that throw the error don't have a location available.

The location in the error message is never ignored that I know of (at least I didn't write code intending to do that!). Some errors just don't come with that information pre-attached for a number of reasons --- usually it's because function calls aren't compiled passing the location in, so errors that happen inside the runtime via function calls don't have a location passed in the way, say, getField does.

Sep 01 '14 22:09 jpolitz

OK... So then having caller-loc would/should allow all errors to have a location field, even if that field means "I dunno, who was my caller? It's their fault!" The fact that I keep getting confused by how our errors report their locations (by themselves out via heuristics) is frustrating to me, and I wish our design were simper to keep straight.

When do errors contain a "location that isn't ideal"? And what do we currently do about it? If it's the same "blame my caller" idea, then this new kind of srcloc would help there too.

(Note: if it wasn't clear, I'm using "caller" to mean "the first ancestor caller frame with a non-"builtin" srcloc, the same convention we currently use to skip runtime functions that are never with reporting.) On Sep 1, 2014 6:49 PM, "Joe Politz" [email protected] wrote:

Error messages contain a source location that says where they occurred.

Errors don't always contain a location (or the ideal location). arity-mismatch only knows the location of the callee, and ref-init and invalid-array-index don't have a location at all, because the contexts that throw the error don't have a location available.

The location in the error message is never ignored that I know of (at least I didn't write code intending to do that!). Some errors just don't come with that information pre-attached for a number of reasons --- usually it's because function calls aren't compiled passing the location in, so errors that happen inside the runtime via function calls don't have a location passed in the way, say, getField does.

— Reply to this email directly or view it on GitHub https://github.com/brownplt/code.pyret.org/issues/20#issuecomment-54095651 .

Sep 01 '14 23:09 blerner

When do errors contain a "location that isn't ideal"?

arity mismatch contains the location of the callee and not the caller. That's not ideal; ideal would be both.

Note: if it wasn't clear, I'm using "caller" to mean "the first ancestor caller frame with a non-"builtin" srcloc, the same convention we currently use to skip runtime functions that are never with reporting.

This convention is the entirety of what I intended this issue to be about. This definition of "caller" has been hard to keep straight and led to bugs in rendering when stacks contain no frames that fit this definition, and when the right caller was actually the second ancestor frame, which was true for a while. That's currently causing some arity mismatch errors to get swallowed.

If I understand right, caller-loc is a value that means "do the heuristic we're currently doing," and I think that will still be hard to maintain.

I don't think there's a particularly easy fix for this -- the runtime/compilation strategy ends up playing into what locations we have available and are easy to report, unless we change the calling convention to include the caller's location.

I think the most helpful thing would be to get the renderer for errors to be more easily testable, so we can know when it breaks because we made a compiler change. Right now the rendering is unfortunately stuck in a bunch of hard-to-test JS, and when something changes it's really hard to know if any renderings broke.

Sep 01 '14 23:09 jpolitz

Are there heuristics other than figuring out which location the caller is at?

Sep 01 '14 23:09 jpolitz

I'm trying to distinguish "when should the discard-some-frames heuristic trigger?" from "what frames should be discarded?". Currently, those are intertwined in the heuristic code. I'm suggesting the caller-loc value to be used to answer the when? question only.

We could enhance that value with a number specifying how many frames to discard. This moves the heuristic code to each error-raising location, which seems better to me since it's more obvious in the code surrounding each error-raiser what behavior we want, and we can change then without affecting others.

For arity errors specifically, why don't we just include two locations-: arity-mismatch(callee-loc, blame - caller-loc)? Would that help? On Sep 1, 2014 7:39 PM, "Joe Politz" [email protected] wrote:

When do errors contain a "location that isn't ideal"?

arity mismatch contains the location of the callee and not the caller. That's not ideal; ideal would be both.

Note: if it wasn't clear, I'm using "caller" to mean "the first ancestor caller frame with a non-"builtin" srcloc, the same convention we currently use to skip runtime functions that are never with reporting.

This convention is the entirety of what I intended this issue to be about. This definition of "caller" has been hard to keep straight and led to bugs in rendering when stacks contain no frames that fit this definition, and when the right caller was actually the second ancestor frame, which was true for a while. That's currently causing some arity mismatch errors to get swallowed.

If I understand right, caller-loc is a value that means "do the heuristic we're currently doing," and I think that will still be hard to maintain.

I don't think there's a particularly easy fix for this -- the runtime/compilation strategy ends up playing into what locations we have available and are easy to report, unless we change the calling convention to include the caller's location.

I think the most helpful thing would be to get the renderer for errors to be more easily testable, so we can know when it breaks because we made a compiler change. Right now the rendering is unfortunately stuck in a bunch of hard-to-test JS, and when something changes it's really hard to know if any renderings broke.

— Reply to this email directly or view it on GitHub https://github.com/brownplt/code.pyret.org/issues/20#issuecomment-54097285 .

Sep 01 '14 23:09 blerner

What I think would help most is to write a function, possibly in Pyret, that has the signature:

fun error-to-rendered-error(error :: RuntimeError, stack :: List<Srcloc>) -> ErrorMessageDoc:
  ...
end

Where ErrorMessageDoc is a data structure that the front-end knows how to render without guessing, doing introspection on the exception, etc. This function can be tested with empty stacks, stacks full of builtins, long stacks, short stacks, etc, in ways that we can't do easily with the current code because it is so JS/DOM heavy.

Even if we write it in JS, we should try to follow that signature so it's easier to test just what we're producing for rendering. Doing it in Pyret has the advantage that we could use it in console output as well.

We could do this with compiler and type errors, too.

Then error-ui.js would become the 50 or so lines required to call this correctly, highlight srclocs (or link srclocs to other frames/give builtin warnings), etc, and the whole thing will get tested with Pyret's testing.

caller-loc and some related helpers would probably make this function easier to implement. Mainly I think that the lack of something like this function as an intermediary between the error structures + stack and the actual DOM rendering is a long-term issue.

Sep 02 '14 00:09 jpolitz

I could get behind the design for a function like that, sure. Then my proposal simply becomes a recommendation that we try to push as much of the control flow of that function into the data fields of the error values themselves, so that function can be simplified where possible. On Sep 1, 2014 8:10 PM, "Joe Politz" [email protected] wrote:

What I think would help most is to write a function, possibly in Pyret, that has the signature:

fun error-to-rendered-error(error :: RuntimeError, stack :: List<Srcloc>) -> ErrorMessageDoc: ... end

Where ErrorMessageDoc is a data structure that the front-end knows how to render without guessing, doing introspection on the exception, etc. This function can be tested with empty stacks, stacks full of builtins, long stacks, short stacks, etc, in ways that we can't do easily with the current code because it is so JS/DOM heavy.

Even if we write it in JS, we should try to follow that signature so it's easier to test just what we're producing for rendering. Doing it in Pyret has the advantage that we could use it in console output as well.

We could do this with compiler and type errors, too.

Then error-ui.js would become the 50 or so lines required to call this correctly, highlight srclocs (or link srclocs to other frames/give builtin warnings), etc, and the whole thing will get tested with Pyret's testing.

caller-loc and some related helpers would probably make this function easier to implement. Mainly I think that the lack of something like this function as an intermediary between the error structures + stack and the actual DOM rendering is a long-term issue.

— Reply to this email directly or view it on GitHub https://github.com/brownplt/code.pyret.org/issues/20#issuecomment-54098188 .

Sep 02 '14 02:09 blerner

@jpolitz , @jswrenn : how much of this bug is still relevant?

May 11 '16 21:05 blerner

@blerner is this still an issue?

Mar 17 '20 02:03 schanzer

If this is still an issue it'll crop up as a fresh one. Closing.

Jan 11 '24 17:01 shriram