Debugging: How, what, why?

Open zesterer opened this issue 4 years ago • 12 comments

Chumsky currently supports a primitive debugging system, allowing parsers to print to stdout when entered during a call to Parser::parse_recovery_verbose. Expanding this further will require some thought.

  1. What problems should debugging attempt to solve?
     • Parsers that consume zero input and repeat
     • Paths erroneously taken
     • Priority errors (e.g. a.or(b) vs b.or(a))
  2. What information needs to be shown to the user?
     • Entered parsers
     • Number of iterations
     • Source location of parser
     • Recursion points
  3. How is it best to show this information?
     • Annotated tree?
  4. What API features should be supported?
     • Recursion limit to prevent stack overflows
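
To make the first problem in the list concrete, here is a minimal sketch of how a repetition combinator could detect an inner parser that succeeds without consuming input. It uses a toy function-based parser over `&str`; the names `repeated_checked`, `RepeatError`, `digit` and `optional_digit` are hypothetical and are not part of chumsky's API.

```rust
/// Error raised by the checked repetition below (hypothetical type).
#[derive(Debug, PartialEq)]
enum RepeatError {
    /// The inner parser succeeded without consuming input, so looping would never end.
    NoProgress { iteration: usize, offset: usize },
}

/// Apply `inner` until it fails, but bail out if an iteration consumes nothing.
/// Assumes the remainder returned by `inner` is always a suffix of its input.
fn repeated_checked<'a, T>(
    inner: impl Fn(&'a str) -> Option<(T, &'a str)>,
    input: &'a str,
) -> Result<(Vec<T>, &'a str), RepeatError> {
    let mut rest = input;
    let mut out = Vec::new();
    loop {
        match inner(rest) {
            Some((value, next)) => {
                if next.len() == rest.len() {
                    // Success with zero consumption: report where and when it happened.
                    return Err(RepeatError::NoProgress {
                        iteration: out.len(),
                        offset: input.len() - rest.len(),
                    });
                }
                out.push(value);
                rest = next;
            }
            None => return Ok((out, rest)),
        }
    }
}

/// Consumes exactly one ASCII digit.
fn digit(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    if c.is_ascii_digit() {
        Some((c, &input[1..]))
    } else {
        None
    }
}

/// Consumes zero or one digit. It always succeeds, so repeating it naively never terminates.
fn optional_digit(input: &str) -> Option<(Option<char>, &str)> {
    match digit(input) {
        Some((c, rest)) => Some((Some(c), rest)),
        None => Some((None, input)),
    }
}
```

Repeating `digit` over `"12a"` stops normally at `a`, while repeating `optional_digit` reports the iteration and byte offset at which progress stalled, which is roughly the information the second question in the list asks for.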

zesterer avatar Nov 03 '21 08:11 zesterer

One debug feature I would find really useful is a way to print out the nested tuples that are the outputs of .map and .map_with_span.

Many of the parsers I'm writing end up having the data I need to build the AST buried several layers deep in nested tuples. Sometimes you can figure out the structure by looking at the combinators you used to build the parser, but other times the only way I was able to figure it out was through several guess -> compile -> error cycles until the data types lined up.

So a debug_print function I could drop in a map to print out the nested data structure would be really great.
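
A rough sketch of what such a helper might look like, written against plain `Option::map` rather than chumsky so that it stands alone. The `debug_print` signature here is a guess at the request, not an existing chumsky function, and `parse_binding` is only a stand-in for a combinator chain that yields nested tuples.

```rust
use std::fmt::Debug;

/// Hypothetical helper (not part of chumsky): pretty-print a value to stderr
/// and hand it back unchanged, so it can sit in the middle of a `map` chain.
fn debug_print<T: Debug>(label: &str, value: T) -> T {
    // `{:#?}` spreads nested tuples over several lines, one level per indent.
    eprintln!("[{label}] {value:#?}");
    value
}

/// Stand-in for a combinator chain whose output is a nested tuple.
fn parse_binding() -> Option<((String, char), (i64, Option<String>))> {
    Some((("x".to_string(), '='), (42, None)))
}

/// Drop the helper into a `map` to see the shape before destructuring it.
fn binding_value() -> Option<i64> {
    parse_binding()
        .map(|out| debug_print("binding", out))
        .map(|((_name, _eq), (value, _ty))| value)
}
```

The standard library's `dbg!` macro also returns its argument, so `.map(|out| dbg!(out))` gives a similar effect without a custom helper, provided the output type implements `Debug`.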

natemartinsf avatar Nov 04 '21 04:11 natemartinsf

Debug-printing the output seems like a good idea, yes. Perhaps the input too? It would be amazing to be able to generate a mapping between the two: a diagram that explains exactly which parts of the input get processed by specific parsers and shows the output AST that gets generated. I'm thinking something like this:

Input `x + y`
    => ...processed by the parser at line 37 in `parser.rs`...
    => ...generated output `Expr::Binary(BinaryOp::Add, Expr::Local("x"), Expr::Local("y"))`

What I'm wondering is how to organise this output such that it doesn't become too verbose to be useful. It's almost like it requires a flamegraph-esque SVG that can be navigated around or something.

zesterer avatar Nov 04 '21 10:11 zesterer

Along these same lines, a debugger tool that lets a user figure out why they are overflowing the stack would be great!

(Mentioning this because it's happening to me right now, and the "debug" parser doesn't print out if the stack overflows.)

natemartinsf avatar Nov 08 '21 00:11 natemartinsf

That's a good use-case. Perhaps I should also add a recursion limit to prevent this sort of thing.
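
As a sketch of the idea only: a hand-written recursive-descent parser can thread a depth counter through its recursive calls and return an error once a limit is exceeded, instead of overflowing the stack. The grammar (balanced parentheses), the `MAX_DEPTH` value and all names below are illustrative assumptions, not how chumsky does or would implement this.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    /// Nesting went deeper than `MAX_DEPTH`.
    RecursionLimit { depth: usize },
    /// An unexpected byte (or end of input) at this offset.
    Unexpected { offset: usize },
}

/// Hypothetical limit; a real API would presumably make this configurable.
const MAX_DEPTH: usize = 64;

/// Parse one group `'(' group* ')'` starting at `pos`.
/// Returns the deepest nesting level reached inside the group.
fn parse_group(input: &[u8], pos: &mut usize, depth: usize) -> Result<usize, ParseError> {
    // Check the limit before doing any work at this level.
    if depth > MAX_DEPTH {
        return Err(ParseError::RecursionLimit { depth });
    }
    if input.get(*pos) != Some(&b'(') {
        return Err(ParseError::Unexpected { offset: *pos });
    }
    *pos += 1;
    let mut deepest = depth;
    // Each nested group is one recursion point; pass `depth + 1` down.
    while input.get(*pos) == Some(&b'(') {
        deepest = deepest.max(parse_group(input, pos, depth + 1)?);
    }
    if input.get(*pos) != Some(&b')') {
        return Err(ParseError::Unexpected { offset: *pos });
    }
    *pos += 1;
    Ok(deepest)
}

/// Parse a whole input consisting of exactly one group.
fn parse(input: &str) -> Result<usize, ParseError> {
    let mut pos = 0;
    let depth = parse_group(input.as_bytes(), &mut pos, 1)?;
    if pos != input.len() {
        return Err(ParseError::Unexpected { offset: pos });
    }
    Ok(depth)
}
```

With this shape, a pathological input such as a thousand opening parentheses produces a `RecursionLimit` error that a debug view could report, rather than a crash that hides all earlier output.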

zesterer avatar Nov 08 '21 10:11 zesterer

Does the debugging have to be through normal CLI output? Perhaps add a feature flag that enables a GUI that lets you see the output and step through the parsers. I've tried stepping through it with a normal debugger and didn't find it very helpful.

Person-93 avatar Nov 13 '21 13:11 Person-93

I'm still a little unsure about how best to output this information. CLI is definitely the most universal, but is not particularly easy to explore.

zesterer avatar Nov 13 '21 16:11 zesterer

Hello! How about improving the debug method we have now so that it prints the input given to the parser and how much of it was consumed? I'm thinking of something like the dbg function in Haskell's Megaparsec. Here's an example: https://markkarpov.com/tutorial/megaparsec.html#debugging-parsers
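
To illustrate the shape of that idea, here is a small sketch of a wrapper that records the input a parser saw, what it consumed, and the value it produced. It is only loosely modelled on Megaparsec's `dbg`; the log format, the `dbg_parse` name and the toy function-based parser are assumptions made for this example and do not reflect chumsky's API.

```rust
use std::fmt::Debug;

/// Run `parser` on `input`, appending a before/after report to `log`.
/// Assumes the remainder returned by `parser` is a suffix of `input`.
fn dbg_parse<'a, T: Debug>(
    label: &str,
    parser: impl Fn(&'a str) -> Option<(T, &'a str)>,
    input: &'a str,
    log: &mut Vec<String>,
) -> Option<(T, &'a str)> {
    log.push(format!("{label}> IN: {input:?}"));
    let result = parser(input);
    match &result {
        Some((value, rest)) => {
            // Everything before the remainder is what the parser consumed.
            let consumed = &input[..input.len() - rest.len()];
            log.push(format!("{label}< MATCH: consumed {consumed:?}, value {value:?}"));
        }
        None => log.push(format!("{label}< FAIL")),
    }
    result
}

/// Toy parser: one or more ASCII digits, parsed as a `u32`.
fn number(input: &str) -> Option<(u32, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let value = input[..end].parse().ok()?;
    Some((value, &input[end..]))
}
```

Collecting the report into a `Vec<String>` instead of printing directly keeps the sketch testable and leaves the choice of output (CLI, file, GUI) to the caller.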

taka231 avatar Apr 17 '22 14:04 taka231

I'm increasingly of the view that this should be implemented as an extension trait on top of existing combinators rather than embedded into the crate, as it currently is on master. Perhaps this will be the way forward in zero-copy.
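
For readers unfamiliar with the pattern, a minimal sketch of the extension-trait approach follows. The `Parser` trait here is a deliberately tiny stand-in and not chumsky's real trait; `Traced`, `ParserDebugExt` and `Just` are likewise hypothetical names chosen for the example.

```rust
use std::fmt::Debug;

/// Minimal stand-in for a parser trait (hypothetical; far simpler than chumsky's).
trait Parser {
    type Output;
    fn parse<'a>(&self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

/// Wrapper that reports on the parser it wraps, then forwards the result.
struct Traced<P> {
    label: &'static str,
    inner: P,
}

impl<P: Parser> Parser for Traced<P>
where
    P::Output: Debug,
{
    type Output = P::Output;

    fn parse<'a>(&self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        let result = self.inner.parse(input);
        eprintln!("[{}] input {:?} -> {:?}", self.label, input, result);
        result
    }
}

/// Extension trait: adds `.traced(..)` to every parser without changing `Parser` itself.
trait ParserDebugExt: Parser + Sized {
    fn traced(self, label: &'static str) -> Traced<Self> {
        Traced { label, inner: self }
    }
}

/// Blanket impl so any `Parser` picks up the debugging method.
impl<P: Parser> ParserDebugExt for P {}

/// Example parser: matches a single expected character.
struct Just(char);

impl Parser for Just {
    type Output = char;

    fn parse<'a>(&self, input: &'a str) -> Option<(char, &'a str)> {
        input.strip_prefix(self.0).map(|rest| (self.0, rest))
    }
}
```

Because the debugging layer lives entirely in `Traced` and `ParserDebugExt`, it could in principle ship in a separate module or crate, and parsers that never call `.traced(..)` pay nothing for it.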

zesterer avatar Apr 17 '22 14:04 zesterer

This is related to (but not the same as) #280.

zesterer avatar Feb 20 '23 22:02 zesterer