Rust backtraces in errors

Open enomado opened this issue 4 months ago • 7 comments

Having some experience writing parsers with nom, I think I now understand what I’m missing.

I want maximal, full backtraces, captured with Backtrace::capture(). Since parsers are combinators, they would show the exact place where the problem occurred across the whole stack.

Basically, I often try to figure out where a problem happened by replacing ? with .unwrap():

let (i, device) = parse_go(i)?;
let (i, _) = parse_brr(i)?;

becomes:

let (i, device) = parse_go(i).unwrap();
let (i, _) = parse_brr(i).unwrap();

Maybe there could exist something like:

let (i, _) = parse_brr(i).catch_backtrace_if_error()?;

that would work in debug mode? This would save hours of writing extra parsers in situations where the document format is unspecified, and the parser fails on the syntax of a new contractor, for example.
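A minimal sketch of what such a helper could look like, using only the standard library. The `Traced` wrapper and the `CatchBacktrace` trait are hypothetical names, not part of nom, and `parse_brr` here is a stand-in with a plain `String` error rather than a real nom parser. Note that the backtrace is captured where the helper is called, after the inner parser has already returned, so it shows the caller's stack rather than the innermost failing combinator.

```rust
use std::backtrace::Backtrace;

/// Hypothetical wrapper: the original error plus a backtrace captured
/// at the point where the error was observed.
#[derive(Debug)]
pub struct Traced<E> {
    pub error: E,
    pub backtrace: Backtrace,
}

/// Hypothetical extension trait; not part of nom.
pub trait CatchBacktrace<T, E> {
    fn catch_backtrace_if_error(self) -> Result<T, Traced<E>>;
}

impl<T, E> CatchBacktrace<T, E> for Result<T, E> {
    fn catch_backtrace_if_error(self) -> Result<T, Traced<E>> {
        self.map_err(|error| Traced {
            error,
            // Capture only in debug builds; `force_capture` ignores the
            // RUST_BACKTRACE environment variable.
            backtrace: if cfg!(debug_assertions) {
                Backtrace::force_capture()
            } else {
                Backtrace::disabled()
            },
        })
    }
}

/// Stand-in parser (assumption: real code would return a nom `IResult`).
fn parse_brr(i: &str) -> Result<(&str, ()), String> {
    match i.strip_prefix("brr") {
        Some(rest) => Ok((rest, ())),
        None => Err(format!("expected \"brr\" at {i:?}")),
    }
}

/// The caller's error type has to be `Traced<_>` for `?` to work.
fn parse_doc(i: &str) -> Result<&str, Traced<String>> {
    let (i, _) = parse_brr(i).catch_backtrace_if_error()?;
    Ok(i)
}
```

On failure, printing `err.backtrace` with `{}` shows the captured frames in a debug build.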

I’ll start my own experiments, but it seems the error types are rigidly fixed and there’s not much to add besides &str. Has anyone tried doing this already?

enomado avatar Aug 16 '25 00:08 enomado

Yes, nom's error handling is pretty poor. It shows neither where the error occurred in the code nor the location in the source. But this is really a problem across the whole Rust ecosystem, as Rust discourages backtraces in favor of custom error enum types.

dvc94ch avatar Aug 25 '25 15:08 dvc94ch

@dvc94ch Rust can actually record backtraces. The Rust ecosystem also has dyn Error/anyhow with backtraces and exact code locations, though capturing them is slow.

nom could generate backtraces for debugging purposes, but I’m not sure it has a proper error type design for that.

But in the worst case, we could store the traceback in a global variable and then read it later.
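A rough sketch of that fallback, assuming a per-thread slot rather than a true global so that no locking is needed. `record_failure`, `take_last_backtrace` and the stand-in `parse_brr` are hypothetical names for illustration only.

```rust
use std::backtrace::Backtrace;
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread slot holding the most recent failure's backtrace.
    static LAST_BACKTRACE: RefCell<Option<Backtrace>> = RefCell::new(None);
}

/// Call this at the point of failure, inside the failing parser.
/// Each call overwrites the previously stored backtrace.
pub fn record_failure() {
    LAST_BACKTRACE.with(|slot| {
        *slot.borrow_mut() = Some(Backtrace::force_capture());
    });
}

/// Read the stored backtrace later (and clear the slot).
pub fn take_last_backtrace() -> Option<Backtrace> {
    LAST_BACKTRACE.with(|slot| slot.borrow_mut().take())
}

/// Stand-in parser that keeps its plain error type unchanged.
fn parse_brr(i: &str) -> Result<(&str, ()), &'static str> {
    match i.strip_prefix("brr") {
        Some(rest) => Ok((rest, ())),
        None => {
            record_failure();
            Err("expected \"brr\"")
        }
    }
}
```

The error type stays untouched, which is the point of the workaround. The cost is that branches that fail and are then recovered from (for example inside an alternation) also overwrite the slot, and every failure pays for a capture, so this only makes sense while debugging.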

enomado avatar Aug 25 '25 20:08 enomado

Some context here: nom was designed for parsers in production applications that go through large amounts of files or significant network bandwidth. Easy-to-read backtraces are not the goal in those cases; you want to reject bad input as quickly as possible.

That said, nom's error management is also designed to be pluggable. If you use the most basic type, it's fast but raw. But you can also use the VerboseError type, which accumulates a sort of trace, or nom-supreme's more complete ErrorTree. So with a little bit of digging and changing a type, you can get what you need.

Geal avatar Aug 25 '25 21:08 Geal

I know anyhow exists; that's why I said "discouraged", not "impossible". "Encouraged" would mean anyhow being part of the standard library and used by 95% of crates, and not getting issues opened on your crates telling you that "proper error handling makes your crate unusable, please use thiserror instead".

VerboseError looks interesting, but it seems like it no longer exists [0]? nom-supreme is unmaintained and doesn't work with nom 8.0.0. Even using the open PR that updates nom, I still can't get anything useful out of it, because ErrorTree requires Display and I'm parsing a semi-binary format (PDF).

  • [0] https://docs.rs/nom/latest/nom/error/index.html

dvc94ch avatar Aug 26 '25 09:08 dvc94ch

VerboseError was moved to the nom-language crate, to allow it to evolve outside of nom's slow cycles. But I don't think it would be useful here. There's a difference between the error type you need in the end application, where you may (or may not) want to report useful errors to the user, and what you need to debug the parser's behaviour in development.

For that case, nom-tracer or nom-tracable would be better. AFAIK they do not support nom 8 yet, but upgrading them should be feasible. Downgrading to nom 7 is also fine: it is pretty stable, and if you only use the function-oriented API, it should be very compatible with nom 8.

For a complex format like PDF, though, it might be even better to come up with a specific error type that accumulates the context you need. A pretty important point in this library's philosophy is that everything is public and everything is pluggable: you can extend it or replace the parts you don't like, and it will still work well with everything else. So don't hesitate to write the parts you need.
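To illustrate the development-time side of that distinction, here is a small std-only sketch of a tracing wrapper. It is not the nom-tracer or nom-tracable API; `traced`, `take_trace`, `tag` and `header` are hypothetical names, and the parsers are plain functions over `&[u8]` rather than nom parsers. The wrapper records entry and outcome of every call, indented by nesting depth, without touching the error type.

```rust
use std::cell::{Cell, RefCell};

thread_local! {
    // Current nesting depth and the collected trace lines for this thread.
    static DEPTH: Cell<usize> = Cell::new(0);
    static TRACE: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(depth: usize, line: String) {
    TRACE.with(|t| t.borrow_mut().push(format!("{}{}", "  ".repeat(depth), line)));
}

/// Drain the trace collected so far on this thread.
pub fn take_trace() -> Vec<String> {
    TRACE.with(|t| std::mem::take(&mut *t.borrow_mut()))
}

/// Hypothetical helper: wraps a parser so each call logs entry and outcome.
pub fn traced<'a, O, E>(
    name: &'static str,
    parser: impl Fn(&'a [u8]) -> Result<(&'a [u8], O), E>,
) -> impl Fn(&'a [u8]) -> Result<(&'a [u8], O), E> {
    move |input| {
        let depth = DEPTH.with(|d| {
            let current = d.get();
            d.set(current + 1);
            current
        });
        log(depth, format!("> {name} ({} bytes left)", input.len()));
        let result = parser(input);
        let outcome = if result.is_ok() { "ok" } else { "FAILED" };
        log(depth, format!("< {name} {outcome}"));
        DEPTH.with(|d| d.set(depth));
        result
    }
}

/// Stand-in for a byte-string matcher.
fn tag<'a>(
    expected: &'static [u8],
) -> impl Fn(&'a [u8]) -> Result<(&'a [u8], ()), &'static str> {
    move |input: &'a [u8]| match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, ())),
        None => Err("tag mismatch"),
    }
}

/// Example: a tiny header parser with two traced steps.
fn header(input: &[u8]) -> Result<(&[u8], ()), &'static str> {
    let (input, _) = traced("magic", tag(b"%PDF-"))(input)?;
    let (input, _) = traced("version", tag(b"1.7"))(input)?;
    Ok((input, ()))
}
```

Running `traced("header", header)` on input that fails at the version step leaves a trace in which the `magic` step is marked ok and the `version` step is marked FAILED, nested under `header`. Because it reports byte counts rather than printing the input, it also works for binary formats.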

Geal avatar Aug 26 '25 10:08 Geal

Just managed to parse my first PDF. I used #[tracing::instrument(skip_all)] on all functions to figure out what the issue was. Thanks for the pointers...

dvc94ch avatar Aug 26 '25 11:08 dvc94ch

You can easily add context to relevant parser steps to get your own stack of parser context. See, e.g., how it's done here: https://github.com/amqp-rs/amq-protocol/blob/main/types/src/parsing.rs You could even capture the backtrace in the add_context method impl. This is fully nom 8, with no changes required in nom itself, as the error API, as "minimal" as it can be, is flexible enough for this.
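A sketch of that idea, written against a local stand-in trait so that it compiles without the nom crate. The `AddContext` trait below copies the shape of nom's `ContextError::add_context(input, ctx, other)`; in real code you would implement nom's `ParseError` and `ContextError` for the error type and use nom's own `context` combinator. `StackError`, the local `context` function and the example parsers are hypothetical, and this is not the amq-protocol implementation.

```rust
use std::backtrace::Backtrace;

/// Local stand-in with the same shape as nom's `ContextError::add_context`.
pub trait AddContext<I>: Sized {
    fn add_context(input: I, ctx: &'static str, other: Self) -> Self;
}

/// Hypothetical error type for a binary format: it stores context labels
/// and the number of bytes left, so the input never needs `Display`.
#[derive(Debug)]
pub struct StackError {
    pub contexts: Vec<(&'static str, usize)>,
    pub backtrace: Option<Backtrace>,
}

impl StackError {
    pub fn new() -> Self {
        StackError { contexts: Vec::new(), backtrace: None }
    }
}

impl<'a> AddContext<&'a [u8]> for StackError {
    fn add_context(input: &'a [u8], ctx: &'static str, mut other: Self) -> Self {
        other.contexts.push((ctx, input.len()));
        // Capture once, at the innermost context, and only in debug builds.
        if other.backtrace.is_none() && cfg!(debug_assertions) {
            other.backtrace = Some(Backtrace::force_capture());
        }
        other
    }
}

/// Minimal stand-in for a `context` combinator over plain functions.
pub fn context<'a, O>(
    ctx: &'static str,
    parser: impl Fn(&'a [u8]) -> Result<(&'a [u8], O), StackError>,
) -> impl Fn(&'a [u8]) -> Result<(&'a [u8], O), StackError> {
    move |input| parser(input).map_err(|e| StackError::add_context(input, ctx, e))
}

fn magic(input: &[u8]) -> Result<(&[u8], ()), StackError> {
    match input.strip_prefix(b"%PDF-") {
        Some(rest) => Ok((rest, ())),
        None => Err(StackError::new()),
    }
}

fn version(input: &[u8]) -> Result<(&[u8], u8), StackError> {
    match input.split_first() {
        Some((&b, rest)) if b.is_ascii_digit() => Ok((rest, b - b'0')),
        _ => Err(StackError::new()),
    }
}

fn header(input: &[u8]) -> Result<(&[u8], u8), StackError> {
    let (input, _) = context("magic", magic)(input)?;
    context("version", version)(input)
}
```

When a wrapped parser fails, `contexts` lists the labels from innermost to outermost together with the remaining byte count at each level, and `backtrace` holds the stack at the innermost `add_context` call.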

Keruspe avatar Aug 26 '25 11:08 Keruspe