Discussion: A useful pattern for reducing boilerplate in parser functions

Open Karesis opened this issue 3 months ago • 1 comments

I found a pattern for writing chumsky parsers that I'd like to share. You can start with something like this:

// A common setup
type ParserInput<'tokens> = IterInput<TokenStream<'tokens>, Span>;
type ParserError<'tokens> = Err<Rich<'tokens, Token, Span>>;

Other input types work too. Just alias the input as ParserInput<'tokens> and the error as ParserError<'tokens> for convenience. Then you can write a macro:

/// On stable Rust we cannot yet write a type alias like this:
/// type Parser<'tokens, O> = impl chumsky::Parser<'tokens, ParserInput<'tokens>, O, ParserError<'tokens>>;
/// It needs an unstable feature that is only available on nightly rustc,
/// so I use a macro to shorten the signature instead.
macro_rules! ParserOutput {
    // Take one parameter, `$output`, matched as a `ty` (type) fragment
    ($output:ty) => {
        // Expand to the full `impl Trait` type
        impl chumsky::Parser<
            'tokens,
            ParserInput<'tokens>,
            $output,
            ParserError<'tokens>
        >
    };
}

Now you only need to name the real output type in your function's return type:

/// Parse an identifier and return its symbol
fn indent<'tokens>() -> ParserOutput!(Symbol) {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}

/// Parse an integer constant and return a literal node
fn int_literal<'tokens>() -> ParserOutput!(ast::Literal) {
    select! {
        Token::IntConst(i) => ast::Literal::Int(i),
    }
    .labelled("integer literal")
}

It's cleaner than

fn indent<'tokens>() -> impl chumsky::Parser<'tokens, ParserInput<'tokens>, Symbol, ParserError<'tokens>> {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}
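To show the mechanics without pulling in chumsky, here is a minimal, self-contained sketch of the same macro pattern. The `Parser` trait, the closure-based parsers, and the slice/`Option` aliases below are hypothetical stand-ins, not chumsky's API; only the shape of the `ParserOutput!` macro mirrors the one above. Note that the macro refers to `'tokens` by name, so it only expands correctly inside a function that declares a lifetime called exactly `'tokens`.

```rust
// Stand-in for `chumsky::Parser` with the same shape of generic parameters
// (lifetime, input, output, error). Hypothetical; NOT the chumsky API.
trait Parser<'src, I, O, E> {
    fn parse(&self, input: I) -> Result<O, E>;
}

// In this sketch, any closure of the right shape counts as a parser.
impl<'src, I, O, E, F> Parser<'src, I, O, E> for F
where
    F: Fn(I) -> Result<O, E>,
{
    fn parse(&self, input: I) -> Result<O, E> {
        self(input)
    }
}

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String),
    IntConst(i64),
}

// Simplified aliases (assumptions): a token slice as input, and the
// unexpected token (or `None` at end of input) as the error.
type ParserInput<'tokens> = &'tokens [Token];
type ParserError<'tokens> = Option<&'tokens Token>;

// Same idea as the macro above: expand to the full `impl Trait` type.
// `'tokens` is resolved at the call site, so the caller must declare it.
macro_rules! ParserOutput {
    ($output:ty) => {
        impl Parser<'tokens, ParserInput<'tokens>, $output, ParserError<'tokens>>
    };
}

/// Parse an identifier and return its name.
fn indent<'tokens>() -> ParserOutput!(String) {
    |input: ParserInput<'tokens>| match input.first() {
        Some(Token::Identifier(name)) => Ok(name.clone()),
        other => Err(other),
    }
}

/// Parse an integer constant and return its value.
fn int_literal<'tokens>() -> ParserOutput!(i64) {
    |input: ParserInput<'tokens>| match input.first() {
        Some(Token::IntConst(value)) => Ok(*value),
        other => Err(other),
    }
}

fn main() {
    let tokens = vec![Token::Identifier("x".to_string()), Token::IntConst(42)];
    assert_eq!(indent().parse(&tokens[..]), Ok("x".to_string()));
    assert_eq!(int_literal().parse(&tokens[1..]), Ok(42));
    // The first token is an identifier, so the integer parser rejects it.
    assert_eq!(int_literal().parse(&tokens[..]), Err(Some(&tokens[0])));
}
```

Renaming the lifetime in a function signature (say, to `'src`) would make the macro expansion fail to resolve `'tokens`, which is one practical cost of hiding the lifetime inside the macro.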

I think it may be helpful.

Karesis, Oct 14 '25 04:10

Thanks for your contribution! My reason for being a little bit reticent is that the return type of a parser function often needs to be fully qualified in more complex cases, and steering users toward a macro is likely to make things more confusing for them.

One approach I've often seen used (which sadly still requires nightly) looks like the following:

trait Parser<'src, O> = chumsky::Parser<'src, ParserInput<'src>, O, ParserExtra<'src>>;

fn indent<'src>() -> impl Parser<'src, Symbol> {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}

Arguably, this represents a similar improvement in readability.
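For readers who need to stay on stable, a workaround that is sometimes used in general Rust code (it is not proposed in this thread, and I have not checked it against every chumsky combinator) is to emulate the trait alias with a helper sub-trait plus a blanket impl. The sketch below is self-contained: `BaseParser` is a hypothetical stand-in for `chumsky::Parser`, and the `ParserInput`/`ParserExtra` aliases are simplified assumptions.

```rust
// Stand-in for `chumsky::Parser`. Hypothetical; NOT the chumsky API.
trait BaseParser<'src, I, O, E> {
    fn parse(&self, input: I) -> Result<O, E>;
}

// In this sketch, any closure of the right shape counts as a parser.
impl<'src, I, O, E, F> BaseParser<'src, I, O, E> for F
where
    F: Fn(I) -> Result<O, E>,
{
    fn parse(&self, input: I) -> Result<O, E> {
        self(input)
    }
}

#[derive(Debug, PartialEq)]
enum Token {
    Identifier(String),
    Comma,
}

// Simplified aliases (assumptions): a token slice as input, and the
// unexpected token (or `None` at end of input) as the error.
type ParserInput<'src> = &'src [Token];
type ParserExtra<'src> = Option<&'src Token>;

// Emulated alias: a sub-trait that fixes the input and error types...
trait Parser<'src, O>: BaseParser<'src, ParserInput<'src>, O, ParserExtra<'src>> {}

// ...plus a blanket impl, so every suitable `BaseParser` is also a `Parser`.
impl<'src, O, P> Parser<'src, O> for P where
    P: BaseParser<'src, ParserInput<'src>, O, ParserExtra<'src>>
{
}

/// Parse an identifier and return its name.
fn indent<'src>() -> impl Parser<'src, String> {
    |input: ParserInput<'src>| match input.first() {
        Some(Token::Identifier(name)) => Ok(name.clone()),
        other => Err(other),
    }
}

fn main() {
    let tokens = vec![Token::Comma, Token::Identifier("x".to_string())];
    // `parse` comes from the supertrait, so it is callable on `impl Parser`.
    assert_eq!(indent().parse(&tokens[1..]), Ok("x".to_string()));
    assert_eq!(indent().parse(&tokens[..]), Err(Some(&tokens[0])));
}
```

Unlike a real trait alias, this introduces a distinct trait, so error messages mention the helper trait rather than the underlying one.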

For posterity, there is a relevant discussion currently taking place in #494 about what a nice solution to this might look like.

zesterer, Oct 27 '25 15:10