Discussion: A useful pattern for reducing boilerplate in parser functions
I've found a pattern for writing chumsky parsers that I'd like to share. You can start with a pair of type aliases like this:
// A common setup
type ParserInput<'tokens> = IterInput<TokenStream<'tokens>, Span>;
type ParserError<'tokens> = Err<Rich<'tokens, Token, Span>>;
Other input types work too. Just alias your input as ParserInput<'tokens> and your error as ParserError<'tokens> for convenience. Then you can write a macro:
/// Stable Rust cannot yet express this:
/// type Parser<'tokens, O> = impl chumsky::Parser<'tokens, ParserInput<'tokens>, O, ParserError<'tokens>>;
/// It needs nightly rustc and an unstable feature,
/// so I use a macro to shorten it instead.
/// Note: the expansion names `'tokens`, so the calling function must declare a lifetime with that name.
macro_rules! ParserOutput {
    // Takes one parameter, `$output`, which is a `ty` (type) fragment.
    ($output:ty) => {
        // Expands to the full `impl Trait` type.
        impl chumsky::Parser<
            'tokens,
            ParserInput<'tokens>,
            $output,
            ParserError<'tokens>
        >
    };
}
Now you only need to name the actual output type in your function's return type:
/// Parses an identifier and returns its symbol.
fn indent<'tokens>() -> ParserOutput!(Symbol) {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}
/// Parses an integer and returns a literal node.
fn int_literal<'tokens>() -> ParserOutput!(ast::Literal) {
    select! {
        Token::IntConst(i) => ast::Literal::Int(i),
    }
    .labelled("integer literal")
}
It's cleaner than:
fn indent<'tokens>() -> impl chumsky::Parser<'tokens, ParserInput<'tokens>, Symbol, ParserError<'tokens>> {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}
I think it may be helpful.
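To try the macro idea without pulling in chumsky, here is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the `Parser` trait is a hypothetical one-method stand-in (not chumsky's real trait), and `ParserInput` and `ParserError` are simplified to `&str`. It only shows that a `macro_rules!` macro can expand to an `impl Trait` type in return position, and that the expansion picks up the `'tokens` lifetime declared on the calling function.

```rust
// Hypothetical stand-in for `chumsky::Parser`, cut down to one method so the
// sketch compiles without dependencies. This is NOT chumsky's real trait.
trait Parser<'tokens, I, O, E> {
    fn parse(&self, input: I) -> Result<O, E>;
}

// In this sketch, any closure with a matching signature counts as a parser.
impl<'tokens, I, O, E, F> Parser<'tokens, I, O, E> for F
where
    F: Fn(I) -> Result<O, E>,
{
    fn parse(&self, input: I) -> Result<O, E> {
        (self)(input)
    }
}

// Simplified aliases (assumption): plain string slices instead of a token
// stream and a rich error type. The "error" is just the rejected input.
type ParserInput<'tokens> = &'tokens str;
type ParserError<'tokens> = &'tokens str;

// Same shape as the macro above, but pointing at the stand-in trait.
// The caller must have a lifetime named `'tokens` in scope.
macro_rules! ParserOutput {
    ($output:ty) => {
        impl Parser<'tokens, ParserInput<'tokens>, $output, ParserError<'tokens>>
    };
}

/// Parses a (whitespace-padded) integer.
fn int_literal<'tokens>() -> ParserOutput!(i64) {
    |input: &'tokens str| input.trim().parse::<i64>().map_err(|_| input)
}

/// Parses an identifier made of alphanumerics and underscores.
fn ident<'tokens>() -> ParserOutput!(&'tokens str) {
    |input: &'tokens str| {
        let trimmed = input.trim();
        let valid = !trimmed.is_empty()
            && trimmed.chars().all(|c| c.is_alphanumeric() || c == '_');
        if valid { Ok(trimmed) } else { Err(input) }
    }
}
```

With these definitions, `int_literal().parse(" 42 ")` returns `Ok(42)` and `ident().parse("a b")` returns `Err("a b")`. Only the output type differs between the two signatures, which is the readability gain the macro is after.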
Thanks for your contribution! My reason for being a little reticent is that the return type of a parser function often needs to be fully qualified in more complex cases, and steering users toward a macro is likely to make things more confusing for them.
One approach I've seen used often (which sadly still requires nightly, since it relies on the unstable trait_alias feature) looks like the following:
trait Parser<'src, O> = chumsky::Parser<'src, ParserInput<'src>, O, ParserExtra<'src>>;
fn indent<'src>() -> impl Parser<'src, Symbol> {
    select! {
        Token::Identifier(symbol) => symbol,
    }
    .labelled("identifier")
}
Arguably, this represents a similar improvement in readability.
For posterity, there is a relevant discussion currently taking place in #494 about what a nice solution to this might look like.
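On stable Rust, a trait alias like the one above is sometimes approximated with a helper subtrait plus a blanket impl. The sketch below shows that general technique only. It is not taken from this thread or from chumsky: `BaseParser` is a hypothetical one-method stand-in for `chumsky::Parser`, the aliases are simplified to `&str`, and `run` is an illustrative helper. Whether the same shape works well with chumsky's combinators, or with extra bounds such as `Clone`, is not something this sketch checks.

```rust
// Hypothetical stand-in for `chumsky::Parser` (assumption, not the real trait).
trait BaseParser<'src, I, O, E> {
    fn parse(&self, input: I) -> Result<O, E>;
}

// In this sketch, any closure with a matching signature counts as a parser.
impl<'src, I, O, E, F> BaseParser<'src, I, O, E> for F
where
    F: Fn(I) -> Result<O, E>,
{
    fn parse(&self, input: I) -> Result<O, E> {
        (self)(input)
    }
}

// Simplified aliases (assumption): string slices for both input and error.
type ParserInput<'src> = &'src str;
type ParserError<'src> = &'src str;

// Helper subtrait that fixes the input and error parameters, leaving only
// the lifetime and the output type visible in signatures.
trait Parser<'src, O>: BaseParser<'src, ParserInput<'src>, O, ParserError<'src>> {}

// Blanket impl: everything that is a `BaseParser` with the fixed input and
// error types is automatically a `Parser`.
impl<'src, O, P> Parser<'src, O> for P where
    P: BaseParser<'src, ParserInput<'src>, O, ParserError<'src>>
{
}

/// Parses a (whitespace-padded) integer.
fn int_literal<'src>() -> impl Parser<'src, i64> {
    |input: &'src str| input.trim().parse::<i64>().map_err(|_| input)
}

/// Illustrative helper: the short form also works as an argument bound,
/// and the supertrait's `parse` method stays callable through it.
fn run<'src, O>(parser: impl Parser<'src, O>, input: &'src str) -> Result<O, ParserError<'src>> {
    parser.parse(input)
}
```

Here `run(int_literal(), " 12 ")` returns `Ok(12)`. The signature reads the same as the nightly trait-alias version, at the cost of one extra trait and impl.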