A procedural macro for defining nom combinators in a simple DSL


A procedural macro for defining nom combinators in a simple DSL. Requires nom v5.0+ and nightly Rust toolchain.


[dependencies]
nom = "7"
nom-rule = "0.2"


The procedural macro rule! provided by this crate is designed to make grammar specs easy to write and maintain. It follows these simple rules:

  1. TOKEN: match the token by token kind. You should provide a parser that eats the next token if its kind matches. It will get expanded into match_token(TOKEN).
  2. ";": match the token by token text. You should provide a parser that eats the next token if its text matches. It will get expanded into match_text(";") in this example.
  3. #fn_name: an external nom parser function. In the example below, ident is a predefined parser for identifiers.
  4. a ~ b ~ c: a sequence of parsers to take one by one. It'll get expanded into nom::sequence::tuple.
  5. (...)+: one or more repeated patterns. It'll get expanded into nom::multi::many1.
  6. (...)*: zero or more repeated patterns. It'll get expanded into nom::multi::many0.
  7. (...)?: Optional parser. It'll get expanded into nom::combinator::opt.
  8. a | b | c: Choices between a, b, and c. It'll get expanded into nom::branch::alt.
  9. &a: Peek. It'll get expanded into nom::combinator::peek(a). Note that it doesn't consume the input.
  10. !a: Negative predicate. It'll get expanded into nom::combinator::not. Note that it doesn't consume the input.
  11. ^a: Cut parser. It'll get expanded into nom::combinator::cut.
  12. ... : "description": Context description for error reporting. It'll get expanded into nom::error::context.
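
To make the operator semantics above concrete, here is a minimal std-only sketch of sequence, choice, repetition, optional, peek, and negative predicate, hand-rolled over &str. It does not use nom; the names Res, tag, seq, alt, many0, opt, peek, and not are hypothetical stand-ins chosen for illustration, and nom's real combinators are generic over input and error types.

```rust
// Hand-rolled, std-only stand-ins for a few nom combinators over `&str`.
// All names here are illustrative; they are not nom's or nom-rule's API.

/// Parser result: the remaining input plus an output, or `None` on failure.
type Res<'a, O> = Option<(&'a str, O)>;

/// Matches a literal prefix (plays the role of `match_text`).
fn tag<'a>(text: &'static str) -> impl Fn(&'a str) -> Res<'a, &'a str> {
    move |i: &'a str| i.strip_prefix(text).map(|rest| (rest, &i[..text.len()]))
}

/// `a ~ b`: run `a`, then `b` on what is left (cf. `nom::sequence::tuple`).
fn seq<'a, A, B>(
    a: impl Fn(&'a str) -> Res<'a, A>,
    b: impl Fn(&'a str) -> Res<'a, B>,
) -> impl Fn(&'a str) -> Res<'a, (A, B)> {
    move |i: &'a str| {
        let (i, x) = a(i)?;
        let (i, y) = b(i)?;
        Some((i, (x, y)))
    }
}

/// `a | b`: try `a`, fall back to `b` (cf. `nom::branch::alt`).
fn alt<'a, O>(
    a: impl Fn(&'a str) -> Res<'a, O>,
    b: impl Fn(&'a str) -> Res<'a, O>,
) -> impl Fn(&'a str) -> Res<'a, O> {
    move |i: &'a str| a(i).or_else(|| b(i))
}

/// `(...)*`: apply `p` until it fails (cf. `nom::multi::many0`).
/// Simplification: stops quietly if `p` succeeds without consuming input.
fn many0<'a, O>(p: impl Fn(&'a str) -> Res<'a, O>) -> impl Fn(&'a str) -> Res<'a, Vec<O>> {
    move |mut i: &'a str| {
        let mut out = Vec::new();
        while let Some((rest, o)) = p(i) {
            if rest.len() == i.len() {
                break; // no progress, avoid an infinite loop
            }
            i = rest;
            out.push(o);
        }
        Some((i, out))
    }
}

/// `(...)?`: never fails; yields `None` when `p` fails (cf. `nom::combinator::opt`).
fn opt<'a, O>(p: impl Fn(&'a str) -> Res<'a, O>) -> impl Fn(&'a str) -> Res<'a, Option<O>> {
    move |i: &'a str| match p(i) {
        Some((rest, o)) => Some((rest, Some(o))),
        None => Some((i, None)),
    }
}

/// `&a`: run `p` but leave the input untouched (cf. `nom::combinator::peek`).
fn peek<'a, O>(p: impl Fn(&'a str) -> Res<'a, O>) -> impl Fn(&'a str) -> Res<'a, O> {
    move |i: &'a str| p(i).map(|(_, o)| (i, o))
}

/// `!a`: succeed only if `p` fails, consuming nothing (cf. `nom::combinator::not`).
fn not<'a, O>(p: impl Fn(&'a str) -> Res<'a, O>) -> impl Fn(&'a str) -> Res<'a, ()> {
    move |i: &'a str| match p(i) {
        Some(_) => None,
        None => Some((i, ())),
    }
}
```

With these stand-ins, a rule such as "a" ~ ("b" ~ ","?)* ~ ";" corresponds to seq(tag("a"), seq(many0(seq(tag("b"), opt(tag(",")))), tag(";"))). The real macro produces a flat nom::sequence::tuple rather than nested pairs.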


Define a match_text parser and a match_token parser for your custom token type. If your parser uses &str or &[u8] as input, you can use nom::combinator::fail as match_token, because you won't match on token kinds.

#[derive(Clone, Debug, PartialEq)]
struct Token<'a> {
    kind: TokenKind,
    text: &'a str,
    span: Span,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenKind {
    // Keywords (the two used by the CREATE TABLE rule in this README)
    CREATE,
    TABLE,

    // Symbols
}

use nom::{error::ParseError, IResult};

// `satisfy` here is a user-defined helper over tokens; nom's built-in
// `satisfy` works on characters, not on a custom token type.
fn match_text<'a, Error: ParseError<Input<'a>>>(
    text: &'a str,
) -> impl FnMut(Input<'a>) -> IResult<Input<'a>, &'a Token<'a>, Error> {
    move |i| satisfy(|token: &Token<'a>| token.text == text)(i)
}

fn match_token<'a, Error: ParseError<Input<'a>>>(
    kind: TokenKind,
) -> impl FnMut(Input<'a>) -> IResult<Input<'a>, &'a Token<'a>, Error> {
    move |i| satisfy(|token: &Token<'a>| token.kind == kind)(i)
}

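The two functions above rely on a token-level satisfy helper and an Input type that this README does not show. The following is a minimal std-only sketch of that idea, assuming Input is a slice of tokens. PResult is a hypothetical stand-in for nom::IResult, the span field is dropped for brevity, and a real version would build its error through nom's ParseError trait instead of a string.

```rust
// Std-only sketch. `PResult` and the string error are hypothetical
// simplifications standing in for `nom::IResult` and `ParseError`.

#[derive(Clone, Copy, Debug, PartialEq)]
enum TokenKind {
    CREATE,
    TABLE,
}

#[derive(Clone, Debug, PartialEq)]
struct Token<'a> {
    kind: TokenKind,
    text: &'a str,
}

/// Assumption: the parser input is a slice of already-lexed tokens.
type Input<'a> = &'a [Token<'a>];
type PResult<'a, O> = Result<(Input<'a>, O), &'static str>;

/// Consumes the next token if `cond` accepts it; fails otherwise,
/// including on empty input.
fn satisfy<'a>(
    cond: impl Fn(&Token<'a>) -> bool,
) -> impl Fn(Input<'a>) -> PResult<'a, &'a Token<'a>> {
    move |i: Input<'a>| match i.split_first() {
        Some((tok, rest)) if cond(tok) => Ok((rest, tok)),
        _ => Err("token did not satisfy the predicate"),
    }
}

/// Matches the next token by its text.
fn match_text<'a>(text: &'a str) -> impl Fn(Input<'a>) -> PResult<'a, &'a Token<'a>> {
    satisfy(move |token: &Token<'a>| token.text == text)
}

/// Matches the next token by its kind.
fn match_token<'a>(kind: TokenKind) -> impl Fn(Input<'a>) -> PResult<'a, &'a Token<'a>> {
    satisfy(move |token: &Token<'a>| token.kind == kind)
}
```

Calling match_token(TokenKind::CREATE) on a slice whose first token has kind CREATE returns that token together with the rest of the slice; any other first token, or an empty slice, yields an error.
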
Then pass the two parsers to nom_rule::rule! by wrapping it in a custom macro:

macro_rules! rule {
    ($($tt:tt)*) => {
        nom_rule::rule!($crate::match_text, $crate::match_token, $($tt)*)
    };
}

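The wrapper pattern above, where a local macro prepends two helper paths and forwards the remaining tokens, can be tried out without nom-rule. In this sketch inner! is a hypothetical toy stand-in for nom_rule::rule! that only handles a single string literal or a single identifier, and describe_text and describe_kind are made-up helpers that return strings instead of parsers. The real wrapper uses $crate:: paths so that it resolves from any module; they are left out here to keep the snippet position-independent.

```rust
// Toy helpers: return a description instead of a real parser.
fn describe_text(text: &str) -> String {
    format!("match_text({:?})", text)
}

fn describe_kind(kind: &str) -> String {
    format!("match_token({})", kind)
}

// Hypothetical stand-in for `nom_rule::rule!`: the first two arguments are
// the helpers, the rest is the (here trivially small) rule body.
macro_rules! inner {
    // A string literal dispatches to the text helper.
    ($text:expr, $token:expr, $lit:literal) => {
        $text($lit)
    };
    // A bare identifier dispatches to the kind helper.
    ($text:expr, $token:expr, $kind:ident) => {
        $token(stringify!($kind))
    };
}

// The wrapper fixes the two helpers once, so call sites only write the rule.
macro_rules! rule {
    ($($tt:tt)*) => {
        inner!(describe_text, describe_kind, $($tt)*)
    };
}
```

Here rule!(";") goes through the text helper and rule!(CREATE) through the kind helper, mirroring rules 1 and 2 of the syntax list.
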
To define a parser for a SQL CREATE TABLE statement:

let mut rule = rule!(
    CREATE ~ TABLE ~ #ident ~ ^"(" ~ (#ident ~ #ident ~ ","?)* ~ ")" ~ ";" : "CREATE TABLE statement"
);

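It will get expanded into code roughly equivalent to:

let mut rule =
    nom::error::context(
        "CREATE TABLE statement",
        nom::sequence::tuple((
            crate::match_token(CREATE),
            crate::match_token(TABLE),
            ident,
            nom::combinator::cut(crate::match_text("(")),
            nom::multi::many0(nom::sequence::tuple((
                ident,
                ident,
                nom::combinator::opt(crate::match_text(",")),
            ))),
            crate::match_text(")"),
            crate::match_text(";"),
        )),
    );
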
See more examples in tests/ and in databend, the main dependent project.

Auto Sequence

nom-rule can automatically insert ~ into the rule where necessary, so the example above can also be written as:

let mut rule = rule!(
    CREATE TABLE #ident "(" (#ident #ident ","?)* ")" ";" : "CREATE TABLE statement"
);

To enable this feature, add this to your Cargo.toml:

nom-rule = { version = "0.2", features = ["auto-sequence"] }
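
As an illustration of what inserting ~ "when necessary" means, the sketch below rewrites a flat list of rule tokens by placing ~ between two neighbours whenever the left one can end an operand and the right one can start one. This is not how nom-rule implements the feature; the function names and the token classification are assumptions made for this example, and the rule is pre-split into string tokens by hand.

```rust
/// True if `tok` can be the last token of an operand.
/// Binary operators, an opening paren, and prefix operators cannot.
fn ends_operand(tok: &str) -> bool {
    !matches!(tok, "~" | "|" | ":" | "(" | "^" | "&" | "!" | "#")
}

/// True if `tok` can be the first token of an operand.
/// Binary operators, a closing paren, and postfix operators cannot.
fn starts_operand(tok: &str) -> bool {
    !matches!(tok, "~" | "|" | ":" | ")" | "?" | "*" | "+")
}

/// Inserts "~" between adjacent operands; explicit operators are kept as-is.
fn auto_sequence<'a>(tokens: &[&'a str]) -> Vec<&'a str> {
    let mut out = Vec::new();
    for (idx, &tok) in tokens.iter().enumerate() {
        if idx > 0 && ends_operand(tokens[idx - 1]) && starts_operand(tok) {
            out.push("~");
        }
        out.push(tok);
    }
    out
}
```

For example, the tokens CREATE TABLE # ident come back as CREATE ~ TABLE ~ # ident, while a rule that already contains ~ or | between its operands is returned unchanged.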