pratt icon indicating copy to clipboard operation
pratt copied to clipboard

Minimal abstraction without Affix?

Open quark-zju opened this issue 2 years ago • 1 comments

Hi, after reading http://web.archive.org/web/20180213221021/http://effbot.org/zone/simple-top-down-parsing.htm I was looking for a Rust library. I prefer this library to prattle since the latter depends on failure which is no longer recommended.

However I found some questions about the design choices here. Namely, the Affix enum. My understanding is that the bare minimal abstraction of a Pratt parser does not include Affix:

  • token.lbp(): binding power used by the main parse loop to figure out when to stop
  • token.nud(): for literals, unary operators, groups
  • token.led(left: Token): for infix operators
  • Inside nud and led they can call parse(rbp) (expression in the linked blog) to get an Output with binding power limitation.

I think the problem of Affix is that it requires the token (Input) to tell us whether it's prefix or infix. Sometimes it does not know. I think the reason for using the (heavy weight) lalrpop in the README example is because of this. lalrpop is used to figure out whether - is prefix or infix. I would expect a slightly different design that the main parse loop (parse_input) provides the information whether it needs prefix or suffix (call nud for prefix, call led for infix) and the token is not responsible for that. I'd like to see an example that uses a simple tokenizer without using lalrpop or pest.

How do you think about the following design:

/// Minimal interface for a Pratt parser.
trait BasePrattParser<Inputs> {
    type Error, Input, Output;
    fn parse_input(&mut self, input: &mut Inputs, rbp: Precedence) { /* main algorithm */ }
    // no "info: Affix". If these function want it, call query(&head) to get it.
    fn nud(&mut self, head: &Self::Input, tail: &mut Inputs) -> Result<Output> { error("unsupported") }
    fn led(&mut self, head, tail, lhs) -> Result<Output> { error("unsupported") }
    fn lbp(&self, head: &Input) -> Precedence { Precedence(0) }
    // no query, primary, infix, prefix, postfix, parse_peekable
}

Then a higher level implementation that introduces the Affix concept (maybe slightly differently to address the problem above).

quark-zju avatar Dec 04 '22 01:12 quark-zju

Hi, the design looks very nice. Maybe we could add this as a lower-level interface for those who want more flexibility and fewer assumptions.

segeljakt avatar Dec 05 '22 09:12 segeljakt