Multiple ws rules

Open cognominal opened this issue 9 years ago • 3 comments

sigspace is a great Perl 6 feature, but it is sometimes too limited. When designing a grammar, one may want one sigspace for certain rules and a different one for others. My motivation is a grammar for parsing a .git/config file: some rules match things within a single line, while others may need a sigspace that spans multiple lines.

I tried different approaches, to no avail. Two of them make me think of possible changes to S05.

The first approach was to split the grammar in two, each part having its own ws rule, with one grammar derived from the other. But because of virtual method dispatch, the ws of the most derived grammar is always picked, even for rules declared in the base grammar. Is virtual method dispatch really appropriate for rules? I have not yet thought it through, knowing that a grammar can contain regular methods as well.

  grammar Unilines {
      token  ws  { \h*                                                  }
      rule header  {  '[' <id>  ']'                                   }
      rule entry   {  $<nm>=\S+ [ '=' $<val>=\S+ ]?    }
      token id      {   \w+                                             }
      token string {  \" <( [ '\\"' | \V ]* )>  \"                  }
  }

  grammar Config is Unilines {
      token ws      {  [ \h* <[ ;# ]> \N* \n ]+                      }
      rule TOP      {  <section> +                                  }
      rule section  {  <header> <entry>  +                          }
  }
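The dispatch behaviour described above can be reproduced in isolation. The following is only a minimal sketch: the grammar names `OneLine` and `MultiLine` and the rule `pair` are hypothetical and not part of the grammars above. The rule is declared in the base grammar, yet its sigspace resolves to the `ws` of whichever grammar the parse was started on.

```raku
# Hypothetical names, for illustration only.
grammar OneLine {
    token ws   { \h* }             # horizontal whitespace only
    token word { \w+ }
    rule  pair { <word> <word> }   # sigspace here calls <.ws> implicitly
}

grammar MultiLine is OneLine {
    token ws { \s* }               # also matches newlines
}

my $input = "a\nb";

# Same inherited rule, different ws depending on the invocant grammar:
say so OneLine.parse($input, :rule<pair>);     # the newline is not matched by \h*
say so MultiLine.parse($input, :rule<pair>);   # the overriding ws is used instead
```

Under these assumptions the first parse should fail and the second succeed, which is why the rules inherited from `Unilines` end up using the `ws` of `Config`.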

I then thought I could use lexical ws methods, but since the call to ws is implicit, I cannot write <&ws> in rules to get to them. Maybe it should be possible to declare a lexical rule with a trait that indicates that it should be (conceptually) tried before regular method dispatch.

Short of explicit spacing, are there good alternatives I have missed for problems that need multiple sigspaces? I realise that, in this case, the comments don't have to be matched within a sigspace.

cognominal, Jul 16 '15 22:07

It might end up looking ugly, but I'd go for tokens only and call the right ws* implementation directly.

FROGGS, Jul 17 '15 06:07
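One possible reading of the tokens-only suggestion, as a sketch rather than a definitive implementation: the grammar name `GitConfig` and the token names `hws` and `vws` are hypothetical, and the config syntax is deliberately simplified (no quoted strings, no subsections). Since tokens never insert whitespace implicitly, each one names the whitespace flavour it wants.

```raku
# Hypothetical, simplified grammar: every whitespace call is explicit.
grammar GitConfig {
    token hws { \h* }                                # stays within one line
    token vws { [ \s+ | [ ';' | '#' ] \N* ]* }       # spans lines and comments

    token TOP     { <.vws> <section>+ }
    token section { <header> <.vws> [ <entry> <.vws> ]* }
    token header  { '[' <.hws> <id> <.hws> ']' }
    token entry   { $<nm>=[\w+] [ <.hws> '=' <.hws> $<val>=[\S+] ]? }
    token id      { \w+ }
}

my $text = q:to/END/;
; a comment line
[core]
    editor = vim
    bare = false
[user]
    name = me
END

my $m = GitConfig.parse($text);
for $m<section>.list -> $s {
    say ~$s<header><id>, ': ', $s<entry>.map({ "$_<nm>=$_<val>" }).join(', ');
}
```

The cost is the one already mentioned: every gap must be spelled out, so the grammar is noisier than the sigspace version, but which whitespace applies where is no longer decided by method dispatch.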

Another option that I've used (I believe in a TOML parser) is to declare an inner grammar, with the inner-level whitespace rule, inside the outer grammar declaration. Its rules can then be called with <MyInner::token123>, though it may be trickier to get action methods on those.

Mouq, Jul 19 '15 07:07

On Thu, Jul 16, 2015 at 03:07:36PM -0700, Stéphane Payrard wrote:

> sigspace is a great Perl 6 feature but sometimes is too limited. When designing a grammar one may want a sigspace for certain rules and another for other rules. My motivation is a grammar for parsing a .git/config file. Some rules match things within a line while others possibly match with sigspaces that span multiple lines.
>
> I tried different approaches, to no avail. Two of them make me think of possible changes to S05. [...]

Note that S05 allows :sigspace to have an argument specifying a rule to be used instead of the default <.ws>. Perhaps that is more along the lines of what you're looking for?

I don't know if Rakudo implements the argument form of :sigspace yet.

Pm

pmichaud, Jul 19 '15 15:07