Using standard PEG syntax? [ was: make a new release? ]

Open mw66 opened this issue 3 years ago • 11 comments
The last release is:

https://pypi.org/project/parsimonious/ parsimonious 0.8.1 Released: Jun 20, 2018

That's ~4 years ago, can we make a new release?

Thanks.

mw66 avatar Mar 24 '22 06:03 mw66

Well timed! We have a new one coming very shortly!

erikrose avatar Mar 24 '22 12:03 erikrose

BTW, is there a standard PEG syntax? For example, parsimonious uses = to define rules: lhs = rhs.

But elsewhere, I see that most people use <-, e.g.:

https://en.wikipedia.org/wiki/Parsing_expression_grammar
https://nim-lang.org/docs/pegs.html
https://github.com/PhilippeSigaud/Pegged/blob/master/examples/PEG/src/pegged/examples/PEG.d

etc.

I think the benefit of using a standard syntax is that users can compare different libraries using the same grammar file, without having to change the syntax for each library.

Just wondering if we can add <- as an alternative to = to define rules (so it's not a breaking change)?
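
A minimal sketch of what accepting <- could look like as a pre-processing step, before the grammar text reaches the library. The helper name arrows_to_equals is hypothetical and not part of Parsimonious, and the sketch assumes one rule per line with the arrow directly after the rule name. It only rewrites the operator; it does nothing about how the rest of the rule is interpreted.

```python
import re

# Hypothetical helper, not part of Parsimonious.
# Matches a rule name at the start of a line followed by "<-".
_ARROW_RULE = re.compile(r"^(\s*[A-Za-z_][A-Za-z0-9_]*\s*)<-", re.MULTILINE)


def arrows_to_equals(grammar_text: str) -> str:
    """Rewrite a leading 'name <-' into 'name =' on each rule line.

    Only the arrow right after the rule name is touched, so a "<-"
    that appears later in the line (for example inside a string
    literal) is left alone.
    """
    return _ARROW_RULE.sub(r"\1=", grammar_text)


peg_style = "greeting <- 'hi' / 'hello'\nname <- ~'[a-z]+'"
converted = arrows_to_equals(peg_style)
print(converted)
```

Text that already uses = passes through unchanged, so the same function could be applied to every grammar file regardless of which operator it uses.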

mw66 avatar Mar 24 '22 16:03 mw66

Historic note for anyone looking at this in the future:

The syntax for Parsimonious uses = to define rules for two reasons:

  1. when Erik started this library, he was implementing from the original paper. Other implementations had no consensus, and it was unclear that <- was going to catch on as the rule-definition character
  2. = was chosen because it felt more ergonomic to people comfortable programming in Python

(more about these here)

Now that some time has passed and a consensus is brewing around using <-, the only thing keeping it from being added as an alternate syntax is the time and effort to write the patch.

lonnen avatar Mar 29 '22 08:03 lonnen

Oh, one small reason to put it off: Parsimonious reverses the precedence of AND/OR compared to other PEG libs. It's a silly barrier, but this fix would give the illusion of compatibility that isn't there.

see also:

lonnen avatar Mar 29 '22 09:03 lonnen

but this fix would give the illusion of compatibility that isn't there.

Is there some reason it would have to be incompatible? It’s just a different grammar, so there could be one grammar parser for parsimonious classic, and one for exact compatibility with other PEG parsers.

lucaswiman avatar Mar 29 '22 14:03 lucaswiman

I was unclear. Adding <- as an alternative syntax is absolutely feasible while preserving backwards compat within Parsimonious, but if this is done without also fixing AND/OR precedence, it will invite people to drop in grammar files that work with other parsers but behave unexpectedly with Parsimonious.
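
To make the precedence concern concrete, here is a minimal sketch using hand-rolled matcher combinators. It does not use Parsimonious, and the helper names (lit, seq, choice, full_match) are made up for illustration. In standard PEG, sequence binds tighter than ordered choice, so a b / c groups as (a b) / c. The sketch assumes "reversed" means the other grouping, a (b / c); how Parsimonious actually groups a given rule should be checked against the library itself.

```python
from typing import Callable, Optional

# A matcher takes (text, pos) and returns the position after the
# match, or None if it fails. All names here are illustrative only.
Matcher = Callable[[str, int], Optional[int]]


def lit(s: str) -> Matcher:
    """Match the literal string s."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return match


def seq(*parts: Matcher) -> Matcher:
    """Match every part in order (the "AND" of a PEG)."""
    def match(text: str, pos: int) -> Optional[int]:
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match


def choice(*alts: Matcher) -> Matcher:
    """Return the first alternative that matches (the "OR" of a PEG)."""
    def match(text: str, pos: int) -> Optional[int]:
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return match


def full_match(m: Matcher, text: str) -> bool:
    """True if m consumes the whole text."""
    return m(text, 0) == len(text)


a, b, c = lit("a"), lit("b"), lit("c")

# The same surface text "a b / c" under the two groupings.
sequence_binds_tighter = choice(seq(a, b), c)  # (a b) / c
choice_binds_tighter = seq(a, choice(b, c))    # a (b / c)

for text in ("ab", "c", "ac"):
    print(text,
          full_match(sequence_binds_tighter, text),
          full_match(choice_binds_tighter, text))
```

Both groupings accept "ab", but they disagree on "c" and "ac", which is the kind of silent difference a dropped-in grammar file could run into.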

lonnen avatar Mar 30 '22 02:03 lonnen

Then how about providing both:

= : keep the old AND/OR precedence rule.

<- : define the new AND/OR precedence rule.

mw66 avatar Mar 30 '22 02:03 mw66

The current AND/OR precedence is a bug. If we can resolve that, it should be the only behavior. All of this is to say we should fix it before or alongside adding syntax compatibility with other PEG libraries.

It's also been a tricky bug to fix, even when Erik was actively developing this lib. I don't think it is prudent to maintain both behaviors for the sake of backwards compat with pre-1.0 versions of Parsimonious.

That said - there's still no clear owner for the issue. If someone is interested, though, it would be a high-utility improvement for Parsimonious!

lonnen avatar Mar 30 '22 02:03 lonnen

I don't think it is prudent to maintain both behaviors for the sake of backwards compat with pre-1.0 versions of Parsimonious.

I agree, maybe we need a breaking change version.

BTW, I found a working Python PEG parser here:

https://github.com/we-like-parsers/pegen/blob/main/data/python.gram

It uses ":".

mw66 avatar Mar 30 '22 02:03 mw66

It's also been a tricky bug to fix, even when Erik was actively developing this lib. I don't think it is prudent to maintain both behaviors for the sake of backwards compat with pre-1.0 versions of Parsimonious.

Totally disagree with this. Unless the upgrade path is extremely easy and foolproof (like a function that converts an old grammar string to an equivalent new one), this would be breaking backwards compatibility for pretty questionable reasons: complying with some other parsers used by other people who aren't already using the library.

As a user of (and contributor to) parsimonious, with many functional grammar files, what other libraries are doing isn't very relevant unless it gives some genuine functionality improvements. "pre-1.0" is sort of weak, since it is used in a lot of production systems.

The docs are pretty explicit. The README says:

I don't plan on making any backward-incompatible changes to the rule syntax in the future, so you can write grammars with confidence.

The comments on the grammar definition in the code say this: https://github.com/erikrose/parsimonious/blob/b6a6f5402fc370ffaa94dee2fac81ae4e0ab32e6/parsimonious/grammar.py#L216-L219

It does seem like supporting both syntaxes shouldn't be that bad. Ideally the changes would just be to the grammar, though it looks like https://github.com/erikrose/parsimonious/compare/master...lower-precedence-ors required some changes to the visitor as well. Maybe a transducer from the old syntax to the new syntax would be possible, or at least an interesting exercise.

That said - there's still no clear owner for the issue. If someone is interested, though, it would be a high-utility improvement for Parsimonious!

Now that Python 2 support has been dropped 🙌, I'm personally more interested in working on allowing parsing of bytes objects. However, I'd be very interested in helping to review / test changes to syntax or precedence. I'm very glad that this library is getting more development velocity now that you have commit access!

lucaswiman avatar Mar 30 '22 07:03 lucaswiman

@lucaswiman I appreciate the thoughtful comment! Would you mind putting it on #199? I cannot find an original issue for the bug, so I've made a new one, and I'd like to keep the discussion in that issue.

lonnen avatar Mar 30 '22 07:03 lonnen