logos Possibility of using logos' core

Hi! First of all, thanks for writing this super cool crate and making Rust's ecosystem more robust.

I'm the original developer of pest and was wondering what would be the best way to take advantage of some of the technology used in logos. It seems like a lot of the simpler production rules could take advantage of a similar tree approach. I'm also working on a higher-level framework to improve pest's grammar compilation, but I have yet to decide on an intermediate representation that would deliver the best results.

This is why I'm curious whether you would be interested in perhaps separating the derive into a more generic core crate that would be reusable in other projects. The goal here is to offer a good experience for people new to parsing and to make as much of the technology we write here in the Rust community reusable and well-integrated, so any other ideas or feedback would be most appreciated.

Nov 26 '18 10:11 dragostis

Hey! Thanks a lot, when I first saw Pest I was really blown away by how easy it is to use. I got to talk to a couple of people using it, some of whom are beginning Rustaceans, and their experience is great. Pest is definitely one of those crates that helps drive Rust adoption!

As for extracting the pattern-tree-resolution engine into a core crate - absolutely! I'd be happy to collaborate and see what the requirements of Pest are. I reckon there might be some friction because I'm using Regex syntax as input, but then the tree actually doesn't look anything like Regex, so it should be possible to make it general enough not to have to change the syntax of .pest files.

Nov 26 '18 13:11 maciejhirsz

There are just a few differences between PEG (which pest uses) and a Regex engine:

everything matches eagerly; parse trees produced are always deterministic
lookaheads
named production rules

Maybe there is a simple enough way to make the parsing strategy generic, so as to be able to use eager matching. Adding lookaheads should be straightforward. As for the tagged rules, that can be left for pest to handle; everything below it can be highly optimized eager Regex.

Nov 26 '18 14:11 dragostis

Actually, lookaheads will probably be the only tricky part.

Everything matching eagerly is how Logos works atm (.*? will fail to compile), so that's fine. Named production rules should be very easy to do by just swapping my Token markers with a generic.

I'll give extracting the tree resolution stuff into a crate a go this week, then we can try to fit it into pest and see what changes are needed. I should also do some reading of the pest source code to get a better understanding of what it's doing (I have some assumptions, but assumptions tend to suck).

Nov 26 '18 15:11 maciejhirsz

pest's backend is not very stable right now. I'm planning to write an RFC in pest detailing how the next version (3.0) will use an intermediate representation in order to do high-level optimizations, then have this IR generate Rust code that does the parsing. Once this IR is defined, it will be very easy to know exactly what is needed.

Nov 26 '18 16:11 dragostis

After a bit more research, I think the best approach is to have pest deal with lookaheads itself, maybe optimize them statically if it can. Thus, the kind of IR Logos could help with would be:

UTF-8 strings
case-insensitive UTF-8 strings
UTF-8 character ranges
sequences like ab
ordered choices a|b, where b is matched only if a fails
eager bounded repetitions a{0,n}
eager unbounded repetitions a*
any character .
inversions [^a-z]

pest 3.0 should be able to statically optimize most expressions and inline most rules, so that the work Logos will do is clear and precise. The Logos core should be able to start parsing from a specific index in a string and return an Option<usize> with the match position.
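
To make the list above concrete, here is a minimal sketch of what such an IR and its matching entry point might look like. Everything named here (`Expr`, its variants, `match_at`) is hypothetical and is not an existing pest or Logos API. The sketch also assumes that "match position" means the byte offset just past the match, and it uses simple per-character lowercasing for the case-insensitive variant rather than full Unicode case folding.

```rust
/// Hypothetical IR node covering the constructs listed above.
/// These names are illustrative only, not pest or Logos types.
#[derive(Debug, Clone)]
pub enum Expr {
    Str(String),                      // UTF-8 string
    Insens(String),                   // case-insensitive UTF-8 string
    Range(char, char),                // inclusive character range
    Seq(Vec<Expr>),                   // sequence: ab
    Choice(Vec<Expr>),                // ordered choice: a|b
    Repeat(Box<Expr>, Option<usize>), // a{0,n}, or a* when the bound is None
    Any,                              // any character: .
    NotRange(Vec<(char, char)>),      // inversion: [^a-z]
}

/// Eagerly match `expr` against `input` starting at byte offset `start`.
/// Returns the byte offset just past the match (an assumption about what
/// "match position" means), or None if there is no match.
pub fn match_at(expr: &Expr, input: &str, start: usize) -> Option<usize> {
    // None if `start` is out of bounds or not on a char boundary.
    let rest = input.get(start..)?;
    match expr {
        Expr::Str(s) => rest.starts_with(s.as_str()).then(|| start + s.len()),
        Expr::Insens(s) => {
            // Simplified case-insensitive comparison, one char at a time.
            let mut end = start;
            let mut chars = rest.chars();
            for expected in s.chars() {
                let actual = chars.next()?;
                if !actual.to_lowercase().eq(expected.to_lowercase()) {
                    return None;
                }
                end += actual.len_utf8();
            }
            Some(end)
        }
        Expr::Range(lo, hi) => {
            let c = rest.chars().next()?;
            (*lo <= c && c <= *hi).then(|| start + c.len_utf8())
        }
        // Each item must match where the previous one ended.
        Expr::Seq(items) => items.iter().try_fold(start, |pos, e| match_at(e, input, pos)),
        // Ordered choice: the first alternative that matches wins.
        Expr::Choice(alts) => alts.iter().find_map(|e| match_at(e, input, start)),
        Expr::Repeat(inner, max) => {
            // Eager repetition: consume as much as possible, never backtrack.
            let mut pos = start;
            let mut count = 0;
            while max.map_or(true, |n| count < n) {
                match match_at(inner, input, pos) {
                    // Require progress so empty matches cannot loop forever.
                    Some(next) if next > pos => {
                        pos = next;
                        count += 1;
                    }
                    _ => break,
                }
            }
            Some(pos)
        }
        Expr::Any => rest.chars().next().map(|c| start + c.len_utf8()),
        Expr::NotRange(ranges) => {
            let c = rest.chars().next()?;
            let excluded = ranges.iter().any(|(lo, hi)| *lo <= c && c <= *hi);
            (!excluded).then(|| start + c.len_utf8())
        }
    }
}
```

With this sketch, an identifier-like rule such as `Seq([Choice([Range('a','z'), Str("_")]), Repeat(Range('a','z'), None)])` matched against "let foo = 1" from index 4 returns Some(7), and `Choice([Str("a"), Str("ab")])` on "ab" returns Some(1) because the first alternative wins. A tree-based engine would compile these nodes ahead of time instead of interpreting them; the interpreter is only meant to pin down the semantics.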

Dec 25 '18 13:12 dragostis

Hello @dragostis, I am trying to keep this project alive, and wanted to check whether this issue / feature request is still wanted?

Thanks :-)

Feb 07 '24 11:02 jeertmans

At this point, I'm out of the loop and have no time to invest in this, so I'm closing this issue.

Feb 07 '24 13:02 dragostis
