pegjs
[Feat. Req.] Reversing the parsing process
Issue type
- Bug Report: No
- Feature Request: Yes
- Question: Yes
- Not an issue: No
Prerequisites
- Can you reproduce the issue?: Yes
- Did you search the repository issues?: Yes
- Did you check the forums?: What forums?
- Did you perform a web search (google, yahoo, etc)?: Yep!
Description
Being able to reverse the parsing process would be awesome.
Expected behavior:
Turning an object back into the format that was parsed.
Actual behavior:
I can only parse the format, not turn the object back into the parsed format.
Hi. I don't understand the goal and context (why) of your request. A concrete sample could help.
Parsing recognizes a string as part of a language and, optionally, constructs an AST or evaluates (interprets) an expression, in the form: parse(text) --> result
This parse() function is, in general, not guaranteed to be a bijective function. Therefore, there is no guarantee that you can reverse the parsing, since details can be discarded during parsing, for example.
Given an AST you can generate expressions into text corresponding to a language. This is done as part of code generation and you can use template engines to transform ASTs into source code. But this task is out of scope of a parsing library.
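As a rough illustration of what such a code generation step looks like, here is a minimal sketch of a hand-written printer. The AST shape (`num` and `bin` nodes) and the `generate` function are hypothetical, made up for this example; they are not something PEG.js produces or provides.

```javascript
// Hypothetical AST shape (an assumption for this sketch):
//   { type: "num", value: <number> }
//   { type: "bin", op: <string>, left: <node>, right: <node> }
function generate(node) {
  switch (node.type) {
    case "num":
      return String(node.value);
    case "bin":
      // The AST does not record the original grouping or spacing,
      // so the printer picks one canonical form: always parenthesize.
      return "(" + generate(node.left) + " " + node.op + " " + generate(node.right) + ")";
    default:
      throw new Error("Unknown node type: " + node.type);
  }
}

const ast = {
  type: "bin",
  op: "+",
  left: { type: "num", value: 1 },
  right: {
    type: "bin",
    op: "*",
    left: { type: "num", value: 2 },
    right: { type: "num", value: 3 },
  },
};

const generated = generate(ast);
console.log(generated);
```

The printer has to be written by hand against the AST, independently of the grammar, which is why it sits outside the parser generator itself.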
Although personally, I understand the feeling of wanting the software to be able to generate what it can parse, as @pjmolina summarised, this is outside of the scope of this library (parser generator), so I will be closing this.
I suggest you just make your own code generator, or use an existing one for the language you are parsing (you'll have to make sure the AST is correct).
This ticket makes a lot of sense. Which other library would you expect to convert ASTs to source code given a grammar in PEG.js syntax? Yes, currently there is no way to do that. Most of the AST is generated in actions, and it's impossible to revert them. But once there is syntax for AST generation, which we talked about in a different issue, it suddenly not only makes sense but also becomes possible to implement.
StringTemplate, for example, is one of the many tools you can use for such task: code-gen from ASTs.
I don't think StringTemplate reads grammars in PEG.js syntax.
Context:
To revert to the original text, an AST (Abstract Syntax Tree) would not be sufficient. For example, given a JavaScript grammar, these two statements would normally have the same AST even though their syntax is slightly different:
// statement 1
var x = 5
// statement 2 - with semicolon
var x = 5;
To be able to revert to the original text, the parser's actions must form an injective function, so that two different inputs never produce the same output.
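To make the point concrete, here is a minimal sketch using a toy hand-written parser rather than a PEG.js-generated one. The `parseVar` function and the shape of the object it returns are assumptions for illustration only.

```javascript
// Hypothetical toy parser for `var <name> = <number>` with an optional semicolon.
function parseVar(text) {
  const m = /^var\s+([A-Za-z_$][\w$]*)\s*=\s*(\d+)\s*;?\s*$/.exec(text);
  if (m === null) {
    throw new SyntaxError("Unexpected input: " + text);
  }
  // The semicolon and the whitespace are matched but never stored,
  // so this information is gone once the AST is built.
  return { type: "VariableDeclaration", name: m[1], value: Number(m[2]) };
}

const withoutSemicolon = JSON.stringify(parseVar("var x = 5"));
const withSemicolon = JSON.stringify(parseVar("var x = 5;"));

// Two different inputs, one AST: the mapping is not injective.
const sameAst = withoutSemicolon === withSemicolon;
console.log(sameAst);
```

Given only the resulting object, there is no way to tell which of the two statements was the input.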
Suggestions:
I personally don't see how such a feature can be part of a parsing library that relies on embedded user actions to create the output structure.
Maybe in the future, if pegjs had some kind of automatic Parse Tree creation, this would be feasible.
Build Your Own
Assuming that the full position information is available inside pegjs embedded actions, you can insert your own custom embedded actions to build a CST / Parse Tree and only transform it into an AST in a post-parsing phase. Once you have a CST, recreating the original input is fairly trivial...
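Here is a minimal sketch of why a CST makes the reverse direction easy. It does not use pegjs at all: the `toCst` and `unparse` functions and the flat leaf structure are hypothetical, and a regex-based tokenizer stands in for the embedded actions that would record each matched piece of text with its position.

```javascript
// Hypothetical CST: a flat list of leaves, each keeping its exact source
// text and offsets. Whitespace and punctuation are kept as leaves too.
function toCst(input) {
  const leaves = [];
  // Every character is either whitespace, a word character, or something else,
  // so the three alternatives together cover the whole input.
  const re = /\s+|\w+|[^\s\w]/g;
  let m;
  while ((m = re.exec(input)) !== null) {
    leaves.push({ text: m[0], start: m.index, end: m.index + m[0].length });
  }
  return { type: "Program", children: leaves };
}

// Recreating the input is just concatenating the leaves in order.
function unparse(cst) {
  return cst.children.map((leaf) => leaf.text).join("");
}

const source = "var x = 5;";
const roundTripped = unparse(toCst(source));
console.log(roundTripped === source);
```

Because nothing is discarded, modifying a single leaf and calling `unparse` again leaves the rest of the text, including spacing, untouched.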
Evaluate A Parsing library with automatic CST / Parse Tree creation
You can find several candidates here: https://tomassetti.me/parsing-in-javascript/
I've decided to take another look at this, but it will have to be post-v1 before it can be implemented because that will be the point where the API is stable and I can release a package that offers a basic common AST structure (e.g. @pegjs/ast?). This package could be used by parser developers to derive their own AST from, and if needed, can be used to translate it back to the source by another tool (I'm thinking @pegjs/reverse).
@futagoza how's it going with this?
@Coffee2CodeNL pegjs v1 is not out yet.
Take this SVG transform parser in PEGJS for example: https://github.com/nidu/svg-transform-parser It would be fantastic to be able to reverse the process and transform the AST back to a string, so SVG transforms can be easily modified, not just parsed. One such tool that is similar is Augeas.
Can someone recommend some tools to generate code from an AST?
@lzane Which AST?
@polkovnikov-ph some customized AST generated by PEG
Is there any tool that reads the PEG grammar and can do the code generation job?
PEG.js doesn't produce AST. Actions do. Generating text back from AST produced by arbitrary code is impossible. Even if a set of actions is limited, there is a problem with missing data (what should [ \t]+ produce?). At best, the library would provide functional lenses for code transformation, but this is far from what people do in JS, and more into lands of Haskell et al.
(On the other hand, you're completely right that it would be nice to have such a thing, and I even made some experiments to bring it to JS. The library is not out yet.)