[Feat. Req.] Reversing the parsing process

Open Coffee2CodeNL opened this issue 6 years ago • 14 comments

Issue type

  • Bug Report: No
  • Feature Request: Yes
  • Question: Yes
  • Not an issue: No

Prerequisites

  • Can you reproduce the issue?: Yes
  • Did you search the repository issues?: Yes
  • Did you check the forums?: What forums?
  • Did you perform a web search (google, yahoo, etc)?: Yep!

Description

Reversing the parsing process would be awesome.

Expected behavior:

Turning an object back into the format that was parsed.

Actual behavior:

I can only parse the format, not turn the object back into the parsed format.

Coffee2CodeNL avatar May 17 '18 19:05 Coffee2CodeNL

Hi. I don't understand the goal and context (why) of your request. A concrete sample could help.

Parsing recognizes a string as part of a language and, optionally, constructs an AST or evaluates (interprets) an expression, in the form: parse(text) --> result. In general, this parse() function is not guaranteed to be bijective. Therefore, there is no guarantee that you can reverse the parsing, since details can be discarded during parsing, for example.

Given an AST, you can generate text in the corresponding language. This is done as part of code generation, and you can use template engines to transform ASTs into source code. But this task is out of scope for a parsing library.
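As a minimal sketch of what "generating text from an AST" can look like without a template engine, the printer below walks a small arithmetic tree. The node shapes (`num`, `add`, `mul`) and the `print`/`wrap` helpers are assumptions made up for illustration; they are not something PEG.js produces or ships.

```javascript
// Hypothetical AST for arithmetic expressions; the node shapes are
// illustrative and not produced by any particular grammar.
function print(node) {
  switch (node.type) {
    case "num":
      return String(node.value);
    case "add":
      return `${print(node.left)} + ${print(node.right)}`;
    case "mul":
      return `${wrap(node.left)} * ${wrap(node.right)}`;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// Parenthesize an addition nested under a multiplication so that the
// printed text keeps the precedence encoded in the tree.
function wrap(node) {
  return node.type === "add" ? `(${print(node)})` : print(node);
}

const ast = {
  type: "mul",
  left: {
    type: "add",
    left: { type: "num", value: 1 },
    right: { type: "num", value: 2 },
  },
  right: { type: "num", value: 3 },
};

const source = print(ast);
console.log(source);
```

The printer has to pick its own spacing and parenthesization, so the output is a canonical form of the expression rather than whatever text was originally parsed.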

pjmolina avatar May 18 '18 10:05 pjmolina

Although personally, I understand the feeling of wanting the software to be able to generate what it can parse, as @pjmolina summarised, this is outside of the scope of this library (parser generator), so I will be closing this.

I suggest you just make your own code generator, or use an existing one for the language you are parsing (you'll have to make sure the AST is correct).

futagoza avatar May 19 '18 07:05 futagoza

This ticket makes a lot of sense. Which other library would you expect to convert ASTs to source code given the grammar in PEG.js syntax? Yes, currently there is no way to do that. Most of the AST is generated in actions, and it's impossible to revert them. But once there is syntax for AST generation, which we talked about in a different issue, it suddenly not only makes sense but is possible to implement.

reverofevil avatar May 19 '18 11:05 reverofevil

StringTemplate, for example, is one of the many tools you can use for such a task: code-gen from ASTs.

pjmolina avatar May 21 '18 07:05 pjmolina

I don't think StringTemplate reads grammars in PEG.js syntax.

reverofevil avatar May 21 '18 08:05 reverofevil

Context:

To revert back to the original text, an AST (Abstract Syntax Tree) would not be sufficient. For example, given a JavaScript grammar, these two statements would normally have the same AST even though their syntax is slightly different:

// statement 1
var x = 5

// statement 2 - with semicolon
var x = 5;

To be able to revert back to the original text the Parser's actions must be an Injective function.
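The point can be sketched with a toy parser. `parseVarDecl` below is a hypothetical, hand-written stand-in for a grammar rule plus its action (it is not generated by PEG.js); it accepts an optional trailing semicolon and drops it, so two different inputs map to the same result.

```javascript
// Toy hand-written parser for `var <name> = <number>` with an optional
// trailing semicolon. Hypothetical; it stands in for a grammar action.
function parseVarDecl(text) {
  const match = /^\s*var\s+([A-Za-z_$][\w$]*)\s*=\s*(\d+)\s*;?\s*$/.exec(text);
  if (match === null) {
    throw new SyntaxError(`Cannot parse: ${text}`);
  }
  // The semicolon and all whitespace are discarded here.
  return { type: "VarDecl", name: match[1], value: Number(match[2]) };
}

const a = parseVarDecl("var x = 5");
const b = parseVarDecl("var x = 5;");

// Both inputs yield structurally identical ASTs.
const sameAst = JSON.stringify(a) === JSON.stringify(b);
console.log(sameAst);
```

Because both strings produce the same object, no function of that object alone can tell which of the two inputs it came from.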

Suggestions:

I personally don't see how such a feature can be part of a parsing library that relies on embedded user actions to create the output structure.

Maybe in the future if pegjs would have some kind of automatic Parse Tree creation this would be feasible.

Build Your Own

Assuming that inside pegjs embedded actions the full position information is available you can insert your own custom embedded actions to build a CST / ParseTree and only transform it to an AST in a post parsing phase. Once you have a CST recreating the original input is fairly trivial...
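A rough sketch of that idea follows, assuming the actions record the exact matched text of every token, including whitespace and optional punctuation. The CST below is built by hand rather than by a generated parser, and its node shape and the `toSource`/`toAst` helpers are hypothetical names chosen for illustration.

```javascript
// Hypothetical CST: every leaf keeps the exact matched text, including
// whitespace and optional punctuation.
const cst = {
  type: "VarDecl",
  children: [
    { type: "keyword", text: "var" },
    { type: "ws", text: " " },
    { type: "identifier", text: "x" },
    { type: "ws", text: " " },
    { type: "punct", text: "=" },
    { type: "ws", text: " " },
    { type: "number", text: "5" },
    { type: "punct", text: ";" },
  ],
};

// Recreate the input by concatenating leaf text in document order.
function toSource(node) {
  return node.children ? node.children.map(toSource).join("") : node.text;
}

// Post-parse step: lower the CST to an AST, discarding trivia.
function toAst(node) {
  const leaf = (type) => node.children.find((child) => child.type === type);
  return {
    type: "VarDecl",
    name: leaf("identifier").text,
    value: Number(leaf("number").text),
  };
}

console.log(toSource(cst));
console.log(toAst(cst));
```

Keeping the CST around means edits can be made on the tree and printed back with the untouched parts of the input preserved exactly.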

Evaluate A Parsing library with automatic CST / Parse Tree creation

You can find several candidates here: https://tomassetti.me/parsing-in-javascript/

bd82 avatar May 21 '18 11:05 bd82

I've decided to take another look at this, but it will have to be post-v1 before it can be implemented because that will be the point where the API is stable and I can release a package that offers a basic common AST structure (e.g. @pegjs/ast?). This package could be used by parser developers to derive their own AST from, and if needed, can be used to translate it back to the source by another tool (I'm thinking @pegjs/reverse)

futagoza avatar Sep 20 '18 03:09 futagoza

@futagoza how's it going with this?

Coffee2CodeNL avatar Feb 13 '19 21:02 Coffee2CodeNL

@Coffee2CodeNL pegjs v1 is not out yet.

reverofevil avatar Feb 13 '19 23:02 reverofevil

Take this SVG transform parser in PEGJS for example: https://github.com/nidu/svg-transform-parser It would be fantastic to be able to reverse the process and transform the AST back to a string, so SVG transforms can be easily modified, not just parsed. One such tool that is similar is Augeas.
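For a format this small, the reverse direction can be sketched by hand. The AST shape below (an ordered list of `{ name, args }` entries) and the `serializeTransform` function are assumptions for illustration only; they are not the documented output or API of svg-transform-parser.

```javascript
// Hypothetical AST shape: an ordered list of { name, args } entries.
// This is NOT the documented output of svg-transform-parser.
function serializeTransform(ops) {
  return ops.map((op) => `${op.name}(${op.args.join(" ")})`).join(" ");
}

const ops = [
  { name: "translate", args: [10, 20] },
  { name: "rotate", args: [45] },
];

// Modify the parsed structure, then print it back as a transform string.
ops[0].args[0] += 5;
const transformText = serializeTransform(ops);
console.log(transformText);
```

A hand-written serializer like this has to be kept in sync with the grammar manually, which is the duplication this request is trying to avoid.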

strarsis avatar Sep 06 '20 14:09 strarsis

Can someone recommend some tools to generate code from an AST?

lzane avatar May 27 '21 02:05 lzane

@lzane Which AST?

reverofevil avatar May 27 '21 07:05 reverofevil

@polkovnikov-ph some customized AST generated by PEG

Is there any tool that reads the PEG grammar and can do the code generation job?

lzane avatar May 27 '21 07:05 lzane

PEG.js doesn't produce AST. Actions do. Generating text back from AST produced by arbitrary code is impossible. Even if a set of actions is limited, there is a problem with missing data (what should [ \t]+ produce?). At best, the library would provide functional lenses for code transformation, but this is far from what people do in JS, and more into lands of Haskell et al.

(On the other hand, you're completely right that it would be nice to have such a thing, and I even made some experiments to bring it to JS. The library is not out yet.)

reverofevil avatar Jun 01 '21 16:06 reverofevil