parsel icon indicating copy to clipboard operation
parsel copied to clipboard

Added Stringify Method

Open Pranomvignesh opened this issue 3 years ago • 7 comments

This pull request is an attempt to solve this issue https://github.com/LeaVerou/parsel/issues/27

I have constructed this by recursively computing the AST generated from parsel.parse

Changes done on other API

  • I have added a new option includeCommaInList in parsel.parse, as commas are present in the AST, thus making the stringification function to fail
  • I have also added the actualContent key for the node's content which are trimmed ( combinator and comma ) and this actualContent is added to complex type nodes as actualCombinator
  • I have added a new npm script named watch to make the development process easier

Pranomvignesh avatar Oct 24 '21 06:10 Pranomvignesh

@LeaVerou Thanks for updating the status of the merge request.

actualContent should probably be called raw, by convention from other parsers

As per your suggestion, key names like actualContent and actualCombinator is changed to raw

watch is welcome too, but in the future please open separate PRs for that sort of thing.

In future I will definitely open separate PRs for different features

Pranomvignesh avatar Jan 17 '22 19:01 Pranomvignesh

includeCommaInList should be a separate PR discussed on its own merit

Actually this is required for this stringify API itself.

just stringify each array element and .join(", ")

Stringification function cannot be done based on assumptions that there will be a single space following the comma. This suggestion might seem like it won't break the functionality of the css selector but what matters is the string output required by the user.

Use case for Comma Node Inclusion Parameter

Let us assume, I have to recreate the exact string written by the user from the AST Tree. Now, if the user makes some changes in the AST Tree,I want the stringified output to be the same in places where the user has not changed the AST Tree. If the stringified output is written in a file and if the output is not same, in places where it is expected to be same, then git diff will become a mess.

To give you a perspective on what happens if I remove includeCommaInList and add join method just like you suggested


/**
// Css Block in my usecase
.foo,
.bar,
.baz{
  color : red;
}
*/

var selector = `.foo,
.bar,
.baz`

var ast1 = parsel.parse(selector); // As per your suggestion

var ast2 = parsel.parse(selector, { includeCommaInList : true }); // With Comma node in the list

console.log(parsel.stringify(ast1));

// .foo, .bar, .baz

console.log(parsel.stringify(ast2));

/**
* .foo,
* .bar,
* .baz
*/

According to me, a stringify function must reconstruct the exact input string if nothing from the AST tree is modified. Moreover, If the user wants to remove those spaces, it will be easier to replace multiple spaces to a single space using a regex replace but it will be difficult to do the other way around. So Comma node is required in the AST Tree.

This option includeCommaInList is not just added for the sole reason of making the stringify function easier to code and for implementation convenience, I have added that boolean option, so that the parse outputs from the previous version and the new version with this feature remains the same. Only users who are going to use the stringify are asked to include this option and the AST Tree will have the Comma node for proper stringification.

If this option is not present and if I direct add an extra node in parsed output, it might break the previous version users due to backward incompatibility in outputs.

I hope, I'm clear and this detailed explanation for adding includeCommaInList option in the parse method is sufficient

Pranomvignesh avatar Jan 17 '22 20:01 Pranomvignesh

You do not need two stringify functions, one recursive and one that just calls the recursive function. You can do it in the same function.

@LeaVerou as per your suggestion, Combined both stringify functions

Do note that it's an important use case that this function needs to be able to stringify individual nodes correctly, as users may want to call this function to perform their own stringification after modifying a selector.

I have added some test cases ( as hidden ones ) which I used to ensure the proper stringification of individual nodes

Pranomvignesh avatar Jan 17 '22 20:01 Pranomvignesh

Reserialization is not about re-creating the exact same string, but an equivalent string that preserves the same functionality. E.g. when you do element.innerHTML or element.style.cssText you don't get the exact same HTML or CSS that was typed, but equivalent code that would produce the same result. Same with JSON.stringify(JSON.parse(obj)), whitespace is not preserved. If you want the exact same string, you may as well just store the entire string and save yourself the method call. Once you go through parsing/serialization, it's universally accepted that you're gonna lose some detail.

Also, serialization should be independent of parsing. You may be serializing data structures you didn't create. You cannot ask for specific parsing options to be able to serialize, you just serialize what you got, as best as you can. I could kind of accept the argument that IF that option was used then the result is even closer to the original selector, but to require the option to be able to stringify is just poor API design. 🤷🏽‍♀️

LeaVerou avatar Jan 17 '22 20:01 LeaVerou

element.innerHTML or element.style.cssText, JSON.stringify(JSON.parse(obj)), whitespace is not preserved

Instead of these APIs, comparing parsel with espree and escodegen will be more relevant comparison

If you want the exact same string, you may as well just store the entire string and save yourself the method call

I said, 'I want the stringified output to be the same in places where the user has not changed the AST Tree', like the stringify function should only modify the places where the user has changed anything and it should leave other parts as such, So saving the string won't provide me the functionality which I expect.


Also, serialization should be independent of parsing ... but to require the option to be able to stringify is just poor API design.

Again, I think you have misunderstood the purpose of the includeCommaInList option. Its main functionality is to prevent backward incompatibility issues.

Let us assume, a developer using parsel.parse method and he manipulates the parsed object using indexes in node of type list. So if this includeCommaInList option is not provided and a new node [ Comma Node] is added to the parse output, His code will break, leading to backward incompatibility issues. Why should a developer who is not using the stringify function be affected by this output change.

All of this could be avoided, if there was enough information present in the parser output to properly stringify the object. I can understand that parsel's objective was to lint the selector rather than stringifying it back to its original state.

So to prevent this problem, I used a boolean includeCommaInList option wrapping the Comma Node addition part, So that it won't affect the previous version parsel.parse users. Only users who are going to stringify has to include this option.


Once you go through parsing/serialization, it's universally accepted that you're gonna lose some detail.

So now there are only 2 options after removing includeCommaInList option

1 - Adding Comma node in the parser output so that the stringified output will be a near perfect string - Cons : this might have backward compatibility issues 2 - Using join method and recreating string with losses - Cons : the output won't be the same

@LeaVerou, please let me know which option to choose, Based on that I will modify the method and complete this merge request.

Pranomvignesh avatar Jan 18 '22 19:01 Pranomvignesh

Sure, you want to compare it to JS parsers? Run esprima.parse("var i= [1, \t 2,]") in the console here. You will see that the AST is identical to that produced for esprima.parse("var i = [1, 2]").

I do understand why you have added this parameter, you really don't need to explain it multiple times to me. 😀 I'm just saying it's not a good idea to litter the AST with useless cosmetic tokens for the sole purpose of stringifying it later preserving everything exactly. Mos stringification use cases do not require this. It's also a fool's errand: even with all these changes, you are still not preserving it exactly, down to the indexes. E.g. consider ".foo .bar" vs ".foo \t .bar".

I also don't think the other tokens you added are needed (e.g. actualContent), for the same reason. Stringification needs to produce equivalent code, not identical code down to the whitespace.

Let us assume, a developer using parsel.parse method and he manipulates the parsed object using indexes in node of type list. His code will break, leading to backward incompatibility issues.

As a side point, I understand English may not be your first language, you may find the singular they helpful for referring to unknown entities in a generic way.

LeaVerou avatar Jan 19 '22 13:01 LeaVerou

Sure, you want to compare it to JS parsers? Run esprima.parse("var i= [1, \t 2,]") in the console here. You will see that the AST is identical to that produced for esprima.parse("var i = [1, 2]").

I know espree won't preserve whitespace. I said that if you are comparing parsel, compare it with similar parsers. it will be more relatable with AST tree.

it's not a good idea to litter the AST with useless cosmetic tokens

This depends on the usecase of the user. I can accept that most users ( not everyone ) will be ok with your version of string reconstruction, but litter and useless cosmetic is not a term I would use to refer those keys in the AST.

I also don't think the other tokens you added are needed (e.g. actualContent), for the same reason. Stringification needs to produce equivalent code, not identical code down to the whitespace.

I will remove those.

As a side point, I understand English may not be your first language, you may find the singular they helpful for referring to unknown entities in a generic way.

Sure. Thanks for pointing it out, In future I will refer in generic way.

Pranomvignesh avatar Jan 19 '22 14:01 Pranomvignesh

Closed in favor of https://github.com/LeaVerou/parsel/pull/68

jrandolf-2 avatar Mar 16 '23 00:03 jrandolf-2