parsel
parsel copied to clipboard
Added Stringify Method
This pull request is an attempt to solve this issue https://github.com/LeaVerou/parsel/issues/27
I have constructed this by recursively computing the AST generated from parsel.parse
Changes done on other API
- I have added a new option
includeCommaInList
inparsel.parse
, as commas are present in the AST, thus making the stringification function to fail - I have also added the
actualContent
key for the node's content which are trimmed ( combinator and comma ) and thisactualContent
is added tocomplex
type nodes asactualCombinator
- I have added a new npm script named
watch
to make the development process easier
@LeaVerou Thanks for updating the status of the merge request.
actualContent should probably be called raw, by convention from other parsers
As per your suggestion, key names like actualContent and actualCombinator is changed to raw
watch is welcome too, but in the future please open separate PRs for that sort of thing.
In future I will definitely open separate PRs for different features
includeCommaInList should be a separate PR discussed on its own merit
Actually this is required for this stringify API itself.
just stringify each array element and .join(", ")
Stringification function cannot be done based on assumptions that there will be a single space following the comma. This suggestion might seem like it won't break the functionality of the css selector but what matters is the string output required by the user.
Use case for Comma Node Inclusion Parameter
Let us assume, I have to recreate the exact string written by the user from the AST Tree. Now, if the user makes some changes in the AST Tree,I want the stringified output to be the same in places where the user has not changed the AST Tree. If the stringified output is written in a file and if the output is not same, in places where it is expected to be same, then git diff will become a mess.
To give you a perspective on what happens if I remove includeCommaInList
and add join method just like you suggested
/**
// Css Block in my usecase
.foo,
.bar,
.baz{
color : red;
}
*/
var selector = `.foo,
.bar,
.baz`
var ast1 = parsel.parse(selector); // As per your suggestion
var ast2 = parsel.parse(selector, { includeCommaInList : true }); // With Comma node in the list
console.log(parsel.stringify(ast1));
// .foo, .bar, .baz
console.log(parsel.stringify(ast2));
/**
* .foo,
* .bar,
* .baz
*/
According to me, a stringify function must reconstruct the exact input string if nothing from the AST tree is modified. Moreover, If the user wants to remove those spaces, it will be easier to replace multiple spaces to a single space using a regex replace but it will be difficult to do the other way around. So Comma node is required in the AST Tree.
This option includeCommaInList
is not just added for the sole reason of making the stringify function easier to code and for implementation convenience, I have added that boolean option, so that the parse outputs from the previous version and the new version with this feature remains the same. Only users who are going to use the stringify are asked to include this option and the AST Tree will have the Comma node for proper stringification.
If this option is not present and if I direct add an extra node in parsed output, it might break the previous version users due to backward incompatibility in outputs.
I hope, I'm clear and this detailed explanation for adding includeCommaInList
option in the parse method is sufficient
You do not need two stringify functions, one recursive and one that just calls the recursive function. You can do it in the same function.
@LeaVerou as per your suggestion, Combined both stringify functions
Do note that it's an important use case that this function needs to be able to stringify individual nodes correctly, as users may want to call this function to perform their own stringification after modifying a selector.
I have added some test cases ( as hidden ones ) which I used to ensure the proper stringification of individual nodes
Reserialization is not about re-creating the exact same string, but an equivalent string that preserves the same functionality.
E.g. when you do element.innerHTML
or element.style.cssText
you don't get the exact same HTML or CSS that was typed, but equivalent code that would produce the same result. Same with JSON.stringify(JSON.parse(obj))
, whitespace is not preserved. If you want the exact same string, you may as well just store the entire string and save yourself the method call. Once you go through parsing/serialization, it's universally accepted that you're gonna lose some detail.
Also, serialization should be independent of parsing. You may be serializing data structures you didn't create. You cannot ask for specific parsing options to be able to serialize, you just serialize what you got, as best as you can. I could kind of accept the argument that IF that option was used then the result is even closer to the original selector, but to require the option to be able to stringify is just poor API design. 🤷🏽♀️
element.innerHTML or element.style.cssText, JSON.stringify(JSON.parse(obj)), whitespace is not preserved
Instead of these APIs, comparing parsel with espree and escodegen will be more relevant comparison
If you want the exact same string, you may as well just store the entire string and save yourself the method call
I said, 'I want the stringified output to be the same in places where the user has not changed the AST Tree', like the stringify function should only modify the places where the user has changed anything and it should leave other parts as such, So saving the string won't provide me the functionality which I expect.
Also, serialization should be independent of parsing ... but to require the option to be able to stringify is just poor API design.
Again, I think you have misunderstood the purpose of the includeCommaInList
option. Its main functionality is to prevent backward incompatibility issues.
Let us assume, a developer using parsel.parse method and he manipulates the parsed object using indexes in node of type list. So if this includeCommaInList
option is not provided and a new node [ Comma Node] is added to the parse output, His code will break, leading to backward incompatibility issues. Why should a developer who is not using the stringify function be affected by this output change.
All of this could be avoided, if there was enough information present in the parser output to properly stringify the object. I can understand that parsel's objective was to lint the selector rather than stringifying it back to its original state.
So to prevent this problem, I used a boolean includeCommaInList
option wrapping the Comma Node addition part, So that it won't affect the previous version parsel.parse
users. Only users who are going to stringify has to include this option.
Once you go through parsing/serialization, it's universally accepted that you're gonna lose some detail.
So now there are only 2 options after removing includeCommaInList
option
1 - Adding Comma node in the parser output so that the stringified output will be a near perfect string - Cons : this might have backward compatibility issues 2 - Using join method and recreating string with losses - Cons : the output won't be the same
@LeaVerou, please let me know which option to choose, Based on that I will modify the method and complete this merge request.
Sure, you want to compare it to JS parsers? Run esprima.parse("var i= [1, \t 2,]")
in the console here. You will see that the AST is identical to that produced for esprima.parse("var i = [1, 2]")
.
I do understand why you have added this parameter, you really don't need to explain it multiple times to me. 😀 I'm just saying it's not a good idea to litter the AST with useless cosmetic tokens for the sole purpose of stringifying it later preserving everything exactly. Mos stringification use cases do not require this. It's also a fool's errand: even with all these changes, you are still not preserving it exactly, down to the indexes. E.g. consider ".foo .bar"
vs ".foo \t .bar"
.
I also don't think the other tokens you added are needed (e.g. actualContent
), for the same reason. Stringification needs to produce equivalent code, not identical code down to the whitespace.
Let us assume, a developer using parsel.parse method and he manipulates the parsed object using indexes in node of type list. His code will break, leading to backward incompatibility issues.
As a side point, I understand English may not be your first language, you may find the singular they helpful for referring to unknown entities in a generic way.
Sure, you want to compare it to JS parsers? Run esprima.parse("var i= [1, \t 2,]") in the console here. You will see that the AST is identical to that produced for esprima.parse("var i = [1, 2]").
I know espree won't preserve whitespace. I said that if you are comparing parsel, compare it with similar parsers. it will be more relatable with AST tree.
it's not a good idea to litter the AST with useless cosmetic tokens
This depends on the usecase of the user. I can accept that most users ( not everyone ) will be ok with your version of string reconstruction, but litter and useless cosmetic is not a term I would use to refer those keys in the AST.
I also don't think the other tokens you added are needed (e.g. actualContent), for the same reason. Stringification needs to produce equivalent code, not identical code down to the whitespace.
I will remove those.
As a side point, I understand English may not be your first language, you may find the singular they helpful for referring to unknown entities in a generic way.
Sure. Thanks for pointing it out, In future I will refer in generic way.
Closed in favor of https://github.com/LeaVerou/parsel/pull/68