liquid icon indicating copy to clipboard operation
liquid copied to clipboard

Tag formatter/syntax fixer (WIP)

Open pushrax opened this issue 10 years ago • 9 comments

Problem

People have templates with broken syntax that is accepted by the lax parser but really makes no sense. We can't switch entirely to the strict parser because of this.

Potential Solution

Add a method to convert a Liquid pseudo-AST back into Liquid source. Then, to fix broken templates, this can be run on the lax-parsed AST and written back to the template source.

Notes

This has the side effect of reformatting everything, which is arguably good or bad. It only affects the inside of tags though, so the output should definitely be the same.

Tags that don't implement format will use the default, which is to just spit the markup back out. This is one of the motivations for #560 (unified syntax for tag arguments).

I'm not sure how to format for and tablerow, they accept arguments with commas and without. This is another reason for #560, there is no standard for how we treat arguments.

@fw42 @dylanahsmith @trishume thoughts? Any reason why this is a Bad Idea?

pushrax avatar May 12 '15 18:05 pushrax

:+1: Awesome work!

I've thought this was a good idea for a while so obviously no objections from me: https://vault.shopify.com/Liquid-Parsing-Quirks#_examples-of-parsing-weirdness_the-problem-why-this-is-bad_past-efforts_issues-with-fixing-the-problem_ideas-for-fixing-the-problem

Only thing is we have to be super careful about expressions because they don't have a strict parser and they are parsed at runtime rather than parse time so they don't have a pseudo-ast to format properly. The solution you have currently might work but I bet there are a lot of edge cases that can slip through. Example that might slip through this (haven't checked or looked at the code in a while) is something[[0]].

trishume avatar May 12 '15 20:05 trishume

@trishume actually expressions are parsed at parse-time since https://github.com/Shopify/liquid/pull/391 and https://github.com/Shopify/liquid/pull/443 :D

Definitely still some possible edge-cases though. The way I want to validate this is to format, reparse, and check that the ASTs are equal.

pushrax avatar May 12 '15 20:05 pushrax

Great! I've been out of the liquid loop for a while, good to know that problem has been fixed.

Speaking of me being out of the loop you should update the Liquid Parsing Quirks vault page with new info on the work and stats you got while doing the liquid-c work: https://vault.shopify.com/Liquid-Parsing-Quirks

I know that you did some parse tree comparisons on large samples of templates a while ago. The information I put on there about liquid-c is from back when we were attempting to make it 100% compatible with the lax parser :neutral_face:

trishume avatar May 12 '15 20:05 trishume

Hm I kinda feel like it would be cleaner if all those format methods would not be on the tags themselves. They are cluttering the code a bit, imho.

fw42 avatar May 13 '15 04:05 fw42

I think I would prefer this to be implemented in a more "unintrusive" way, as a separate class for example.

fw42 avatar May 13 '15 04:05 fw42

@fw42 I agree about it being separate. However, since parsing is implemented in the tags, and formatting is tightly coupled to the parse output, I think it is correct given the current architecture. Otherwise we need to either expose a bunch of ivars or use instance_variable_get. Ideally we would have a real AST and the parser would be separated completely, but :poop:

you should update the Liquid Parsing Quirks vault page

Definitely, good idea!

pushrax avatar May 13 '15 14:05 pushrax

So essentially this would be like gofmt auto formatting their liquid to have standard spacing in tags. It is an opinionated approach, but I think it would be fine.

Any reason why this is a Bad Idea?

The original intent of the code could be lost by fixing up the syntax to match what it actually does, although it might also make bugs more apparent, by making it more clear what it will actually do, so I think that would be fine.

We do need to be very careful about bugs, since this wouldn't be the type of thing that could easily be rolled back.

dylanahsmith avatar May 16 '15 19:05 dylanahsmith

@dylanahsmith this is about more than standard spacing like gofmt because of the fact that liquid's parser uses scan which ignores things that don't match. So when a tag like {% if true && false %} is parsed everything after the && will be ignored and not be put in the pseudo-ast. Then when it is formatted the output will be something like {% if true %} which does the exact same thing except doesn't include the invalid syntax.

trishume avatar May 16 '15 22:05 trishume

ya, I understand. That is what I meant by losing the original intent of the code.

dylanahsmith avatar May 17 '15 00:05 dylanahsmith