highlight icon indicating copy to clipboard operation
highlight copied to clipboard

Add Antlers support

Open SRWieZ opened this issue 1 year ago β€’ 14 comments

Hi Brendt! Love your work with console and highlight

Here is my contribution to support Antlers template language of Statamic.

It was pretty fun to code. I like that way we can write tests in attributes. I reused some Patterns from Xml and PHP, hope it's ok.

I noticed that there is no CSS for operators, is that a bug ?

❓Need help on one edge case

I tried several things to fix this bug without success. Is there a way to prevent injection into another injection ? One thing I didn't tried is to put #[After] on AntlersCommentInjection and remove other injections from $content, is this the way ?

image

πŸ“‹ Before merging

  • [ ] Fix antlers comments having antlers tags being highlighted
  • [x] Write more tests
  • [x] If #123 passes there is a little changes to make

Until you have time to review, here is a little preview πŸ˜ƒ

🏞️ Preview

http://localhost:8080/?target=targets/antlers.md

image

SRWieZ avatar May 16 '24 15:05 SRWieZ

Pull Request Test Coverage Report for Build 9115368811

Details

  • 27 of 186 (14.52%) changed or added relevant lines in 15 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-8.6%) to 86.815%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/Languages/Antlers/Patterns/AntlersTagParameter.php 1 2 50.0%
src/Languages/Antlers/Patterns/ModifierPattern.php 1 2 50.0%
src/Languages/Antlers/Patterns/NumberPattern.php 1 2 50.0%
src/Languages/Antlers/Patterns/VariablePattern.php 1 2 50.0%
src/Languages/Antlers/Injections/AntlersTagInjection.php 1 4 25.0%
src/Languages/Antlers/Patterns/KeywordPattern.php 0 3 0.0%
src/Languages/Antlers/Injections/AntlersCommentInjection.php 0 4 0.0%
src/Languages/Antlers/Patterns/OperatorPattern.php 0 4 0.0%
src/Languages/Antlers/AntlersCommentLanguage.php 0 5 0.0%
src/Languages/Antlers/AntlersTagLanguage.php 0 136 0.0%
<!-- Total: 27 186
Totals Coverage Status
Change from base Build 9109618163: -8.6%
Covered Lines: 1521
Relevant Lines: 1752

πŸ’› - Coveralls

coveralls avatar May 22 '24 06:05 coveralls

Need help on one edge case

This happens because of how patterns and injections are parsed, the problem actually isn't in the comment injection or pattern, it's in AntlersTagLanguage. Because this is a sub-language, it doesn't know about its scope. Normally, the highlighter will make sure that no tokens can live within a block that's already a comment, but because AntlersTagLanguage is injected, it negates all those checks.

The solution would be to keep the comment pattern and all patterns related to AntlersTagLanguage together in one language. My proposal would be to have AntlersLanguage, and ditch AntlersTagLanguage. That means you can also ditch AntlersCommentLanguage, since it's already covered with the CommentPattern. The only two injections you'll need to keep are AntlersEchoInjection and AntlersPhpInjection, since those two are actually injecting separate languages.

I noticed that there is no CSS for operators, is that a bug ?

I actually plan on merging #123 soon, so I guess you'll want to wait and apply patterns for those newly added tokens as well?

Let me know if you need more help! You can also join our Discord to talk in real time :)

brendt avatar May 22 '24 06:05 brendt

FYI https://github.com/tempestphp/highlight/pull/123 has been merged :)

brendt avatar May 23 '24 10:05 brendt

Thanks for your answers!

I'm already on the discord but I'm only opening Discord when someone needs me there.

I will join the vocal channel when I reopen this PR. It may be only next week thought.

Should I convert it as draft in the meantime ?

SRWieZ avatar May 23 '24 13:05 SRWieZ

Yes it ca be draft :)

brendt avatar May 24 '24 03:05 brendt

Hi @brendt πŸ™‚,

I tried with the suggested approach and resolved the comment issue but this came with a lot more complexity.

  1. For every pattern I have to make sure that it only captures what's in {{ }} which makes me write ugly pattern like {{(?:(?!{{|}}).)*?(?<match>{$quoted})(?:(?!{{|}}).)*?}}
  2. I also have a hard time matching a pattern twice in the same Antlers tag '/{{[^{}]*\|\s*(?<match>\w+)(?:\([^)]*\))?([^{}]*\|\s*(?<match>\w+)(?:\([^)]*\))?)*[^{}]*}}/' this obviously does not work because I can't declare two "match" matching group.

I'm sorry, It's difficult to explain with words, reading my tests including the ones that I commented should give you more insights.

I think Antlers tag should be an injected language, same has Blade is injecting php in a blade directive. @if($variable < number)

It's may be easier to find a way around the comment injection being prioritised πŸ˜…

Also, I feel like I've done something wrong by rebasing the PR. I would reopen a new PR if needed.

SRWieZ avatar May 28 '24 15:05 SRWieZ

Also, I feel like I've done something wrong by rebasing the PR. I would reopen a new PR if needed.

Makes sense πŸ‘

I tried with the suggested approach and resolved the comment issue but this came with a lot more complexity.

Gotcha. Blade is different in that most tags are written with @ and not between {{ brackets. Are you sure you can't write this:

{{(?:(?!{{|}}).)*?(?<match>{$quoted})(?:(?!{{|}}).)*?}}

As something more like this?

{{\s*(?<match>{$quoted})(?:(?!{{|}}).)\s*}}

brendt avatar May 29 '24 03:05 brendt

Unfortunately not, it's slightly worse.

This $quoted example is to match Operators like "=>" so it will need to catch more than whitespace and in this example, it will only match the second tag because it will only see this example as one tag: {{ ab => cd }} should only match last tag => {{ c=>d }}

And it doesn't help my second point that is my biggest concern.

Here is my current regex and where it fails: https://regex101.com/r/gMpPJU/1

I want it to match every "=>" on those lines

{{ if a => b && c => d }} 
{{ {if a => b} && {c => d}  }}

I can write all the failing tests and maybe you can help me TDD this πŸ˜„

The more I think about it, the more I think it has to be a sub language because I don't think we can do a recursive pattern in PHP PCRE without doing 2 successive preg_match_all

SRWieZ avatar May 29 '24 15:05 SRWieZ

The more I think about it, the more I think it has to be a sub language because I don't think we can do a recursive pattern in PHP PCRE without doing 2 successive preg_match_all

Yeah you're probably right… but then we're stuck with the comments. However, we could fix comments like this:

  • match everything between {{ and }} as a separate Antlers sub language β€” name tbd
  • Comments will be matched as well # comment #, we can write a pattern for this within that sub language

I think that should fix all problems.

Sorry for going back and forth on this, it's definitely one of the more difficult languages to parse solely with regex…

PS: matching for that sub language should be as easy as {{(?<match>.*?)}} β€” I think.

brendt avatar May 30 '24 08:05 brendt

No worries, I knew from the beginning that it was not an easy one πŸ˜„

So I go back to injecting AntlersTagLanguage() next to AntlersPhpInjection() and AntlersEchoInjection(), but this time comment pattern should be in the sub language?

https://github.com/tempestphp/highlight/pull/124/commits/749bc421e8ec7b5adbf170db021db47c37485432#diff-e3e4595857edce00b02aa748801fcc680a123f0c105c0be5dd4d182b3f7f5469

Funny thing: Statamic was parsing Antlers with Regex in version 3, deprecated in version 4, recently removed in version 5. I'm sure it's for entirely different reasons, but worth noting.

SRWieZ avatar May 30 '24 09:05 SRWieZ

Does the refactor work?

brendt avatar May 30 '24 10:05 brendt

Sorry didn't do any changes, I referenced an old commit. It was working, except the comment thing.

SRWieZ avatar May 30 '24 10:05 SRWieZ

Ok I went back to the first version with the suggested changes. It works very well and is much more readable and efficient, except for the comments.

The current version renders like this, which I find even uglier: Capture d’écran 2024-06-02 aΜ€ 00 37 43

The first version was better I think. This example of the comment embedding a tag should not happen often.

I spent way too many hours on this, going in circles. I give up here. πŸ˜… I made one last effort writing lots of tests to help anyone who would like to take over. Including two failing tests.

SRWieZ avatar Jun 01 '24 22:06 SRWieZ

I'll try to work on this next week and see what I can come up with :)

brendt avatar Jun 07 '24 08:06 brendt

Currently not gonna work on this, but might revisit later

brendt avatar Aug 30 '24 06:08 brendt