pomsky Deprecate `<%` and `%>`

Status Quo

The meaning of <% and %> is not intuitive, and some people find it confusing: one person suggested that <% should indicate the end of the string, because the < angle should point towards the string, and vice versa for %>. Even worse, in right-to-left (RTL) scripts the directions are reversed, so it doesn't make sense to speak about the "left" and "right" end of a string, only its start and end.

In the latest version, two built-in variables, Start and End, were added that are currently aliases for <% and %>.

Solution

Deprecate <% and %>. Suggest using Start and End instead.

Deprecation schedule

[x] Add the built-ins Start and End
[x] Update the documentation to recommend these built-ins
[x] Emit a deprecation warning in the CLI and the playground when the old syntax is used
[x] Remove documentation for the old syntax
[ ] Remove the old syntax in the next breaking release [target: 0.7]
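
For anyone who wants to migrate existing expressions ahead of the removal, the replacement can be sketched as a small script. This is only a rough illustration, not part of Pomsky: the function name `migrate_anchors` is made up, and it assumes that string literals are written as '...' (no escapes) or "..." (backslash escapes) and that comments run from # to the end of the line. It leaves `<%` and `%>` inside strings and comments untouched.

```python
import re

# Assumption: strings are '...' (no escapes) or "..." (backslash escapes),
# and comments start with '#' and run to the end of the line.
_TOKEN = re.compile(
    r"""
    (?P<skip>
        '[^']*'                 # single-quoted string
      | "(?:\\.|[^"\\])*"       # double-quoted string with escapes
      | \#[^\n]*                # comment
    )
  | (?P<start><%)               # old start anchor
  | (?P<end>%>)                 # old end anchor
    """,
    re.VERBOSE,
)


def _is_word_char(ch: str) -> bool:
    return ch.isalnum() or ch == "_"


def migrate_anchors(source: str) -> str:
    """Replace <% with Start and %> with End outside strings and comments."""

    def replace(match: re.Match) -> str:
        if match.group("skip") is not None:
            return match.group("skip")  # strings and comments stay as they are
        word = "Start" if match.group("start") else "End"
        before = source[match.start() - 1] if match.start() > 0 else ""
        after = source[match.end()] if match.end() < len(source) else ""
        # Pad with a space so the keyword does not fuse with a neighbouring name.
        if _is_word_char(before):
            word = " " + word
        if _is_word_char(after):
            word = word + " "
        return word

    return _TOKEN.sub(replace, source)


if __name__ == "__main__":
    print(migrate_anchors("<% 'hello' %>"))
```

The space padding matters for input such as `<%foo%>`, where a plain textual replacement would glue the keyword to the variable name.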

Jun 19 '22 22:06 Aloso

Please allow the usage of ^ and $ instead of <% and %>. This project already copies a lot of regex's basic primitives, like + and * for repetition. I don't think we need a new syntax for those; the existing ones are normal and convenient.

A lot of people are already familiar with them. It is a very very common standard. I feel like ^ and % and $, which are close to each other in terms of the layout of the keyboard, also feel elegant as anchors for sentence and word boundaries.

In short, Vim uses ^ and $ for the beginning and end of lines. Sed uses them. Regular expressions use them. VS Code uses them in the finder. Everyone who uses regex probably knows them well enough, unlike the more obscure constructs that only frequent regex users know. Anyone who uses Vim on a daily basis must frequently use $ to navigate to the end of the line, so its meaning is ingrained in their memory. Switching to purely Start and End is a little bit too verbose for my tastes.

I think the goal of the project can be to make regular expressions more scalable, but not necessarily to challenge these age-old and essentially solved problems. I imagine that people that come to use pomsky come not to learn new syntax, but just to refactor some unreadable regular expressions with comments. If we change the boundaries, then I think a lot of users are not going to appreciate that. Especially since it ain't broke.

My desired use-case for Pomsky is splitting long regular expressions into ones that have variables and comments, so I can understand how to write long regular expressions for matching complex types, as well as keeping the regex tidy and readable. I feel like the problem with normal regular expressions is that they are not documented, for the most part. Yes, I like a lot of the syntax changes. However, I am so used to $ and ^ from Vim that I feel the project should aim to be as close as possible to regex while maintaining some consistency.

Personally I feel like <% being two characters is inconsistent with %, ^ and $ being one character, so I was happy to see that it was deprecated. I am more in favour of the common symbols, though I do appreciate %.

Jul 18 '22 11:07 wmstack

@wmstack Thank you for your comment, you make some good points.

The reason why I chose variables instead of sigils is that sigils are harder to understand and to search for on Google if you're unfamiliar with them. However, sigils are shorter and therefore easier to type, so they're preferable for the most common constructs, and when the sigil intuitively makes sense (e.g. >> and << are easy to remember due to the direction of the "arrows", unless you use an RTL script). But that raises the question of why % is a sigil while Start and End are not. By the way, one of the reasons was that <% and %> didn't work in the Rust macro because Rust would treat them as 2 distinct tokens 😄

^ and $ are not intuitive, but you make a good point that most programmers are familiar with them. My main concern was that Pomsky should be approachable for people learning Pomsky who are completely unfamiliar with regexes, but that is probably a very small target group. So I am inclined to allow ^ and $ in addition to Start and End.

If anybody has an opinion about this, please leave a comment or give a 👍 or 👎 reaction.

P.S. I think I'll make this an experimental feature at first, so that you will have to opt-in with --experimental-features, or in the playground with a toggle.

Jul 18 '22 13:07 Aloso
