Remove invalid <|w> syntax
The docs currently refer to <|w> as a synonym for <?wb>, but this doesn't seem to be valid syntax – it's not spec'd in Roast, it isn't documented elsewhere, and it doesn't work in Raku AST. So this commit removes it from here.
See https://stackoverflow.com/q/78069120 for details.
I don't see a problem with <|w>.
It works as desired:
% raku
Welcome to Rakudo™ v2023.05.
Implementing the Raku® Programming Language v6.d.
Built on MoarVM version 2023.05.
To exit type 'exit' or '^D'
[0] > "apa pz" ~~ / <|wb> p. /
｢pa｣
[1] > "apa pz" ~~ / <|w> p. /
｢pz｣
The problem is <|wb>, which is misinterpreted. Above, <|wb> returns ｢pa｣, which is the wrong result. Maybe <|wb> gets misinterpreted as a quoted list? Shouldn't quoted lists forbid an unescaped | within?
https://docs.raku.org/language/regexes#Quoted_lists_are_LTM_matches
So DON'T merge this pull request because you will:
- Break everyone's correctly working <|w> Regex code, and
- Still have the exact same problem with <|wb> without understanding its source.
@codesections @coke @lizmat @JJ @fecundf @doomvox
@jubilatious1 This PR is for the documentation, not for any actual functionality!
Thanks for clarifying, @lizmat !
- What's wrong with a) keeping the <|w> syntax in the documentation, and writing spec-tests? Because <|w> already works as advertised on the docs website.
- Assuming, how do we get an error thrown when <|wb> is incorrectly attempted (user meant <|w> instead)?
@jubilatious1
What's wrong with a) keeping the <|w> syntax in the documentation, and writing spec-tests?
As I explained here, (though omitting "if I had my druthers" at the end because I ran out of comment space), I feel the same way you do.
but not if a core dev has to do any of the work beyond reviewing/merging.
My view is that if users (like me or you) want RAST to support a currently unroasted feature then it's incumbent on us to organize and do all the work, including writing relevant issues and PRs, including spec tests and altering Rakudo, and (not too forcefully) lobbying for merging the work if any other users (including core devs) initially object to it being added to roast and RAST.
how do we get an error thrown when <|wb> is incorrectly attempted (user meant <|w> instead)?
Rakudo would need to be altered to reject <|foo> unless foo is a single character, and, for now at least, only the single characters that have already been implemented. (That may well mean just <|w>. I haven't looked into it further than my SO answer.)
I'm not willing to take the lead on this, but if you are, I will do everything in my power to make it a successful effort, where "successful" means we go through the process to the end, regardless of whether the end is adding <|w> to roast and RAST or <|w> ultimately being declined as a feature.
After looking at the S06 section @raiph linked, I see the merit of the <|...> syntax. For example (if implemented), writing <|d> would be a handy way to match either end of a run of digits.
But even if <| were implemented, it should be documented as a separate type of zero-width assertion -- not mentioned in passing in a section on word boundaries.
helps prevent new code from being written that relies on [unroasted] behavior.
Sorry if I'm being presumptuous, but I'm not convinced you've understood the situation. Here's what we're talking about:
say 'abc' ~~ / a <|quick> b <|brown> c /; # ｢abc｣
That is to say, if some code of a particular form is written (specifically of the form <|w>) and the w is not actually w but some other letter, or even multiple letters, then the compiler accepts it -- and then does nothing whatsoever with it.
So we'd be preventing someone doing the combination of A) writing a particular form of nonsense code that quite possibly no one has ever written until it was typo'd a few days ago; and that person then B) relying on their nonsense code doing nothing at all; and then that person C) not mentioning to anyone this odd (non) behavior they're relying on!
I think the chances of that recurring are minuscule, and if it did, the consequences would be minuscule. My guess is you hadn't realized it's this minuscule.