RFE: Should we have a section on Escape Characters?
Request for Enhancement in the Documentation:
May we please have a section in the documentation about Escape Characters?
To assist with this, I have pasted my keeper on the subject. Maybe I missed a few?
Many thanks, -T
Perl 6: Escape Sequences ("whitespace" characters):
print "Hi\tBye\n";
Reference: https://en.wikipedia.org/wiki/Escape_character
\' single quote
\" double quote
\\ backslash
\n new line
\r carriage return
\t tab
\b backspace
\f form feed
\v vertical tab (Internet Explorer 9 and older treats '\v'
as 'v' instead of a vertical tab ('\x0B'). If cross-browser
compatibility is a concern, use \x0B instead of \v.)
\0 null character (U+0000 NULL) (only if the next character
is not a decimal digit; else it is an octal escape sequence)
Note that in Perl 6 \v is "Unrecognized" (a compile error), and \0 appears to be ignored because the null character it produces is not visible:
$ p6 "say 'I am a single quote\'';"
I am a single quote'
$ p6 'say "I am a double quote\"";'
I am a double quote"
$ p6 'say "I am a backslash\\";'
I am a backslash\
$ p6 'say "<I am a newline\n>";'
<I am a newline
>
$ p6 'say "<I am a carriage return\r>";'
>I am a carriage return
$ p6 'say "<I am a tab\t>";'
<I am a tab >
$ p6 'say "<I am a backspace\b>";'
<I am a backspac>
$ p6 'say "<I am a formfeed\f>";'
<I am a formfeed
>
$ p6 'say "<I am a vertical tab\v>";'
===SORRY!=== Error while compiling -e
Unrecognized backslash sequence: '\v'
$ p6 'say "<I am a null\0>";'
<I am a null>
Thanks a lot for the issue. The thing is escape characters are not part of the Perl 6 language, they are part of the Unicode, and previously ASCII, definition. Other than that, they are used all over the documentation when they are needed, for instance in regexes or when explaining the quoting construct. I will however add some indexing and maybe a reference to make this clearer.
I don't think this patch is quite sufficient.
As this bug report states, \v is not a vertical tab in Perl 6 (in a regex it matches vertical whitespace, and it's not a legal escape in a quoted string literal). So that should be removed from the blurb in syntax.pod6.
Perl 6 uses \0 for a null character, but not for octal escape sequence. For that, use \o[NNN]. So this is a gotcha for people coming from C, and should be mentioned here.
The patch doesn't mention other backslash escapes: \x, \o, \e or \c. While some are mentioned elsewhere, there's no good place to see them all.
The patch doesn't mention \qq and friends, which would be good to mention here as well.
In other words, I do think adding a dedicated section to collect all of this in a single place for interpolated string literal escapes is a good thing for the docs. A good starting place is https://design.perl6.org/S02.html#Backslash_sequences and following.
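To make the points above concrete, here is a minimal sketch of those escapes in a double-quoted string. It assumes a current Rakudo with default settings, and the sample strings are illustrative only, not taken from the patch:

```raku
# Null character: "\0" is a single codepoint 0.
say "\0".ord;                        # 0

# Octal escapes use \o[...] rather than a C-style \0NN.
say "\o[101]";                       # A  (octal 101 is decimal 65)

# Hexadecimal escapes, with and without brackets.
say "\x41";                          # A
say "\x[41]";                        # A

# \e is the escape character (codepoint 27).
say "\e".ord;                        # 27

# \c[...] takes a Unicode character name.
say "\c[LATIN CAPITAL LETTER A]";    # A
```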
On 9/16/18 10:05 PM, Juan Julián Merelo Guervós wrote:
> Thanks a lot for the issue. The thing is escape characters are not part of the Perl 6 language, they are part of the Unicode, and previously ASCII, definition. Other than that, they are used all over the documentation when they are needed, for instance in regexes (https://docs.perl6.org/language/regexes#Enumerated_character_classes_and_ranges) or when explaining the quoting construct (https://docs.perl6.org/language/quoting#Escaping:_q). I will however add some indexing and maybe a reference to make this clearer.
Hi Juan,
Indeed they are not part of the Perl language. But they are frequently used by folks programming in Perl.
It would be nice if they were documented SOMEWHERE so that programmers could find them and use them.
-T
> The thing is escape characters are not part of the Perl 6 language, they are part of the Unicode, and previously ASCII, definition
Well, for ones like \x6E, yes, the number there refers to a Unicode codepoint. However, the choice of things like \n, \r, \0 and so forth is very much part of the Perl 6 design. While there's a core of them commonly agreed among many languages, there's also quite a lot of variance. And Perl 6 supports some decidedly less common syntax, like:
say "\x[65,66,67]"; # efg
It's also worth noting that the quote language and the regex language support different escapes and can assign different meanings to the same ones, so they really need documenting separately for the two languages.
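As a small illustration of that difference, the sketch below uses \v, which this thread reports as a compile error in a quoted string but which matches vertical whitespace in a regex. It assumes a current Rakudo; the test strings are made up for the example:

```raku
# In a regex, \v matches vertical whitespace such as a newline.
say so "a\nb" ~~ / \v /;    # True
say so "a b"  ~~ / \v /;    # False

# In a double-quoted string, "\v" does not compile; it is rejected
# with "Unrecognized backslash sequence", so it is left commented out.
# say "a\vb";
```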
On 9/17/18 3:06 PM, Jonathan Worthington wrote:
> The thing is escape characters are not part of the Perl 6 language, they are part of the Unicode, and previously ASCII, definition
I am not following you. Yes they came from elsewhere. And Yes, Perl responds to them. Perl's response to them is what needs to be documented.
I like your idea of one for quotes and one for regex.
I'm assuming that when you say "Perl's response" needs to be documented, we mean Raku (this is an old ticket).
@coke
Yes, Todd opened this issue before the rename.
Also I see that in his last comment Todd had either gotten very confused, or was being confusing, quoting jnthn (who had basically just fully agreed with Todd and softmoth) quoting JJ (who had instead presumably gotten confused about what was appropriate).
I also am confused, can you sum up what you think is left to do here?
I suspect I just added to the confusion. Sorry about that. As far as I'm concerned you can close this issue.
But while I'm here, let me mention two things for you to do with as you wish:
- As jnthn wrote, it is our responsibility to accurately document all the escape characters Raku supports in all its slangs, and let readers know where they contradict each other, or existing related standards.
- As many have said over the years, our doc search sucks. Yes we can endlessly add index entries, but that generates other problems. And I think google sucks as a fallback.
As it happens, my thoughts about the above two brought them to what might be a nice simple powerful conclusion if you run with it.
The Raku escape character \n in the Q lang could be said to be exactly the same as the ASCII equivalent, and its Unicode equivalent. And, really, it is. But check out the verbiage above the table on the language/quoting page:
Several of these print invisible/whitespace ASCII control codes or whitespace characters:
Yes, a \n in a string literal represents a codepoint 10, just like the ASCII escape, so the table entry is technically accurate. But it's a lost golden opportunity to let readers know the truth about what happens when printing a \n. The latter isn't about the escape at all, but imagine how you'd feel if you were trying to figure out why a \n seemed to result in something other than a \n being printed, and you looked it up, and saw that table entry, thinking "Hmm, I guess it is just an ordinary \n...".
Fortunately, if you search for \n in the doc search field, and then select the google option, the search results include that page about what happens when printing a newline. So maybe all is good. But if you look at other matches you'll notice that many are just matches for n, not \n. And it's issues like that that radically undermine the value of google as a fallback search option.
Now try this. See the difference? It's still just a backup search, but suddenly one can search for exact symbols, and use regexes, and find every single possible match for what you're searching for, and be immersed in the very source that can be corrected if a searcher feels any of the doc they now discover could be tied together or cross referenced or whatever in a helpful way for folk that don't use the fallback search.