Strings documentation regarding backslash

Open greatgraphicdesign opened this issue 3 years ago • 6 comments

ref: https://www.php.net/manual/en/language.types.string.php

The description under “Single quoted” says, “To specify a literal backslash, double it (\\);” however, in the example we see...

// Outputs: You deleted C:\*.*?
echo 'You deleted C:\*.*?';

... so it seems it is not necessary to double the backslash. Should the instruction for doubling a literal backslash be removed from this section? Alternatively, it could be clarified to state that doubling the backslash is optional, and that either a single or a double backslash will result in a single backslash.
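A minimal sketch of the behaviour described here, using the manual's own example string; the variable names are only illustrative:

```php
<?php
// In a single-quoted literal, a backslash that is not followed by ' or \ is kept as-is.
$single  = 'You deleted C:\*.*?';
// A doubled backslash collapses to one backslash, so this is the same string.
$doubled = 'You deleted C:\\*.*?';

var_dump($single === $doubled);         // bool(true)
var_dump(substr_count($doubled, '\\')); // int(1)
```

Both spellings produce one backslash, which is why the doubling looks optional in this particular example.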

In the “Double quoted” section we see, “As in single quoted strings, escaping any other character will result in the backslash being printed too.” This statement references a table of escaped characters that applies to double quoted strings, but not single quoted strings. I would cut the phrase “As in single quoted strings...” to avoid leading people to think that the table has anything to do with single quoted strings. It's clear enough to say, “Escaping any other character will result in the backslash being printed.”
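As a rough illustration of the double-quoted rule, the sketch below compares one sequence from the escape table with two that are not in it. The sequences `\*` and `\q` were picked only because they are not listed in the table:

```php
<?php
// "\t" is a recognised escape sequence: it becomes a single tab character.
$tab = "\t";
var_dump(strlen($tab));           // int(1)

// "\*" and "\q" are not in the escape table, so the backslash is kept.
$unknownStar = "\*";
$unknownQ    = "\q";
var_dump(strlen($unknownStar));   // int(2)
var_dump($unknownQ === '\\' . 'q'); // bool(true)
```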

greatgraphicdesign avatar Sep 07 '22 18:09 greatgraphicdesign

To specify a literal single quote, escape it with a backslash (\). To specify a literal backslash, double it (\\). All other instances of backslash will be treated as a literal backslash:

IMO, that is relatively clear: you don't need to double the backslash (unless it is followed by '), but doubling is always possible, and usually a good idea (unless in regex patterns).
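The cases where doubling is not optional in a single-quoted string can be sketched as follows. The paths and variable names are made up for illustration:

```php
<?php
// A backslash directly before the closing quote must be doubled;
// a lone backslash there would escape the quote and leave the literal unterminated.
$path = 'C:\\';
var_dump(strlen($path));     // int(3)

// \' yields a quote, so "backslash followed by a quote" has to be written \\\'.
$quoted = '\\\'';
var_dump(strlen($quoted));   // int(2): one backslash, then one quote

// Two consecutive backslashes need each one doubled, because \\ collapses to one.
$unc = '\\\\server\\share';
var_dump(substr_count($unc, '\\'));  // int(3)
```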

cmb69 avatar Sep 07 '22 20:09 cmb69

Is this more clear?

To specify a literal single quote, escape it with a backslash (\'). A single backslash (\) is treated as a literal backslash, but you can also specify a literal backslash by doubling it (\\). All other instances of backslash will be treated as a literal backslash:

greatgraphicdesign avatar Sep 08 '22 05:09 greatgraphicdesign

Yeah, that might be better (although we should change the grammar to avoid personalization).

What do others think about this?

cmb69 avatar Sep 08 '22 09:09 cmb69

To specify a literal single quote, it needs to be escaped with a backslash (\').
In all other cases, a single backslash (\) is treated as a literal backslash, however it is also possible to specify a literal backslash by escaping it (\\).
Therefore, other commonly used escape sequences such as \r or \n will be output literally as specified rather than having any special meaning. 

Sounds better to me, as it gets rid of the personalisation and of the odd repetition where we explain that a backslash is interpreted literally and then say it again right after.
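The last sentence of the proposed wording can be checked with a small sketch like this one, which uses `bin2hex()` to make the bytes visible:

```php
<?php
// Single quotes: \r and \n have no special meaning, so all four characters survive.
$single = '\r\n';
var_dump(strlen($single));   // int(4)
var_dump(bin2hex($single));  // string(8) "5c725c6e"

// Double quotes: the same source text becomes carriage return + line feed.
$double = "\r\n";
var_dump(strlen($double));   // int(2)
var_dump(bin2hex($double));  // string(4) "0d0a"
```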

Girgias avatar Sep 08 '22 10:09 Girgias

Also, when a request string (e.g., $_POST['something']) is assigned to a variable, every character is passed literally. This might be worth mentioning somewhere. Thanks for considering this feedback. :)
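Escape sequences are only processed in string literals in the source code, not in data that arrives at runtime. A small sketch of that point follows. Here `parse_str()` is used as a stand-in for the request parsing that fills `$_GET` and `$_POST`, and the query string is invented for the example:

```php
<?php
// %3A is ":" and %5C is "\", so the submitted value is C:\new\table.
parse_str('something=C%3A%5Cnew%5Ctable', $request);
$value = $request['something'];

// The backslashes arrive as plain characters; "\n" and "\t" are not turned
// into a newline or a tab, because no string literal is being parsed here.
var_dump($value === 'C:\new\table');     // bool(true)
var_dump(strlen($value));                // int(12)
var_dump(strpos($value, "\n") === false); // bool(true)
```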

greatgraphicdesign avatar Sep 08 '22 18:09 greatgraphicdesign

In a string literal, the delimiter character (single quote or double quote) must be escaped with a backslash, because the language otherwise interprets it as the opening or closing delimiter. The backslash itself should also be escaped, although PHP interprets it correctly even when it is not doubled; in a double-quoted string, if the next character has no special meaning to PHP, the backslash is interpreted literally. A regex is an exception: there the backslash has to be doubled again if the pattern is to match a literal backslash, because the backslash is a special character in the pattern as well. A string that does not come from a delimited literal, e.g. the contents of a file or the value of a variable, is already interpreted. For security reasons, however, a backtick must be escaped with a backslash when it is not delimited by single or double quotes, because backticks behave like shell_exec().

<?php

error_reporting(-1);
$var1 = '\'';
$var2 = $var1;
$var3 = "\"";
$var4 = $var3;
$var5 =  `ls -al`;
$var6 = "/(\\\\)[^a]+\$/";
$var7 = "\000"; // octal notation: \0, \00 and \000 all yield the NUL byte (up to three octal digits are consumed greedily); the hexadecimal form is "\x00"
$var8 = "\a";
$string = 'hi\\ world';
preg_match($var6, $string, $matches);
var_dump($var2, $var4, $var5, $matches, $var7, $var8);
// https://www.domain.tld/index.php?a=" create $_GET['a']
// https://www.domain.tld/index.php?a=%22 create $_GET['a']
// var_dump($_GET['a']); // Output "

?>

If shell_exec() is disabled or unavailable, you will see the warning shown below; otherwise the result is similar, but without the warning.

Warning: shell_exec(): Unable to execute 'ls -al' in /in/Psk4h on line 8
string(1) "'"
string(1) """
bool(false)
array(2) {
  [0]=>
  string(7) "\ world"
  [1]=>
  string(1) "\"
}
string(1) ""
string(2) "\a"
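For comparison, the same pattern can be written as a single-quoted literal, where the `$` needs no escaping. This is only a sketch that reuses the pattern and subject from the script above; the names `$double` and `$single` are illustrative:

```php
<?php
// Both literals produce the 12-character pattern /(\\)[^a]+$/ ;
// the regex engine then reads \\ as "match one literal backslash".
$double = "/(\\\\)[^a]+\$/";
$single = '/(\\\\)[^a]+$/';
var_dump($double === $single);   // bool(true)
var_dump(strlen($single));       // int(12)

$string = 'hi\\ world';          // the string hi\ world
var_dump(preg_match($single, $string, $matches)); // int(1)
var_dump($matches[1] === '\\');  // bool(true): the captured backslash
```

So the pattern goes through two layers of unescaping: first the PHP string literal, then the regex engine, which is why four backslashes in the source end up matching one.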

hormus avatar Nov 23 '22 22:11 hormus