WordPress-Coding-Standards
WordPress-Coding-Standards copied to clipboard
Allow (s)printf to be used for sanitising/escaping
Is your feature request related to a problem?
Yes. When writing code like the following, it's flagged as unescaped (WordPress.Security.EscapeOutput.OutputNotEscaped):
printf(
'Foo %d Bar',
$id
);
With some string formats, the format itself can provide adequate sanitising/escaping support. Specifically, any format which does not treat the input as a string: b, c, d, o, u, x, and X all treat the input as an integer, while e, E, f, F, g, and G treat the input as a double (eg float).
Essentially, only %s is unsafe for use printf for arbitrary variables, as all the others have the effect of typecasting to a number.
In practice, this only really matters for the EscapeOutput sniff with printf, but in theory, sprintf can actually be used as a sanitisation function as well; sprintf( '%d', $foo ) is a (weird) way to sanitise inputs into integer strings. (I've never seen this in practice though.)
Describe the solution you'd like
printf should be allowed to output input variables with any non-%s specifier without requiring unneeded escaping.
The following code should pass:
printf( 'Comment count: %d', $comment_count );
printf( 'Request time: %.4f', microtime( true ) );
(etc)
Worth mentioning WP/I18nSniff.php already has a sprintf regex we could potentially use here.
In practice, this only really matters for the EscapeOutput sniff with printf, but in theory, sprintf can actually be used as a sanitisation function as well; sprintf( '%d', $foo ) is a (weird) way to sanitise inputs into integer strings. (I've never seen this in practice though.)
https://3v4l.org/LhRMp
It casts it to a string, not an integer.
I don't know how I feel relying on PHP's own typecasting as a way of sanitization/escaping. Especially since esc_html() (for instance) checks for invalid UTF-8 strings and handles special characters.
It casts it to a string, not an integer.
Yeah; to be clear, that's what I meant by "integer string" above. In an output context, the distinction doesn't really matter.
I don't know how I feel relying on PHP's own typecasting as a way of sanitization/escaping.
PHP's typecasting is already allowed to be used for sanitising so it's a larger change if we don't want that.
I wouldn't be concerned about invalid UTF-8 or special characters, because those will only apply if already in the static string in sprintf; we don't check for those in echo '...' as far as I know, so I don't think this is different there.
What I mostly care about is useless sanitisation for code like sprintf( '%d comments', $count ), where adding an int cast or intval/absint call doesn't add any value. It doesn't make it more explicit (the %d is already explicitly casting to an integer), and it decreases readability if anything.
I found this article https://dev.to/anastasionico/good-practices-how-to-sanitize-validate-and-escape-in-php-3-methods-139b and in there an interesting example:
$changePassword = sprintf(
'UPDATE users set password = "%s" WHERE id = "%s"',
$_POST['password'],
$_GET['id'],
);
Now, this looks like a banal example, but it's a valid code (even better would be to provide variables that come from somewhere instead of $_POST or $_GET, since those will get flagged by WordPress.Security.ValidatedSanitizedInput.InputNotValidated sniff). And it can be turned into a 'WordPress equivalent' code.
Without phpcs flagging this, somebody would say: this is ok, I don't need to sanitize anything, sprintf will take care of that for me.
But it doesn't really.
It's open to SQL injection.
I know that this example is the code that should really never ever be in the code, but that doesn't mean that it's not a possibility.
For the sake of the WordPress core I'd rather add an exception where needed (which can be done using a comment) than add a potentially unsafe function to a list of sanitization functions 🤷♂
That's using %s which is unsafe; I'm talking only about the non-string preparation, which casts the variable. That code example looks obviously insecure to me.
However, these example are safe in all cases:
sprintf(
'SELECT ID FROM wp_posts WHERE post_author = %d',
$_GET['id']
);
printf(
'Searching for author ID %d',
$_GET['id']
);
I wouldn't recommend doing this for other reasons (namely SQL preparation should be handled properly) but it's safe as is.
Essentially, what I'm saying is that a format string without %s (and variations) in it is safe.
But we'd need to have a specific sniff for that, instead of just adding it to a list of sanitizing functions, no? Or updating an existing one. @jrfnl will know more 🙂
Yep. I think special-casing it inside EscapeOutputSniff works best in that regard. I'm not a huge fan of special-cases, but this is something I see used a lot.
That's using
%swhich is unsafe; I'm talking only about the non-string preparation, which casts the variable. That code example looks obviously insecure to me.However, these example are safe in all cases:
sprintf( 'SELECT ID FROM wp_posts WHERE post_author = %d', $_GET['id'] ); printf( 'Searching for author ID %d', $_GET['id'] );I wouldn't recommend doing this for other reasons (namely SQL preparation should be handled properly) but it's safe as is.
Essentially, what I'm saying is that a format string without
%s(and variations) in it is safe.
Hi @rmccue, could you elaborate on why would %d be safer than %s?
%d casts the variable to a (signed) integer, the equivalent of running intval() on the parameter. For example, compare the following with an intentionally-malicious SQL injection:
$var = "0 OR 1=1; --";
var_dump( sprintf( "SELECT * FROM wp_posts WHERE post_author=%s", $var ) );
// string(53) "SELECT * FROM wp_posts WHERE post_author=0 OR 1=1; --"
var_dump( sprintf( "SELECT * FROM wp_posts WHERE post_author=%d", $var ) );
// string(42) "SELECT * FROM wp_posts WHERE post_author=0"
(Obviously, use proper preparation when you can.)
The same applies for other formatters; only %s will insert the string without casting to a number. (%c isn't really dangerous, since it's a very specialised use case where you'd actually be using it anyway.)
(My assumption here is the generated numbers are "safe in all cases". Technically, you could have an integer overflow problem or something more domain specific, but that's a different problem to what we're trying to detect here.)
thanks