zend-escaper

Attribute escaping

Open autowp opened this issue 8 years ago • 5 comments

Why does attribute escaping need to encode so many characters (everything matching [^a-z0-9,\.\-_])? URLs in HTML look ugly and are larger than they need to be:

<a href="https&#x3A;&#x2F;&#x2F;www.example.com&#x2F;">
<a href="https://www.example.com/">

autowp • Jan 23 '17 00:01

"Ugly" is not the problem in security-sensitive contexts. Also, most source viewers already make these attributes simple to read (Firefox does, for example).

As for the size, gzip compression generally deals with it.

Ocramius • Jan 23 '17 00:01

It is not easy to understand where the security improvement is here.

For example, why is "dot" a safe character but "semicolon" is not?

As for the size, on my example Cyrillic page, where escapeHtmlAttr is partially used:
68988 bytes - only quotes and angle brackets escaped
83611 bytes - escaped by escapeHtmlAttr (+21%)

The same with gzip:
11116 bytes
11790 bytes (+6%)

Indeed, the size is not crucial.
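A comparison like the one above can be reproduced roughly as follows. This is only a sketch under stated assumptions: the generated links are made-up sample data, and the entity encoding is approximated with str_replace() on ":" and "/" rather than done by the real escapeHtmlAttr(), so the numbers will differ from the ones quoted.

```php
<?php
// Sample data (hypothetical): 200 links, once with plain URLs and once with
// ":" and "/" written as hex entities, the way the attribute escaper does.
$plain = '';
$entity = '';
for ($i = 0; $i < 200; $i++) {
    $rawUrl = "https://www.example.com/page/$i";
    $encUrl = str_replace([':', '/'], ['&#x3A;', '&#x2F;'], $rawUrl);
    $plain .= "<a href=\"$rawUrl\">link</a>\n";
    $entity .= "<a href=\"$encUrl\">link</a>\n";
}

// Uncompressed sizes in bytes.
printf("plain:  %d bytes\n", strlen($plain));
printf("entity: %d bytes\n", strlen($entity));

// Compressed sizes, only if the zlib extension is available.
if (function_exists('gzencode')) {
    printf("plain gzip:  %d bytes\n", strlen(gzencode($plain)));
    printf("entity gzip: %d bytes\n", strlen(gzencode($entity)));
}
```

Repetitive sample data compresses much better than a real page, so treat the gzip figures from this sketch as a rough indication only.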

autowp • Jan 23 '17 09:01

Are you asking to add more characters to the whitelist, so they don't get encoded?

Maybe you could argue that certain characters like ":" don't need to be escaped, but it's easier to have a very small white-list of "known good" characters (anything not matched by [^a-z0-9,\.\-_]) than to try to work out which characters are allowed in each context.
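To make the white-list idea concrete, here is a minimal sketch of that approach. It is not the library's implementation: attr_escape_sketch() and code_point_sketch() are hypothetical names, and the sketch leaves out details a real escaper needs (named entities, control characters). It only shows "encode everything that is not on the list".

```php
<?php
// Hypothetical helper: decode one UTF-8 encoded character to its code point.
function code_point_sketch(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];
    }
    // The lead byte keeps its low (7 - $count) bits; every continuation
    // byte contributes 6 more bits.
    $codePoint = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codePoint;
}

// Hypothetical helper: hex-encode every character outside the white-list
// (letters, digits, comma, dot, hyphen, underscore).
function attr_escape_sketch(string $value): string
{
    $result = preg_replace_callback(
        '/[^a-z0-9,\.\-_]/iu',
        static function (array $match): string {
            return sprintf('&#x%02X;', code_point_sketch($match[0]));
        },
        $value
    );
    if ($result === null) {
        // preg_replace_callback() returns null when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return $result;
}

echo attr_escape_sketch('https://www.example.com/'), "\n";
echo attr_escape_sketch('https://www.example.com/ onclick=do_evil_thing'), "\n";
```

The point of the design is that nothing has to be decided per character: a space, "=", a quote or a semicolon all get the same treatment, so there is no list of "dangerous" characters to keep up to date.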


For anyone not familiar with the background... the reason escapeHtmlAttr() encodes more aggressively than escapeHtml() is to protect non-quoted attributes.

Let's say someone did:

$url = 'https://www.example.com/';
<a href=<?= $escaper->escapeHtmlAttr($url) ?>>

Notice that it does not include quote marks.

This creates the fairly "ugly" output:

<a href=https&#x3A;&#x2F;&#x2F;www.example.com&#x2F;>

What happens if $url was provided by the user (maybe a link to their website), and they set it to:

$url = 'https://www.example.com/ onclick=do_evil_thing';

Without using escapeHtmlAttr(), it would create the perfectly valid:

<a href=https://www.example.com/ onclick=do_evil_thing>

This means they can create an onclick event handler on your website :-)
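The difference between the unquoted and quoted case can be checked with plain htmlspecialchars(), which only encodes &, <, >, " and '. This is a small sketch using the example value from above; the variable names are just for illustration.

```php
<?php
// Attacker-controlled value from the example above.
$url = 'https://www.example.com/ onclick=do_evil_thing';

// htmlspecialchars() leaves the space and "=" untouched.
$encoded = htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Unquoted attribute: the href value ends at the space, and
// onclick=do_evil_thing is parsed as a second attribute.
$unquoted = '<a href=' . $encoded . '>';

// Quoted attribute: the whole string stays inside the href value.
$quoted = '<a href="' . $encoded . '">';

// Breaking out of a quoted attribute needs a quote character,
// and that is exactly what htmlspecialchars() encodes.
$breakout = htmlspecialchars(
    'https://www.example.com/" onclick="do_evil_thing',
    ENT_QUOTES | ENT_SUBSTITUTE,
    'UTF-8'
);

echo $unquoted, "\n", $quoted, "\n", '<a href="' . $breakout . '">', "\n";
```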


You could still use escapeHtml() or htmlspecialchars(), but you must make sure your attributes are quoted.

<a href="<?= $escaper->escapeHtml($url) ?>">

So that it creates:

<a href="https://www.example.com/">

Or, if you want to use htmlspecialchars(), don't forget to use it in full:

htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'utf-8')
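Here is a short sketch of why the extra arguments matter. html_escape_sketch() is a hypothetical wrapper name, not part of any library. ENT_COMPAT, the default before PHP 8.1, leaves single quotes alone, and without ENT_SUBSTITUTE (or ENT_IGNORE) invalid UTF-8 input results in an empty string.

```php
<?php
// Hypothetical wrapper so the flags are not forgotten at each call site.
function html_escape_sketch(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// ENT_QUOTES also encodes single quotes; ENT_COMPAT does not.
$withQuotes = html_escape_sketch("it's");
$compatOnly = htmlspecialchars("it's", ENT_COMPAT, 'UTF-8');

// "\xFF" is never valid in UTF-8.
$broken = "a\xFFb";
$substituted = html_escape_sketch($broken);                  // invalid byte replaced
$dropped = htmlspecialchars($broken, ENT_QUOTES, 'UTF-8');   // whole string is lost

var_dump($withQuotes, $compatOnly, $substituted, $dropped);
```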

PS: Have a look at adding a CSP (Content Security Policy), and set it so that it does not allow unsafe-inline for scripts or styles. This will probably require you to make some changes, but it adds a second line of defence against this problem, where any attributes like onclick would be blocked by the browser.
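As a starting point, a policy along those lines could be sent like this. It is only a sketch: csp_header_sketch() is a hypothetical name and the directive list is an assumption, so a real site will almost certainly need its own sources added. What matters is that 'unsafe-inline' does not appear for scripts or styles.

```php
<?php
// Hypothetical helper: build a CSP header line without 'unsafe-inline'.
function csp_header_sketch(): string
{
    $directives = [
        "default-src 'self'",
        "script-src 'self'",  // no 'unsafe-inline': inline handlers such as onclick are blocked
        "style-src 'self'",   // no 'unsafe-inline': inline styles are blocked
    ];
    return 'Content-Security-Policy: ' . implode('; ', $directives);
}

// Only send the header when running under a web SAPI and before any output.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header(csp_header_sketch());
}
```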

craigfrancis • May 18 '17 13:05

@craigfrancis Thanks for your explanation! I think this could improve the documentation.

froschdesign • Jun 01 '17 09:06

This repository has been closed and moved to laminas/laminas-escaper; a new issue has been opened at https://github.com/laminas/laminas-escaper/issues/3.

weierophinney • Dec 31 '19 21:12