html-contextual-autoescaper-java
'+' in opaque URL body of data: URL is renormalized incorrectly
Autoescaping this string should trigger the bug:
<a href='data:base64;a+b'>
The input is normalized to
I was surprised because the HTML encoding should not affect the URL seen by the browser, and I am loath to stop encoding '+' because encoding it blocks UTF-7 attacks.
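To illustrate why the HTML encoding should be transparent, here is a minimal sketch. It assumes the normalized output writes '+' as the numeric character reference `&#43;` (an assumption on my part, not confirmed output from the autoescaper), and it uses Python's `html.unescape` as a stand-in for the attribute-value decoding an HTML parser performs before the URL is used.

```python
import html

raw = "data:base64;a+b"
# Assumed normalized form: '+' written as the numeric character reference &#43;
encoded = "data:base64;a&#43;b"

# An HTML parser decodes character references in attribute values,
# so the URL handed to the URL parser should be identical in both cases.
decoded = html.unescape(encoded)
print(decoded == raw)
```

If the two strings compare equal after decoding, any difference in behavior has to come from a later step (for example URL normalization), not from the entity itself.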
According to
$ python
>>> import base64
>>> base64.b64encode(' Hello, World!~Goodbye, World!')
'IEhlbGxvLCBXb3JsZCF+R29vZGJ5ZSwgV29ybGQh'
so I tested the HTML document
<p>
<a href='data:;base64,IEhlbGxvLCBXb3JsZCF+R29vZGJ5ZSwgV29ybGQh'>raw plus</a>
<p>
<a href='data:;base64,IEhlbGxvLCBXb3JsZCF&#43;R29vZGJ5ZSwgV29ybGQh'>encoded plus</a>
on a recent version of each of Chrome, Safari, and Firefox. Both links lead to equivalent documents in all three browsers, and the browser displays the '+' properly decoded in the URL bar. I have not tested on older browsers or on any version of IE.
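The browser check can also be approximated offline. The sketch below is only a rough model under stated assumptions: the encoded variant spells '+' as `&#43;`, `html.unescape` stands in for the browser's attribute decoding, and `payload_of` is a hypothetical helper written for this example, not part of the autoescaper.

```python
import base64
import html

# Payload chosen so that its Base64 form contains a '+'.
payload = b" Hello, World!~Goodbye, World!"
b64 = base64.b64encode(payload).decode("ascii")

raw_href = "data:;base64," + b64
# Assumed encoded form: every '+' replaced by the character reference &#43;
encoded_href = raw_href.replace("+", "&#43;")


def payload_of(href_attr):
    """Decode an href attribute value, then Base64-decode the data: URL body."""
    url = html.unescape(href_attr)   # what the HTML parser would hand over
    body = url.split(",", 1)[1]      # text after 'data:;base64,'
    return base64.b64decode(body)


print(payload_of(raw_href) == payload_of(encoded_href) == payload)
```

This only models HTML decoding followed by Base64 decoding; it says nothing about how a particular browser parses data: URLs, so it complements the manual test rather than replacing it.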