scour icon indicating copy to clipboard operation
scour copied to clipboard

Sanitize potential security vulnerabilities from SVG files

Open ottok opened this issue 4 years ago • 9 comments

Here is a pretty good article on the topic: https://kinsta.com/blog/wordpress-svg/#svg-security. It links for more information to https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing, https://www.owasp.org/images/0/03/Mario_Heiderich_OWASP_Sweden_The_image_that_called_me.pdf

There is a PHP implementation at https://github.com/darylldoyle/svg-sanitizer.

Maybe Scour could include similar sanitizations?

ottok avatar Jun 06 '20 20:06 ottok

see my comment https://github.com/scour-project/scour/issues/29#issuecomment-640100963

oberstet avatar Jun 07 '20 05:06 oberstet

Scour was I guess created primarily to minimize and sanitize own authored SVGs, not security-scrutinize foreign SVGs.

So when I create an SVG myself, I will be aware of included (friendly) JS.

For browser loading SVGs from anywhere, this is different (and generally should be the place IMO to address this).

It is also different for HTML authors including SVGs not authored by themself.

Ideally, the HTML tags to include SVGs should have an option to turn off all the JS crap potentially embedded.

An sanitizer external to a browser (a HTML rendering engine) would not help, as the sanitizer cannot change the SVG sourced if it is not authored/hosted by the HTML author.

So what exactly is the attack scenario when I have authored/hosted the SVG myself? And if I haven't authored/hosted the SVG myself, but only source it from my HTML, then how does a sanitizer help?


that being said, quick look at the PDF linked.

  1. "Scripting and Events"
  2. "Inclusion of arbitrary objects"
  3. "Links"

bad, sure.

so let's say, I download an SVG from the net with the goal of hosting it myself, and including it then in my HTML. in that scenario, I probably not only want to minimize the SVG before uploading, but also remove any of the crap above.

  1. "In-line SVG “self-terminates” open HTML elements"

sound potentially bad. we can't do anything about that, as it is the browser HTML parser that's in control here ..

oberstet avatar Jun 07 '20 05:06 oberstet

ok, so there are indeed test cases in https://github.com/darylldoyle/svg-sanitizer/tree/master/tests/data

for embedded JS:

oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ grep "script" *.svg
badXmlTestOne.svg:<script>alert(1);</script>
hrefTestOne.svg:    <a href="javascript:alert(2)">test 1</a>
hrefTestOne.svg:    <a xlink:href="javascript:alert(2)">test 2</a>
hrefTestOne.svg:    <a href="javascript&#9;:alert(document.domain)">test 7</a>
hrefTestTwo.svg:    <a href="javascript:alert(2)">test 1</a>
hrefTestTwo.svg:    <a xlink:href="javascript:alert(2)">test 2</a>
hrefTestTwo.svg:    <a href="javascript&#9;:alert(document.domain)">test 7</a>
svgTestOne.svg:<script>alert(1);</script>

external content:

oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ grep 'xlink:href="http' *.svg
useDosCleanTwo.svg:        <image xlink:href="https://image.shutterstock.com/image-photo/beautiful-water-drop-on-dandelion-260nw-789676552.jpg"></image>
useDosCleanTwo.svg:        <image xlink:href="https://image.shutterstock.com/image-photo/beautiful-water-drop-on-dandelion-260nw-789676552.jpg"></image>
useDosTestTwo.svg:<image xlink:href="https://image.shutterstock.com/image-photo/beautiful-water-drop-on-dandelion-260nw-789676552.jpg"/>
useDosTestTwo.svg:<image xlink:href="https://image.shutterstock.com/image-photo/beautiful-water-drop-on-dandelion-260nw-789676552.jpg"/>
oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ 

other

oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ grep 'foreignObject' *.svg
oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ grep 'iframe' *.svg
oberstet@intel-nuci7:~/scm/3rdparty/svg-sanitizer/tests/data$ grep 'onload' *.svg
badXmlTestOne.svg:<line onload="alert(2)" fill="none" stroke="#000000" stroke-miterlimit="10" x1="119" y1="84.5" x2="454" y2="84.5"/>
hrefTestOne.svg:    <a href="data:data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' onload='alert(88)'%3E%3C/svg%3E">test 5</a>
hrefTestOne.svg:    <a xlink:href="data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' onload='alert(88)'%3E%3C/svg%3E">test 6</a>
hrefTestTwo.svg:    <a href="data:data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' onload='alert(88)'%3E%3C/svg%3E">test 5</a>
hrefTestTwo.svg:    <a xlink:href="data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' onload='alert(88)'%3E%3C/svg%3E">test 6</a>
svgTestOne.svg:<line onload="alert(2)" fill="none" stroke="#000000" stroke-miterlimit="10" x1="119" y1="84.5" x2="454" y2="84.5"/>
`` 

oberstet avatar Jun 07 '20 05:06 oberstet

If security is a concern, embedding SVGs in an <img> tag is the safest approach, as it will prevent the SVG from loading any JavaScript or external resources.
Are you aware of any security vulnerabilities that are not caught by this approach @ottok?

Within the scope of Scour I'd say support for stripping JavaScript and similar could be considered as a way of minimizing even further at the user's request, but it should be clear Scour is not a tool to detect and remove security vulnerabilities (and people should never rely on Scour to do so for security-critical applications). That said I expect that in most of the cases there will only ever be JavaScript in SVG files when the user put it there on purpose (and therefore probably does not want to remove it), so it also won't be an overly useful option.

Ede123 avatar Jun 07 '20 11:06 Ede123

embedding SVGs in an tag is the safest approach, as it will prevent the SVG from loading any JavaScript or external resources

nice! we should probably add this as a BIG NOTE / WARNING to the readme ..

does is also prevent any embedded JS (not externally referenced/loaded) from running?

does that prevent any loading of externally referenced content (eg non-JS) as well? because of privacy .. eg single-pixel images embedded just to track users. as HTML mails often do.

as a user I just want SVG to behave like a self-contained image.. no JS, no externally referenced content. I just want a vector image.

oberstet avatar Jun 07 '20 11:06 oberstet

Personally for me the use case would be to add Scour to my toolset (among optipng and jpegoptim) to run on all images I either produce myself, or to run automatically on images users upload on a website (e.g. their profile image).

Having the ability to sanitize SVG in Scour would be cool so I don't need to have a separate tool for that.

Now when all modern browsers support SVG, the only thing preventing SVG from being more common is the tooling to automatically strip out injected contents from them. This is for example the reason why WordPress does not allow SVG images to be used on WordPress sites (and thus users need to specifically install https://wordpress.org/plugins/svg-support/ and whatnot to get SVG support in WordPress). Personally I could use Scour in a WordPress pipeline to enable SVG on WordPress, is Scour had similar features as the above linked svg-sanitizer.

ottok avatar Jun 07 '20 13:06 ottok

run automatically on images users upload on a website

ok, yeah, got you. that use case makes sense for scour as well - and then we could have new options like:

--remove-embedded-js
--remove-embedded-objects
--remove-external-js
--remove-external-images
--remove-external-objects

oberstet avatar Jun 08 '20 15:06 oberstet

The options above sound good to me!

ottok avatar Aug 06 '20 03:08 ottok

Note that scour uses minidom to parse XML and Python has published the following warning about using minidom on "untrusted input":

Warning

The xml.dom.minidom module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data see XML vulnerabilities.

Source: https://docs.python.org/3/library/xml.dom.minidom.html XML Vulnerabilities: https://docs.python.org/3/library/xml.html#xml-vulnerabilities

IOW, we would have to replace/rewrite the XML parser first before it is safe to use it on untrusted input. AFAICT, defusedxml is the only parser that is "safe to use" (at least out of the box).

nthykier avatar Aug 13 '20 05:08 nthykier