bluemonday
bluemonday copied to clipboard
css sanitization in style attributes
Per the bluemonday docs:
We are not yet including any tools to help whitelist and sanitize CSS. Which means that unless you wish to do the heavy lifting in a single regular expression (inadvisable), you should not allow the "style" attribute anywhere.
We use bluemonday and would like allow a limited subset of CSS (essentially limit to a handful of property names) to be allowed within the style attribute.
I can dedicate a few cycles to enhancing bluemonday with this functionality, but per the contributing guidelines, it asks that I create an issue first.
Since we have the issue, I figure it's worth proposing an API and an approach.
api
For the API, I'd propose something similar to allowAttrs()
, for example allowStyles('font-size', 'text-align')
. We have no need for tag-specific styles, so I'd propose foregoing the complexity of the builder initially and just allowing the styles globally. This would cause an API change down the line though, should such functionality be necessary.
approach
gorilla's css tokenizer is a useful starting point - https://github.com/gorilla/css and it is a reputable and community accepted source
unfortunately, you really need a parser built on top of the tokenizer to do this work. building a simple one is fairly trivial, but only for this use case (declarations in style attributes). If you want to sanitize inline css in <style>
tags, or external css, then the task becomes much more difficult.
https://github.com/aymerick/douceur has a parser built on top of gorilla's tokenizer that purports to accomplish this, but it's unclear how well supported it is or how high the quality is… one issue points out an infinite loop
fortunately, it has an acceptable implementation of declaration parsing for our use case: https://github.com/aymerick/douceur/blob/master/parser/parser.go#L163-L201
given that, it seems like it's worth using for now. if the scope of css sanitization increases, then it may be worth re-evaluating options on whether starting fresh, forking, or contributing back upstream are better alternatives
Nothing is ever as easy as it seems. There's no package manager in the project, I'll have to bring one in. Hopefully there are no objections to https://github.com/golang/dep
I'm going to put some thought into what the interface for a fully integrated CSS sanitizer might be.
This may take me a while as I'm flying around a bit right now, but should have a proposal within a couple of weeks.
It will likely follow the implementation style of the proposal PR you had, but allow for full configurability of a policy for CSS on a per entity/tag basis.
Have you made any progress on your proposal for CSS sanitising, @buro9? Even a basic level of CSS support (for style
attributes, as per @gerad's proposal) would help tremendously.
@buro9 My team currently uses bluemonday for sanitizing inserted html styled with css, and we would like to make changes to fix this issue.
Currently I think it makes sense for the API to add something along the lines of the following:
AllowStyles(string[])-allows all listed style properties
Matching(regex) adds handler to style that checks if it matches regex
MatchingEnum(...string) adds handler to style that checks if it matches list of enums
MatchingHandler(function(string)bool) adds handler to style that takes in string representing the property value and returns if it is valid
Globally() adds style policy to all elements
OnElements(...string) add style to specified elements
I was thinking each CSS3 property could have a default handler that prevents all/most values that can be overwritten with MatchingHandler().
This doesn't prevent security issues, but the user would have to specifically whitelist styles and override the default handler in order to cause issues.
How does this sound to you?
That sounds like a really good API for this. I'd approve and merge PRs that followed that API.
Merged, thanks for the great addition.