Proposal: Clarify the "Why" for Aggressive HTML Attribute Encoding
I would like to propose a small clarification to the XSS Prevention Cheat Sheet, specifically regarding the section on HTML Attribute Encoding.
Encoding Type: HTML Attribute Encoding Encoding Mechanism: Encode all characters with the HTML Entity &#xHH; format, including spaces, where HH represents the hexadecimal value of the character in Unicode. For example, A becomes A. All alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
The Problem: This text clearly explains what to do, but it doesn't explain why this aggressive encoding is necessary over a framework standard function such as htmlspecialchars. A developer reading this is likely to see it as unnecessary or too much work and ignore the advice, because the risk aren't explained. They might think, "Why not just escape the quotes? That should be enough." There are also many misunderstandings, like the ones I have faced so far.
Why space is encoded: Developers often ask, “Why should I encode spaces as instead of leaving them as a space?” Answer: Because unencoded whitespace can break out of the attribute value or introduce new attributes.
Why non-alphanumeric encoding is necessary: Developers think encoding < and > is enough. Answer: Quotes, backticks, or even = can break attribute syntax and allow injection.
Why the context matters: They don’t realize that attribute encoding is stricter than element-body encoding.
The Proposal: I propose adding a brief, clear explanation directly after this rule. The explanation would cover the crucial concept of multiple parsers (HTML, JavaScript, URL) acting on a single attribute, using a concrete example (like an onmouseover event handler) to show how framework inbuilt functions like htmlspecialchars are insufficient and why aggressive encoding is the only safe solution.
I am happy to create a pull request with the real-world developer centric use-cases, if this proposal is acceptable.
Note: This problem is framed based on my experience with developers, though not all developers may share the same concerns.
I think this is a great idea. We should make sure to keep it concise.
@ajayojha Do you want to submit a PR?
Yes.
I'm surprised that @jmanico hasn't objected to this. While the OWASP Java Encoder Project's Encode.forHtmlUnquotedAttribute encodes spaces, the more commonly used Encode.forHtmlAttribute does not. In reality, if a developer is astute enough to know they need to use the formHtmlUnquotedAttribute on unquoted attributes, they almost certainly will be wise enough to know that they should quote attribute values in the first place. My bigger problem, it is not clear to me that all libraries that attempt to do properly encode for HTML attributes that are not JavaScript event handlers encode spaces. Most of those libraries probably don't quote spaces but tell the devs they need to quote the attribute values. I don't think that a change to this CS should in any way be suggesting that existing libraries change that, because then there's going to be 2 variations...old versions of the library before the change and new versions after the change. I think that there has to be some onus on the devs using the library. Almost every JavaScript attribute that I've seen in the untold hundreds of code reviews that I've done in the past 5 or so years, the devs have been really strict at quoting the attribute values, so I don't think it's a major problem for libraries to require it to be so. Therefore, if you (@ajayojha) want to do anything in terms of explaining this (which I'm all for), instead I would suggest we emphasize the need to place the HTML attribute values within single or double quotes.
That said, lest you think that I am only saying this because ESAPI's Encoder.encodeForHTMLAttribute does it wrong, the only "immune" characters (other than alphanumeric) that are not encoded by encodeForHTMLAttribute are comma, period, hyphen, and underscore.
Thank you for the feedback. I agree with you. The number-one rule that developers must follow is to always quote their attributes when dynamic values are involved. My proposal was never meant to suggest that existing libraries are “wrong” or should change their behavior. My original proposal is focused on the risk of multi-parser interpretation that can occur even inside a perfectly quoted attribute. Library functions like Encode.forJavaScript are excellent, but developers may mistakenly use Encode.forHtmlAttribute or htmlspecialchars in JavaScript contexts like onclick. My example highlights this risk, supporting the need for a Cheat Sheet clarification on when to use aggressive encoding or JavaScript-specific escaping. So, my intention with this PR is not to defend or prescribe how libraries should behave, but rather to help developers understand the context and make the correct decisions accordingly.
Kevin, I have been in class all week teaching, but I - very much - agree with all of your comments.
Attributes need to be quoted, and then spaces are not problem. I never use forHtmlUnquotedAttribute - that was only used for automated legacy remediation, I do not think a developer should ever use it. Just quote your HTML attributes and less escaping is needed!
Thanks @kwwall and @jmanico for noticing that, sorry I missed that point! I'll close this issue for now.
The proposal focused on the multi-parser threat inside quoted, high-risk attributes like onclick. I would like to know the correct reason for closing this ticket. Was the conclusion that this specific scenario is outside the scope of the Cheat Sheet, or that the existing guidance is already considered sufficient, or something else?
I re-opened this ticket.
Was the conclusion that this specific scenario is outside the scope of the Cheat Sheet, or that the existing guidance is already considered sufficient, or something else?
The conclusion is that the advice you are giving is not universally necessary or accurate, I submit with respect.
I do not mind an edit, but again, spaces do not need encode spaced in an attribute context most of the time. This is only when you are not quoting attributes, which is a really rare occurrence. So typically none of us recommend encoding spaces for most content and contexts.
If you want to give us an example of where we have made mistakes in our analysis, the ticket is re-opened and we are open to more examples or explanation from you.
Perhaps you an give us one or two real world examples here?
Thank you for re-opening the ticket and for the open discussion to provide more real-world examples.
The proposal is focused on multi-parser context threat inside quoted, high-risk attributes like onclick etc
Encode except alphanumeric characters is highly align with the Salesforce Developer Guide on XSS and the package called 'secure-filters' they have developed for encoding.
The Salesforce's official developer documentation states that "Auto-HTML encoding is not sufficient when passing through multiple parsing contexts". here is the screenshot and My proposal is fully aligned with this note in technical form.
The secure-filters package focuses on output encoding (specifically HTML entity-encoding, JavaScript escaping, etc.) rather than sanitization in the sense of removing content. the goal is to make data safe for inclusion in specific output contexts (HTML, JavaScript, CSS, URLs) by encoding special characters. For example
- In HTML attributes, spaces are not encoded.
- In event handler attributes (jsAttr), spaces are encoded.
Please refer to the spec code of the pacakge, here is the screenshot:
This is different from many popular libraries, which often encode only a small set of special characters. By contrast, secure-filters encodes more than they encode, which aligns with the highlighted cheat sheet guidance about HTML Entity Encoding.
Salesforce follows a more security-centric approach. It's good to add some explanation in a way that in which cases the agressive encoding is good enough and in which cases the contextual encoding is enough.
From my experience, developers often use the same encoding function for all HTML attributes, which is incorrect. The proposal aims to clarify when to use what, with real-world cases. Here is the one of the example of vulnerability (CVE-2018-10061) caused by misuse of encoding functions.
Kind Note:
- Salesforce’s secure-filters clearly shows that spaces are encoded in html event handler attributes, which differs from the common advice.
- There is still a significant amount of public code on GitHub where developers set unquoted attribute values dynamically. I can share examples if useful.
- The current cheat sheet does not clearly emphasize the “always use quotes” rule for attributes (please correct me if I missed it). It would be beneficial to state this explicitly, "For all HTML element attribute values, always wrap the value in quotes (single or double)."
- The information of unquoted attribute values is not clearly covered in the cheat sheet. Both Salesforce’s XSS documentation and Apache’s XSS encoding examples documentation have information about unquote.
My intention is to help developers understand context and avoid multi-parser pitfalls. Given the information (Salesforce practices, secure-filters, CVEs, real-world commits), I believe this clarification is both necessary and accurate. If team needs more clarification and real-world example, I would be happy to answer and share the examples. If the team still feels it is not universally applicable or accurate, I am happy to close the ticket.
Note: If the proposal is accepted then I would be happy to share the final content here before creating the PR to get the high-level feedback from the team.
Any feedback for me?
@ajayojha - Let me back up first to a previous comment that you made:
My original proposal is focused on the risk of multi-parser interpretation that can occur even inside a perfectly quoted attribute. Library functions like Encode.forJavaScript are excellent, but developers may mistakenly use Encode.forHtmlAttribute or htmlspecialchars in JavaScript contexts like onclick. My example highlights this risk, supporting the need for a Cheat Sheet clarification on when to use aggressive encoding or JavaScript-specific escaping.
My apologies for missing this "multi-parser interpretation" context in my original feedback. I got as far as your proposal for what you wanted to do with HTML Attribute Encoding and ended up glossing over the rest of it.
However, any suggestion on a multi-parser interpretation implies that output encoding involved is going to require output encoding in mixed / nested contexts; that is that multiple different encoders are required and usually must be applied in a specific order. In our current XSS Prevention Cheat Sheet, we no longer explicitly mention nested contexts; however, in a much older version of the XSS Prevention Cheat Sheet used to make explicit reference to this (but for reasons unknown to me, advice how to handle those situational edge cases were removed). Note however was long before I was part of the Cheat Sheet team doing regular reviews for them so I am unaware of the historical reasons. The earliest version of that approach that I found at cheatsheets.owasp.org was from July 16, 2019 at https://web.archive.org/web/20190716105540/https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html. If you search for "nested" on that page, you will see the approach was a bit different, but it was generally advised as something to be avoided.
As an independent AppSec engineer (i.e., not speaking officially for OWASP on this topic, even though some took it as official OWASP advice given that the question often arose in the context of OWASP ESAPI), I used to regularly advise developers that if a situation arose where you had such a multiple parser interpretation in code (e.g., for instance, processing URL query parameters inside of JavaScript) that the best approach was to rewrite your code so that the multi-parser situation would be completely avoided in the first place because getting the multiple nested output encoder ordering just right was extremely difficult and there it was always possible that you run across some popular browser that wasn't really W3C-compliant that would do its overall parsing slightly differently (looking at you Internet Explorer) and thus would require a slightly different output encoder nesting arrangement. Thus my general advice was either re-write it so multiple encoders weren't required or test it on every possible browser that you had to support and combine it with very strict white-listing data validations.
Even though our current XSS Cheat Sheet doesn't explicitly mention nested output encoders, I think the general advise of the team would be to avoid the situation and rewrite the code when possible. But the situation you describe much more complex than the original paragraph that you proposed that you envisioned would address it for this issue if it is to be generalized to situations involved cases where multi-parsers are involved as those require multiple nested output encoders (such as described in the Salesforce document) or telling developers to rewrite their code to eliminate those cases. If we are going to try to address it, then I think we should address it in a new XSS Prevention Cheat Sheet that covers the use of nested encoders. And that is a much larger and very complex issue. I'm not sure we want to go there because devs would interpret it as it's perfectly okay to allow user input where in involves multiple contexts.
@jmanico, @szh, @mackowski - What do all of you think?
I think rewriting is not always practical. From my experience, "always rewrite" is rarely the solution in many applications, because of legacy code, Third-party dependent libraries, Framework constraints. I believe rewrite guidance is more of an application architecture decision than a security solution. From security cheat sheet prespective, I think the focus should be on guidance framed as warn, avoid, apply, so developers can understand the risks and safer alternatives.
There are few points in your reply that I would like to better understand:
- Could you clarify how the situation I described is "much more complex"?
- Do you still see the parsing inconsistencies in ever green browsers, as IE is long past.
- devs would interpret it as it's perfectly okay , do you mean ignoring the guidance altogether? If so, does ignoring the advice reduce the real-world risks we see in CVEs where improper encoding functions (e.g., htmlspecialchars without ENT_QUOTES) led to vulnerabilities?
- "much larger and very complex issue" - Would be great if you share some more information in the context of why.
@ajayojha wrote:
I think rewriting is not always practical. From my experience, "always rewrite" is rarely the solution in many applications, because of legacy code, Third-party dependent libraries, Framework constraints. I believe rewrite guidance is more of an application architecture decision than a security solution. From security cheat sheet prespective, I think the focus should be on guidance framed as warn, avoid, apply, so developers can understand the risks and safer alternatives.
Oh, I agree. Rewriting such code is hardly ever practical, but it's almost always the right thing to do. All these reasons you gave generally also apply to inline
There are few points in your reply that I would like to better understand:
Could you clarify how the situation I described is "much more complex"?
I'll give it a shot. Your original (summary?) proposal of:
Encoding Type: HTML Attribute Encoding Encoding [sic] Mechanism: Encode all characters with the HTML Entity &#xHH; format, including spaces, where HH represents the hexadecimal value of the character in Unicode. For example, A becomes A. All alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
was the paragraph that I was referring to here in the sense of being "much more complex". In comparison to your multiple parsers, IMO, the nested encoding is at least an order of magnitude simpler than describing the all the nested variations that you then go on to describe. Note that in addition to all the multiple parsing that you mention, I am also concerned that things such as various JavaScript templating engines and things like JSP Expression Language, etc. could further complicate this.
So, yes, I think it is much more complex. (That doesn't assume that it's a pointless exercise, but just that that such complexity should probably deferred to a separate cheat sheet.)
Do you still see the parsing inconsistencies in ever green browsers, as IE is long past.
I haven't since MSIE is (mostly) dead. (I presume there still plenty of idiots out there somewhere using unsupported versions on unsupported Windows OS, but if they are doing that, they have much bigger problems than XSS.)
And as long as any new developed web browsers use well-established browser engines such as Gecko, WebKit, or Blink, it probably won't be a problem. in the future. But a look at https://en.wikipedia.org/wiki/Comparison_of_browser_engines shows history doesn't always work that way. Someone always has to develop their own mousetrap. So, eventually, I'm confident that it will happen again in some niche browser. There probably inevitably be an HTML 6 someday as well, with more chances to add complexity and buggy code.
devs would interpret it as it's perfectly okay , do you mean ignoring the guidance altogether? If so, does ignoring the advice reduce the real-world risks we see in CVEs where improper encoding functions (e.g., htmlspecialchars without ENT_QUOTES) led to vulnerabilities?
To put it more simply, if you tell a lot of devs (unless they work for companies with a very strong security culture, which is not the norm) that you should "normally rewrite your code when it invokes such multiple parsers, unless you absolutely can for reasons X, Y, or Z, in which case you can follow these more complex rules and use nested encoding", then I'm convinced that significant portion of those devs will convince themselves that they have reasons X, Y, or Z and just ignore the mainline advice and attempt nested encoding and then inevitably shoot their own leg off. There is significant (I'd say, at least 10%) of devs that choose the wrong context when only a single encoding is needed. (E.g., most common is they use HTML attribute encoding for JavaScript event handlers or for 'src' or 'href' attributes.) So, I prefer not to give them the option to follow the "Here Be Dragons" path. Some of those who do will even try to convince their managers that "It was in the OWASP XSS Prevention Cheat Sheet", implying that it was okay. I don't think we want to give devs an excuse to use a technique that is almost certainly going to come back and explode in their face. If my estimate of 10% of the applied output encoding is using the incorrect context, how much does that figure go up when developers start to use multiple nested output encoding?
"much larger and very complex issue" - Would be great if you share some more information in the context of why.
I can't share specific examples, because:
- Truth be told, they are no longer that common. (I think most, if not all, of those Salesforce examples were contrived. In fact, I would conjecture that if devs who have strong AppSec teams saw that their devs wrote code the manner that the Salesforce examples was written, their AppSec teams would insist that those developers be forced to write code while wearing mittens for the next 6 months.)
- Any recent examples that am currently aware of (there are a few; mostly variations of processing URL query parameters in a JavaScript context) are proprietary and therefore I am unable to share them.
But the main complexity I am thinking of is that when multiple parsers are involved, you need to know the proper nesting order. If you nest them in the wrong order, it won't save you from XSS.
That said, @jmanico, @szh, and @mackowski think its wise to break out nested encoding into a separate XSS Prevention Cheat Sheet that has a title that will warn people of the inherent dangers, I think we can do so if we're very careful as it might be helpful in specific edge cases. Knowing human nature (specifically, vanity and laziness), I still think it probably will have a net negative impact and as am AppSec engineer, I would never allow it except as a short term fix. But if we name is something appropriate such as:
Questionable Techniques for Preventing XSS
and erect some other reasonable guardrails to clearly make it a suggestion of last resort, then sure, if your up to it. But keep in might that a lot more work than the simple change that you initially expressed of:
Encoding Type: HTML Attribute Encoding Encoding [sic] Mechanism: Encode all characters with the HTML Entity &#xHH; format, including spaces, where HH represents the hexadecimal value of the character in Unicode. For example, A becomes A. All alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
Anyway, cathartic as this exchange has been, given that I have had a recent death in my family that needs attending to, I prefer to allow @jmanico, @szh, and @mackowski time to share their perspective before I pipe in again (which likely won't be for at least a week).
@kwwall Thank you so much for your extensive responses on this. I'm terribly sorry to hear about your loss!
@jmanico and @mackowski I'll defer to you on this decision since I'm no longer as up to date on frontend security as I once was, but I just want to keep the perspective that these are cheat sheets, and should address the majority of cases, not necessarily every eventuality that may come up. If we're talking about a very niche case, it may be ok to just leave it unaddressed.
but I just want to keep the perspective that these are cheat sheets, and should address the majority of cases, not necessarily every eventuality that may come up. If we're talking about a very niche case, it may be ok to just leave it unaddressed.
I agree 100%. We want brief actionable guides. And I know its hard. It's actually a lot harder to write a concise cheatsheet vs. a verbose one, but that is our goal. Kevin is doing a great job in his commentary and we are close to agreeing on a small edit to this cheatsheet.
Nice conversation everyone! Well done!