esapi-java-legacy icon indicating copy to clipboard operation
esapi-java-legacy copied to clipboard

ESAPI's default regex for HeaderValue has a minus-sign bug

Open xeno6696 opened this issue 4 years ago • 18 comments

^[a-zA-Z0-9()\\-=\\*\\.\\?;,+\\/:&_ ]*$

should be

^[a-zA-Z0-9()\\\-=\\*\\.\\?;,+\\/:&_ ]*$

xeno6696 avatar Nov 11 '20 02:11 xeno6696

@xeno6696 - You left off the '*' just before the '$' at the end. That's pretty important! ;-)

I'm guessing this regex was created as if you were going to put it into actual Java code as a String literal, hence the extra '' characters.

kwwall avatar Nov 11 '20 05:11 kwwall

I didn't complete the end of the line in the editor lol... it's fixed now.

xeno6696 avatar Nov 11 '20 13:11 xeno6696

Also, I believe inside a character class, if '-' is the last (and perhaps the first as well; I'd have to look it up), then it doesn't need to be quoted. E.g., "[0-9-]" would allow any single decimal digit or the '-' character, which you might want to use to create a regex for matching SSNs or CC#s, etc.

kwwall avatar Dec 05 '20 01:12 kwwall

Is this really a bug in ESAPI? If not, can you close it. If so, can you confirm it, and add a 'bug' label to it?

davewichers avatar Jan 23 '21 22:01 davewichers

A 'bug' in the sense that it likely wasn't intended. Seems unlikely that we would generally allow a leading '-' in an HTTP header value, although AFAICT, it should be harmless and it's probably allowed as per RFCs and/or W3C specs. (Haven't checked that because there are so many of them it's too hard to keep up.)

@xeno6696 - Please take another look and see if this needs fixed. If so, give me the word and I'll save you a PR since I am working on another release now.

kwwall avatar Jan 26 '21 01:01 kwwall

I posted the fixed version above: Just needs that extra escape. I didn't commit it anywhere, so given that it's a 1-char change just do it.

xeno6696 avatar Jan 26 '21 02:01 xeno6696

Whether or not we WANT minus signs in the header is a secondary issue: As originally intended we obviously wanted to allow minus signs, we just didn't escape it properly.

If you want Kevin, I could run all the rest of the regex through my debugger and let you know if any others are broken? Or that could be done later too.

xeno6696 avatar Jan 26 '21 02:01 xeno6696

When I want a minus sign in the character class, I generally place it as the first character where it it treated as a literal '-'. That said,

If you want Kevin, I could run all the rest of the regex through my debugger and let you know if any others are broken? Or that could be done later too.

That said, that would be good, especially if your regex check examines to see if the regex pattern is subject to ReDOS (regular expression denial of service).

kwwall avatar Jan 26 '21 03:01 kwwall

Oi... so far it affects every regex that has a minus symbol in the character class. I'd say something snarky but I clearly didn't catch it myself and I've been in and out of this file for years now =-/

I've fixed about 5 of them.

What I will do is fix all of them and then email you the fixed file.

xeno6696 avatar Jan 26 '21 03:01 xeno6696

Emailed @kwwall

xeno6696 avatar Jan 26 '21 03:01 xeno6696

As long as adding in a leading '-' doesn't break any tests, I'm fine with putting it in. If we do so though, jam it in as the first character, e.g., ^[-a-zA-Z0-9()=\\*\\.\\?;,+\\/:&_ ]*$ rather than ^[a-zA-Z0-9()\\\-=\\*\\.\\?;,+\\/:&_ ]*$ which is much harder to read and to get correct (obviously!). If it breaks tests, then despite what the original regex looked like, I'd say that the intent was not to allow '-' as a leading character.

kwwall avatar Jan 26 '21 03:01 kwwall

I just escaped the symbol. According to the debugger we were interpreting character ranges wherever it appeared.

xeno6696 avatar Jan 26 '21 03:01 xeno6696

If you want it that way I'll do a round in the morning :-)

xeno6696 avatar Jan 26 '21 03:01 xeno6696

That would be my preference. It might keep someone from undoing the change later.

-kevin

On Mon, Jan 25, 2021, 10:31 PM Matt Seil [email protected] wrote:

If you want it that way I'll do a round in the morning :-)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ESAPI/esapi-java-legacy/issues/579#issuecomment-767267035, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAO6PG7HDX2VTC43LVEH4PTS3YZPTANCNFSM4TRNO45A .

kwwall avatar Jan 26 '21 03:01 kwwall

From @kwwall 's email:

Update GitHub issue 579 and list all the properties in ESAPI.properties and validation.properties that had this problem that you have fixed.
Review these values and make sure that a '-' really should be allowed in these regexes. (We discussed it for Validator.HTTPHeaderValue but not really for any of the others that you changed.) And since we obviously had no test cases (well, at least not correct ones), they wouldn't have shown up in failed tests.
Write test cases for the regexes that we end up changing.

I think 1 and 2 are mandatory and 3 is advised, but I am interested in hearing others thoughts on this.

But the bottom line is, I am inclined to take a more cautionary approach and not just try to jam this into the next ESAPI release without thinking this through a bit more first. I don't think there's any particular rush for this. It only popped up because @davewichers was looking through the GitHub issues and asked if it was really a bug in ESAPI. (I'll bet Dave is the kind of guy who also has a neat desk! :)

Sometimes my drive to be incredibly concise does so at the expense of abandoning all context. Whichever issue this originally arose in, I had stated that my purpose was to cycle through and find the error throughout the files, which is not communicated anywhere in THIS issue except for when I started that part of the conversation. My Bad.

xeno6696 avatar Jan 28 '21 14:01 xeno6696

image

image

In validation.properties image

All files at present from /src/test/resources/...

xeno6696 avatar Jan 28 '21 14:01 xeno6696

@kwwall I'd say we take them a few at a time:

Validator.SystemCommand: I'd say the minus needs to stay unless we want to block flags.
Validator.HTTPServerName: Every place I've worked has had hostnames with hyphens. Validator.HTTPCookieName:
Validator.HTTPCookieValue:

Complicated but I researched this.

5.2.2. The Max-Age Attribute

If the attribute-name case-insensitively matches the string "Max- Age", the user agent MUST process the cookie-av as follows.

If the first character of the attribute-value is not a DIGIT or a "-" character, ignore the cookie-av.

This collapses up to Headers and Values generically meaning

Validator.HTTPHeaderName Validator.HTTPHeaderValue are also both legal candidates for the "-" symbol.

Will pause for comments.

xeno6696 avatar Jan 28 '21 22:01 xeno6696

Well, the problem, at least for Validator.HTTPServerName, is that AFAIK, valid host names cannot begin nor end with a hyphen and I'm pretty sure that they can't start with a period. So that regex at least, needs to be rewritten. (If we are assuming that this is trying to match a FQHN, then those can end with a '.'; e.g., "www.example.com." is a valid FQHN.

Likewise, there is a question as to whether an HTTP header name can start with a '-' or note. I've never seen one that can. Header values, sure. I can't image that's not allowed. But names are usually identifiers and identifiers typically are not allowed to start with a '-'. (Finding a complete, readable BNF for HTTP 1.1 would be helpful.) So I suspect that we may need to rewrite the regex for Validator.HTTPHeaderName as well. (A cookie name OTOH, is really part of an HTTP header value, so I'm okay with that one.)

kwwall avatar Jan 29 '21 01:01 kwwall