OAuth2 filter: fix handling of callback URLs with non-ASCII characters
Commit Message: Fix handling of URLs with non-ASCII symbols in the OAuth2 filter
Additional Description: The http_parser library was replaced with the GURL library for URL validation in the OAuth2 filter.
Risk: Medium
Testing: Tested manually. In my view, the existing oauth2_filter tests are sufficient to cover these changes.
Release notes: Fixed processing of callback URLs with non-ASCII characters in the OAuth2 filter.
Fixes issue #23167
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/)
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/)
Can you provide more context around this change? This is huge and it sounds quite scary from a security perspective. Thank you.
/wait
It's strange that it shows so many commits and changed files. I have only changed 2 files. Maybe I'll recreate the PR.
It's better now. You can look at a single commit with small changes.
I only replaced the http_parser library with the GURL library in the pieces of code that parse the state query parameter from the callback URL.
The http_parser library cannot parse a string into a URL object if the URL contains non-ASCII characters. This breaks the successful filter flow.
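To illustrate the difference, here is a minimal sketch of parsing such a URL with GURL (the URL value is hypothetical; GURL is Chromium's URL library, assumed available through Envoy's url dependency):

```cpp
#include <string>

#include "url/gurl.h" // Chromium's GURL library

int main() {
  // A decoded callback URL whose state parameter contains raw UTF-8
  // (non-ASCII) characters. http_parser fails to parse this string,
  // while GURL accepts and canonicalizes it.
  GURL callback_url("https://example.com/callback?state=путь");
  if (callback_url.is_valid()) {
    // query() returns the canonicalized query string.
    const std::string query = callback_url.query();
  }
  return 0;
}
```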
I rebuilt the Envoy binary and deployed the new version in my company's production environment. The fix works great. I'm looking forward to someone taking a look at the PR. Is anyone going to take a look? Thanks
But non-ASCII characters are not allowed in the URI path according to https://datatracker.ietf.org/doc/html/rfc3986#section-3.3?
Are you trying to fix this for an application that sends a malformed URI in the request?
/wait-any
@yanavlasov , No, no. Did you see the bug description?
I agree that non-ASCII characters are not allowed in the URI path. But all non-ASCII symbols are percent-encoded according to the same spec (https://datatracker.ietf.org/doc/html/rfc3986#section-2.1). Any application can send Unicode characters in a URI using URL encoding. My application sends everything correctly.
The OAuth2 filter receives a URI consisting only of ASCII symbols. But after the filter percent-decodes the URL into a string, that string already contains the decoded non-ASCII characters. At that point the http_parser library can no longer parse such a string as a URL. The GURL library is fine; it works correctly with URLs containing non-ASCII characters. We pass the decoded URL to the library's method.
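To make that concrete, a small sketch (hypothetical URL value, assuming Envoy's Http::Utility::PercentEncoding API from source/common/http/utility.h):

```cpp
#include <string>

#include "source/common/http/utility.h"

// The callback URL arrives on the wire fully percent-encoded (ASCII only),
// carried as the value of the state query parameter.
const std::string encoded =
    "https%3A%2F%2Fexample.com%2Fcallback%3Fstate%3D%D0%BF%D1%83%D1%82%D1%8C";

// After decoding, the string contains raw UTF-8 bytes ("путь"), which
// http_parser rejects as a URL but GURL parses successfully.
const std::string decoded =
    Envoy::Http::Utility::PercentEncoding::decode(encoded);
```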
I see. This is a bug in the filter. The Http::Utility::PercentEncoding::decode function decodes every single character, but in fact only characters from the reserved and unreserved character sets should be decoded, if I understand it correctly.
I do not think this is the right change, since it will produce an incorrect redirect URL with unescaped non-ASCII characters. The right fix is to add a new Http::Utility::PercentEncoding::decodeURI function that only decodes characters from the reserved and unreserved sets and leaves all other encoding sequences as-is.
/wait-any
I don't mind your suggestion. But I am not clear on how to implement such a function. Do you have a chance to create it? I can just revert my changes, switch to your function, and test after that.
Also note that in the filter implementation we compare the URL object with the URL from the filter's settings. If the URL contains non-decoded characters, the user needs to know about it and has to set a URL-encoded value in the filter settings. I don't think that's user friendly.
I unfortunately do not have time to work on fixing this, but I can help you implement it.
You can see how the Http::Utility::PercentEncoding::decode function is implemented here: https://github.com/envoyproxy/envoy/blob/8285fca7c3da4b0ab9aa35e3febab659164ba987/source/common/http/utility.cc#L1024
The Http::Utility::PercentEncoding::decodeURI is the same, but with an additional check here https://github.com/envoyproxy/envoy/blob/8285fca7c3da4b0ab9aa35e3febab659164ba987/source/common/http/utility.cc#L1045 to see if ch is in either the reserved or unreserved set according to the RFC: https://datatracker.ietf.org/doc/html/rfc3986#section-2.2. If it is, it should be decoded; if it is not in either of these sets, ch should be set to % and i += 2; is skipped.
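In code, something like this rough sketch (not Envoy's actual implementation; the helpers hexValue and isReservedOrUnreserved are made-up names for illustration):

```cpp
#include <cstddef>
#include <string>

#include "absl/strings/string_view.h"

namespace {

// Value of a hex digit, or -1 if the character is not a hex digit.
int hexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// True for the reserved (gen-delims / sub-delims) and unreserved sets
// from RFC 3986 section 2.2.
bool isReservedOrUnreserved(char ch) {
  if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
      (ch >= '0' && ch <= '9')) {
    return true; // unreserved ALPHA / DIGIT
  }
  switch (ch) {
  case '-': case '.': case '_': case '~':                               // unreserved
  case ':': case '/': case '?': case '#': case '[': case ']': case '@': // gen-delims
  case '!': case '$': case '&': case '\'': case '(': case ')':          // sub-delims
  case '*': case '+': case ',': case ';': case '=':
    return true;
  default:
    return false;
  }
}

} // namespace

// Like decode(), but only %-sequences whose decoded character is in the
// reserved or unreserved set are decoded; any other sequence (e.g. %D0)
// is copied through unchanged.
std::string decodeURI(absl::string_view encoded) {
  std::string decoded;
  decoded.reserve(encoded.size());
  for (size_t i = 0; i < encoded.size(); ++i) {
    const char ch = encoded[i];
    if (ch == '%' && i + 2 < encoded.size()) {
      const int hi = hexValue(encoded[i + 1]);
      const int lo = hexValue(encoded[i + 2]);
      if (hi >= 0 && lo >= 0) {
        const char candidate = static_cast<char>(hi * 16 + lo);
        if (isReservedOrUnreserved(candidate)) {
          decoded.push_back(candidate);
          i += 2; // consume the two hex digits
          continue;
        }
      }
    }
    decoded.push_back(ch); // leave '%' and everything else as-is
  }
  return decoded;
}
```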
Then you can add unit tests similar to how it is done here: https://github.com/envoyproxy/envoy/blob/8285fca7c3da4b0ab9aa35e3febab659164ba987/test/common/http/utility_test.cc#L1614. In your case you can just decode a string with every ASCII character %-encoded and check that only the expected characters were decoded; a sketch of such a test follows.
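For instance, a minimal test in that style (assuming the decodeURI sketch above and GoogleTest, abbreviated here to a few representative characters rather than the full ASCII range):

```cpp
#include "gtest/gtest.h"

TEST(PercentEncodingTest, DecodeUriOnlyDecodesReservedAndUnreserved) {
  // %41 ('A', unreserved) and %2F ('/', reserved) should be decoded;
  // %D0 (a non-ASCII byte) must stay percent-encoded.
  EXPECT_EQ(decodeURI("%41%2F%D0"), "A/%D0");
}
```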
We can then go through making the changes in the OAuth filter.
I have also added the current owners of this filter, @derekargueta @snowp, in case they would want to take this on.
@yanavlasov , could you provide example input and output strings for a test? I am not sure I clearly understand what I need to do. It's better to have an example.
The standard allows query to contain: unreserved | pct-encoded | sub-delims | : | @ | / | ?
This allows 4 characters out of the gen-delims set to be unencoded and only requires the #[] characters to be encoded.
The bigger question is whether the redirect URI also contains a query string. I do not know if OAuth allows query and fragment in the redirect URL. If it does, it means that the key/value separators in the redirect URL also have to be %-encoded to prevent incorrect interpretation.
If query and fragment are allowed in the redirect URL, and assuming that the query key/value separators are = and &, then the test strings are:
input: %00%01%02 ...<replace ... with %-encoded sequences for all intervening characters> %FE%FF%
output: %00%01%02...%22#%24%25&%27...%3C=%3E...%5A[%5C]%5E...%FE%FF%
This is my best approximation for how a URL should be encoded as part of a query parameter.
/wait
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
I don't have time to rework the changes right now. I hope I will find time for this.
You suggest creating a special decode function. I think there are already two functions to encode and decode URLs, and they work excellently. Why do we need to create a third function for URLs? You suggest doing only a partial decoding in the new function. That sounds strange.
Besides that, I don't produce an invalid URL. The filter decodes the URL. The Http::Utility::PercentEncoding::decode function works correctly according to the specification. After decoding, the redirect URL is structurally valid. The GURL library confirms that it is a valid URL, unlike Http::Utility::Url. We work with the decoded URL in the filter. The redirect URL will be encoded back by the filter at the end.
The GURL library receives just a plain URL. The URL does not need to be encoded. The Http::Utility::Url library should also work with any structurally valid decoded URL, but it doesn't, and that's the main problem.
@yanavlasov, @mattklein123, I think my changes are correct and fix the problem. I have explained above why I think so. And I hope Envoy will move to the GURL library as soon as possible.
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
I am too busy to update the code at the moment.
This PR may be closed in favor of #25278.