open-uri
open-uri copied to clipboard
Enhance security by removing Authorization header on HTTP redirects
When redirects occur, the Authorization header containing authorization information is transferred to the redirect destination host. This can expose sensitive credentials to unintended hosts, posing a security risk.
I will introduce a case where the Authorization header is unintentionally transferred during redirection.
Case
The issue occurs when downloading GitHub Actions Artifacts using the GitHub REST API.
- https://docs.github.com/ja/rest/actions/artifacts#download-an-artifact
The process is as follows.
- Authorize using the Authorization header.
- Redirect to a URL containing a Shared Access Signature (SAS) for downloading the artifact.
- Download the artifact.
During the redirect in step 2, the Authorization header from step 1 is retained and transferred to the redirect destination. This results in an authorization error due to the redirection destination misinterpreting the Authorization header content as the SAS.
Reproduce code
Here is a Ruby code example that demonstrates the problem, resulting in a 403 error because the Authorization header does not match the expected signature format.
require 'open-uri'
uri = ENV['ARTIFACT_URL']
access_token = ENV['GITHUB_ACCESS_TOKEN']
URI(uri).open("Authorization" => "token #{access_token}")
$ ARTIFACT_URL="https://api.github.com/repos/:owner/sandbox-github-actions-artifacts-v4/actions/artifacts/:artifact_id/zip" \
GITHUB_ACCESS_TOKEN="your_access_token" \
ruby download_artifact.rb
/home/kodama/.rbenv/versions/3.3.2/lib/ruby/3.3.0/open-uri.rb:376:in `open_http': 403 Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. (OpenURI::HTTPError)
Workaround
When redirecting without transferring the Authorization header content, the download succeeds without errors. This means the Authorization header is retained during redirects.
require 'open-uri'
uri = ENV['ARTIFACT_URL']
access_token = ENV['GITHUB_ACCESS_TOKEN']
headers = {
"Authorization" => "token #{access_token}",
redirect: false
}
loop do
begin
URI.open(uri, headers)
break
rescue OpenURI::HTTPRedirect => redirect
headers.delete("Authorization")
pp uri = redirect.uri
end
end
$ ARTIFACT_URL="https://api.github.com/repos/:owner/sandbox-github-actions-artifacts-v4/actions/artifacts/:artifact_id/zip" \
GITHUB_ACCESS_TOKEN="your_access_token" \
ruby download_artifact.rb
#<URI::HTTPS https://productionresultssa11.blob.core.windows.net/actions-results/xxx>
Expected Improvement
We aim to improve security by the Authorization header is removed during redirects if necessary, preventing its inadvertent transfer to the redirect destination host.
By the Fetch specification, it recommends removing the Authorization header on cross-origin redirects.
- https://fetch.spec.whatwg.org/#http-redirect-fetch
Supplemental Information
Examples of software or specifications that do not transfer the Authorization header on redirect:
Curl
Curl does not transfer the Authorization header on redirects and this issue is reported as a CVE.
- https://github.com/curl/curl/commit/af32cd3859336ab963591ca0df9b1e33a7ee066b
- https://curl.se/docs/CVE-2018-1000007.html
GO
Go removes the Authorization header when redirecting to different hosts but retains it if the redirect is to the same host.
- https://github.com/golang/go/blob/b3040679ad0eccaaadb825ed8c0704086ecc23eb/src/net/http/client.go#L41-L49
- https://go-review.googlesource.com/c/go/+/28930
Proposed Solutions
We are considering two approaches regarding the handling of the Authorization header during redirects.
- Always remove the Authorization header when redirecting.
- Remove the Authorization header when redirect to cross-origin.
This PR implements approach 2. based on the Fetch API specification, ensuring better compatibility and security standards.
Rebased with master branch
@akr Thank you very much for your maintenance efforts. When you have time, could you review this PR?
@akr Sorry to bother you while you're busy. If there is any missing information or unclear parts that would help in reviewing this PR, please let me know. I’m more than happy to prepare for anything needed to assist with the review.
I understand the motivation. We need to customize header fields for each request.
However, I feel that hard coding the field name "Authorization" is not a good idea.
How about adding a keyword argument to specify request-specific header fields?
uri.open(request_specific_fields: { "Authorization" => "token #{access_token}")
The hash specifies the header fields only before redirecting.
Note that the argument can be extended with procedures.
uri.open(request_specific_fields: -> (uri) do {...} end)
The procedure is called for each request and returns additional header fields for the request.
@akr Thank you for explaining the proposed method to customize header fields for each request using a hash or a block. I think the policy to allow customizable headers is a great approach. Could you help me clarify the specification about the following points?
Specify a Hash
When we pass a hash as request_specific_fields, my understanding is that this configuration is intended only for the initial request and does not persist through redirects. Is it correct to assume that any headers specified, including but not limited to the "Authorization" header, are automatically removed by the library for any subsequent requests?
Specify a Block
Regarding the use of a block, does the library evaluate this block with the new URL for each request, allowing for dynamic header customization at every step of the request process, including during multiple redirections?
Please let me know if my understanding is incorrect. Thank you for helping me.
Additionally, I have a suggestion regarding the default behavior for the "Authorization" header. Could you share your thoughts on this?
Suggestion about the Default Behavior for "Authorization" Header
Considering that users might not always remember to specify the "Authorization" header explicitly in request_specific_fields, do you think it would be beneficial to have a default implementation that automatically excludes this header from subsequent requests? This could prevent potential security issues by ensuring sensitive headers are not inadvertently sent across redirects unless explicitly included by the user. If a user wishes to customize this behavior further, they could override this default implementation by specifying request_specific_fields.
Your understanding of "Specify a Hash" and "Specify a Block" is correct. ("Block" is not an appropriate word here. "Proc object" would be better, though.)
I don't accept 'Suggestion about the Default Behavior for "Authorization" Header'. It is the responsibility of the users.
If such a mechanism is introduced in open-uri, someone may report that some fields (such as "Cookie" field) are not treated like "Authorization" field as a security problem. Since I don't know all HTTP headers in the future, I expect it to occur someday. I don't want to receive such a security problem report. Thus, I don't want to introduce the mechanism.
Thank you for clarifying the specification. I also understand your decision not to introduce the default behavior for the Authorization header.
With that in mind, I will proceed to implement request_specific_fields using specify a Hash and specify a Proc object based on the specifications we discussed.
@akr
Thank you for your guidance on the header customization using a Hash and a Proc object. I have implemented the request_specific_fields option.
Could you please review the implementation to ensure it meets our specifications?