dspace-angular icon indicating copy to clipboard operation
dspace-angular copied to clipboard

Option to force HTTPS in origin

Open LotteHofstede opened this issue 3 years ago • 2 comments

References

  • Fixes #1721

Description

This PR adds a new configuration parameter forceHTTPSInOrigin to force to return the origin URL with the https:// protocol. This PR also uses the location.replace() function to redirect and make sure the browser can keep track of the browse history.

Instructions for Reviewers

In the environment variables, in the list of ui configurations set forceHTTPSInOrigin to true. Test that the citation_pdf_url on an item page and make sure it always returns a URL that starts with https://.

To test the redirect change, download a file, and use the back button of your browser. You should end up back on the item page

Checklist

  • [x] My PR is small in size (e.g. less than 1,000 lines of code, not including comments & specs/tests), or I have provided reasons as to why that's not possible.
  • [x] My PR passes TSLint validation using yarn run lint
  • [x] My PR doesn't introduce circular dependencies
  • [x] My PR includes TypeDoc comments for all new (or modified) public methods and classes. It also includes TypeDoc for large or complex private methods.
  • [x] My PR passes all specs/tests and includes new/updated specs or tests based on the Code Testing Guide.

LotteHofstede avatar Jul 15 '22 12:07 LotteHofstede

is this something that we should really manage on the dspace side? to force a redirect to https (that is a best practice that we always enforce with an HSTS header) there are specific directive on the web server / balancer / gateway that will be put in front of node in any production deployment. I'm just not sure that we should implement our own solution for area that are under the responsibility of other component. I would like to hear other opinions on that

abollini avatar Jul 23 '22 20:07 abollini

The goal is not to force a redirect, the goal is to solve the problem that if there is something in front of the node server that redirects from a public https url to an internal http url, that the code we use to determine the origin server side has no way of knowing that. As a result, things that use that getCurrentOrigin method, such as the citation_pdf_url meta-tag get the wrong protocol.

We added this forceHTTPSInOrigin boolean rather than a whole separate "public" UI section with the entire URL, because I think two UI sections would be confusing, and we are able to automatically determine the host.

I considered defaulting it to true as using HTTPS is indeed best practice, but that would mean it wouldn't be automatically backwards compatible.

artlowel avatar Jul 25 '22 07:07 artlowel

@atarix83 or @abollini : Have either of you had a chance to glance at this again? I'd like to get this merged soon if possible, as it's already at +1.

tdonohue avatar Sep 07 '22 17:09 tdonohue

The goal is not to force a redirect, the goal is to solve the problem that if there is something in front of the node server that redirects from a public https url to an internal http url, that the code we use to determine the origin server side has no way of knowing that. As a result, things that use that getCurrentOrigin method, such as the citation_pdf_url meta-tag get the wrong protocol.

We are still unable to reproduce that. We have exactly this kind of setup on many of the installations that we manage, a reverse proxy that deal with the HTTPS talking with an internal Angular/Node server over plain HTTP. Checking the metatag we see that in all the 7.2, 7.3 instances both the citation_abstract_html_url than the citation_pdf_url in HTTPS.

On an 7.1 installation we instead see that the citation_pdf_url is in HTTPS but the citation_abstract_html_url is in HTTP, we need to check deeper if this was a bug in the old version or some misconfiguration in the deployment.

abollini avatar Sep 15 '22 13:09 abollini

@abollini : This issue is reproducible on the demo site right now. Here's a page where the citation_pdf_url shows HTTP on the first load: https://demo7.dspace.org/entities/publication/1974d7d3-9648-4dbe-b083-56374f5314e4

The issue is specific to when the page is rendered on the server side. If you look at client side generated content, this bug will never appear. However, if the page is loaded via SSR (which is highly likely when search engines are indexing), then this bug will appear.

tdonohue avatar Sep 15 '22 13:09 tdonohue

@artlowel @tdonohue we check further and indeed the issue comes when the SSR kick-in. I was initially fooled by the reload that happen automatically in the browser, so it is much better test via the command line. That said, I'm still not convinced by the fix. The X-FORWARDED-FOR headers are used exactly to manage these scenarios and there is also a proto header here. I have done a quick search and found this npm library that is used to abstract the access to the request to get into consideration these forward headers if available. Can you take a look to that? could this provide a general solution that would not require extra configuration and will work with a proper configuration of the reverse proxy? https://www.npmjs.com/package/forwarded-for

abollini avatar Sep 15 '22 16:09 abollini

@artlowel : @abollini 's latest comment reminded me about the X-FORWARDED-PROTO: https header, which is also required to be set on the backend (REST API) whenever it is run behind a proxy. See this question in our Install docs

We may want to double check whether that header has any impact on the Angular/Node side as well. As it might be that, if we require the proxy to set X-FORWARDED-PROTO: https (just like on the backend), then our Angular code could just access that header to determine if it needs to use HTTPS or HTTP URLs.

tdonohue avatar Sep 15 '22 17:09 tdonohue

@abollini @tdonohue These x-forwarded headers are already present. I logged the request headers in the node server with the current httpd config on demo7.dspace.org and got the following:

{
  host: 'demo7.dspace.org',
  accept: 'text/html, application/rss+xml, application/atom+xml, text/xml, text/rss+xml, application/xhtml+xml',
  'accept-encoding': 'gzip,deflate',
  'user-agent': 'REDACTED',
  'x-forwarded-proto': 'https',
  'x-forwarded-for': 'REDACTED',
  'x-forwarded-host': 'demo7.dspace.org',
  'x-forwarded-server': 'dspace7-demo.atmire.com',
  connection: 'Keep-Alive'
}

The fact that they're present means they are not automatically taken into account when simply retrieving the protocol of the request.

@abollini The package you suggested wouldn't be a solution, it will give you the IP of the original requester, but says nothing about the origin or protocol of the public url

Something that could work is to test for the presence of those x-forwarded headers, and use their values instead of the default request properties for host and protocol if they exist. I've created https://github.com/DSpace/dspace-angular/pull/1850 which takes that approach. We should discuss which one to go ahead with in the meeting today.

artlowel avatar Sep 22 '22 11:09 artlowel

Closing, as this is replaced by #1850 (see discussion in that ticket)

tdonohue avatar Sep 27 '22 19:09 tdonohue