flowfuse icon indicating copy to clipboard operation
flowfuse copied to clipboard

Make the embedded NR Editor work in a cross domain-setup

Open cstns opened this issue 1 year ago • 13 comments

Epic

#3646

Description

As a: user

I want to: Be able to use the embedded NR editor while the editor and app domains differ

The issue is easily reproduced in production where the app domain is .flowfuse.com and instance domains are .flowforge.cloud.

In order to access the embedded editor, use chrome, and from the instance overview page, replace the /overview path with /editor

Which customers would this be availble to

Everyone - CE/Starter/Team/Enterprise

Acceptance Criteria

I am able to access and manage the embedded Node-RED editor using any browser.

Have you provided an initial effort estimate for this issue?

I have provided an initial effort estimate

cstns avatar Apr 30 '24 16:04 cstns

Currently blocked by https://github.com/FlowFuse/nr-launcher/pull/230 and https://github.com/FlowFuse/flowfuse/pull/3801

@cstns is investigating whether there are workaround here, @cstns can you report here with a status summarising the explicit blockers, and what prospect (if any) we have of unblocking this please? I'm considering abandoning and moving to something else given the time it's taking.

joepavitt avatar May 08 '24 13:05 joepavitt

sure thing!

Spoke with @hardillb earlier this morning and he pointed me to the oauth lib that handles auth between the editor and ff.

I also stumbled across https://github.com/jaredhanson/passport/issues/938 in which the author of the lib kind of leaves the cross domain implementation part to the specific implementation of each use case.

I'm currently trying to circumvent the lib's inability to set the connect.sid cookie on cross domains.

ETA wise, I really can't give a precise estimation because i'm somewhat at a loss because debugging the nr-launcher and the entire oauth/passport.js lib is cumbersome.

I would greatly appreciate some support on it if possible

cstns avatar May 08 '24 13:05 cstns

I have a feeling that running the entire ff suite (and instances) with valid tls certificates locally would solve the issue, the set-cookie headers are being sent, but the cookie is not being persisted.

This may be due to the fact that the cookie's SameSite defaults to lax and is not taken into consideration by the browser. In order to be able to set the cookie, the SameSite attribute would need to be set to none, but that would need the Secure attr which in turn requires valid tls certs over https.

Unfortunately, the whole scenario cannot be replicated on the staging environment because of the identical domains for the FF app and instances and would just set the cookies from the start.

cstns avatar May 08 '24 15:05 cstns

@cstns lets get some time together this week to dig into this together. I've spent a lot of time in the depths of passport and the oauth flows. I just need to understand a bit more on where this is currently failing; we know we can do cross-domain authentication because that's how it already works when not being embedded in an iframe. I'm not yet up to speed on how the iframe part is breaking things.

knolleary avatar May 08 '24 16:05 knolleary

I would greatly appreciate it!

cstns avatar May 09 '24 07:05 cstns

This is what we're dealing with in a nutshell: image

cstns avatar May 09 '24 07:05 cstns

For future reference, this blog post describes the issue we're hitting very well: https://www.codeproject.com/Articles/5330276/Cross-Domain-Embedding-Making-Third-Party-Cookies

knolleary avatar May 09 '24 16:05 knolleary

@knolleary does that mean we can circumvent the problems?

joepavitt avatar May 09 '24 16:05 joepavitt

Status update:

After setting up a local docker installation that uses valid TLS certificates over HTTPS with different domains, I was partially successful in loading the NR editor in an embedded iframe in chrome.

By partially successful I mean that we overcame the browser's security policies by assigning the CSP headers which allowed the NR editor to load but got stuck in an authentication loop.

Might be related to: #2841, #2582, #1184 and NR #4684

NR #4684 might resolve the problem, we should check with NR v3.1.10 or v3.4 when they come out.

In the whole oauth process, the connect.sid's SameSite needs to be set to none on the NR side in order for the cookie to be picked up by the NR instance domain and complete the authentication callback.

Currently the cookies SameSite attribute defaulted to lax which causes a strange situation where the cookie is present on the FF domain (with the correct NR instance's domain set) but not on the NR instance domain.

image image

N.B my dev setup used the

  • *.docker.flowforge domain for the FF app
  • *.docker.flowfuse for the NR instance domain

cstns avatar May 10 '24 13:05 cstns

The linked NR issue regarding login loop will stop the loop, but only to show a login prompt in Node-RED. It won't address the core issue here related to the cookie options required to be set by Node-RED's session handling.

I will look at the Node-RED side on where this cookie is being set, and what would be needed to get the cookie options to be configurable.

We also need to figure out the impact all of this has of the localfs setup which doesn't use https - because all of this cookie handling is contingent on https usage. That will complicate localfs setups (including local dev setups - although there appears to be some leeway available when working purely on localhost domains).

knolleary avatar May 10 '24 14:05 knolleary

The impact for development will be quite limited, as you said, localhost is quite lenient specifically for these types of scenarios and localfs won't require valid https configuration unless the NR instances and FF app reside on different domains.

cstns avatar May 10 '24 14:05 cstns

unless the NR instances and FF app reside on different domains.

With the localfs driver, the core app and node-red instances run on different ports. If the instance is being accessed via an external IP address (ie not localhost), then the different port numbers mean they are treated as different domains. So I'm pretty certain that will be impacted.

knolleary avatar May 10 '24 14:05 knolleary

Port-wise, we're safe, the documentation for CSP headers frame-ancestor host-sources directive allows the use of wildcards.

If we're successful in resolving the cookie issue, the next step would be to set the CSP headers automatically from the runtimeSettings in nr-launcher and that should cover the scenario for the port differences.

Unfortunately it still won't work without Https because the cookie will not be set properly.

cstns avatar May 10 '24 14:05 cstns

POC of cross domain setup with valid tls certificates

https://github.com/FlowFuse/flowfuse/assets/20094749/106cb04d-428b-4222-88c8-8fc6f908c0f4

Changes required are as follows:

  • @flowfuse
    • we need to allow the app to embed itself for auth redirects
    • setting Content-Security-Policy header with "frame-ancestors 'self' *.main-ff-domain:*" *.main-ff-other domain:*" directive
      • the self directive would suffice in this case, but we can also add a custom list of domains that accept wildcards for domains/subdomains/ports
      • this directive determines what domains can embed the FF app
  • @nr-launcher
    • we need to configure NR instances by adding a global CSP header to determine a list of domains which can embed the NR editor
    • setting Content-Security-Policy header with "frame-ancestors 'self' *.main-ff-domain:*" *.main-ff-other domain:*" directive
    • these headers should be determined programmatically as opposed to the flowfuse repo as domains might differ
    • the self directive should not be required here, but left it as is
  • @node-red
    • we would need to add additional cookie attributes on the oauth callback method in @node-red/packages/node_modules/@node-red/editor-api/lib/auth/index.js:167
    • ```js
        cookie: {
          sameSite: 'None',
          secure: true,
          partitioned: true, // not available yet
        },
      ```
      
    • change needs to be added in lib/runtimeSettings.js:327

Cookie CHIPS

On a different note, Google is apparently in the middle of phasing out 3rd party cookies:

Google's decision to phase out the third-party cookie in Chrome was announced in 2020 and has been postponed a couple of times. Then as of May 2023, Google declared that they had reached a point of no return and would begin the process in January 2024

Luckily, they postponed it yet again to Q4 in 2024.

The way they're going forward is with 3rd Party Cookie CHIPS (Cookies Having Independent Partitioned State ), which is currently only available in chrome an is more or less namespace-ing for cookies.

This entire shift to cookie with CHIPS is so new that is causing problems everywhere, and, in my opinion Google will just postpone their phasing out yet again. There are a lot of articles online on this matter.

Long story short, we need to set the partitioned attribute on the express session in NR as well, but can't just yet. There's an open issue on the expressjs https://github.com/expressjs/express/issues/5275 which sums everything up nicely.

Express uses the cookie.js lib for cookie management, but cookie.js doesn't support the partitioned cookie attribute only from v0.6.0 onward, a version that's not currently bundled with express. - a straightforward fix would be https://github.com/expressjs/express/issues/5275#issuecomment-1851733379 in the same issue. - another fix would be to avoid the partitioned attribute altogether and handle it when time comes as will everybody else

As an aside, the changes Google will be implementing will impact all cross domain interactions moving forward, not only this specific scenario

N.B. Google's 3rd party deprecation info

cstns avatar May 23 '24 13:05 cstns

With the addition of https://github.com/node-red/node-red/pull/4718 to NR, we will be able to configure session cookie arguments via the nr-launcher.

Once NR v4.0 is released, the immersive editor will become usable. I added a semver check on the NR version (>=4.0) and nr-launcher version (>=2.4.1) for the editor button to point to the immersive view.

One thing to note, due to the fact that the flowforge.yml content_security_policy directive is disabled by default, these changes will impact localfs driver users that have their setup configured in a cross domain manner and don't enable the content_security_policy option. These changes will not have any effect for local development while both NR & FF domains are identical.

cstns avatar Jun 03 '24 14:06 cstns

Reopening to document the current blocker on the immersive editor and NR 4.0.

We need to update our ingress configuration to ensure the x-forwarded-proto header correctly identifies the original request as https by the time it reaches Node-RED. Without that, the session cookie will not get set when marked secure as the request will not pass the 'is secure' check done by express-session.

@ppawlowski @hardillb please update with details and plan for resolving

@cstns you have mentioned another CSP issue related to images. I'm not clear on the details - can you summarize so we can get it sorted?

knolleary avatar Jun 19 '24 11:06 knolleary

I remember seeing a CSP error regarding the worker-src/img-src/font-src/style-src directives failing. Can't remember which at the moment, can't find the thread where I saw it and neither can I replicate it in prod.

What I wanted in the end is to raise a point that we might add the runtime config domain to these directives as well

cstns avatar Jun 19 '24 11:06 cstns

Once https://github.com/FlowFuse/CloudProject/pull/440 is resolved/merged, would this then be good to go (in theory?)

joepavitt avatar Jun 19 '24 11:06 joepavitt

I have enabled proxy protocol between load balancer and ingress nginx on staging . An example of headers passed to the backend service: https://forge.flowfuse.dev/headerstest (manually created debugging service, I will remove it afterwards). Since this change causes short downtime, please confirm it solved the issue before I will apply it on production.

ppawlowski avatar Jun 19 '24 12:06 ppawlowski

@knolleary can you verify please?

joepavitt avatar Jun 19 '24 12:06 joepavitt

I will doing some testing on staging to verify.

My main concern with this whole feature at this point is that if it requires configuration changes outside of the helm setup, then how do self-hosted customers handle it? What documentation do we need to have in place as part of the upgrade guide?

Do we need a way to disable the feature via config in case a customer has a setup that breaks it somehow? Otherwise they will be locked out of the editor.

knolleary avatar Jun 19 '24 12:06 knolleary

@ppawlowski sorry, should have tagged you into the above comment on documentation for self-hosted users.

knolleary avatar Jun 19 '24 12:06 knolleary

if it requires configuration changes outside of the helm setup,

Can you expand on what these are please? I suspect in most cases the NR instances and the FF instance would be at the same domain (as per our staging), FF Cloud's setup I would expect to be fairly unique, but I may be wrong

joepavitt avatar Jun 19 '24 12:06 joepavitt

@joepavitt the PR you linked to: https://github.com/FlowFuse/CloudProject/pull/440

knolleary avatar Jun 19 '24 12:06 knolleary

and we'll have no way of knowing, within the scope of FF whether or not that's been correctly configured?

joepavitt avatar Jun 19 '24 12:06 joepavitt

Correct. We have to provide the cookie options statically to Node-RED before we ever see a request come in which would tell us if it was using http or https.

knolleary avatar Jun 19 '24 12:06 knolleary

So, our options are:

  1. On by default for all FF - which will break for all self-hosted instances
  2. Configuration Option - toggled on, it'll run, but requires pre-configuration meaning barely anyone outside of FF Cloud will use it
  3. We don't deliver at all

We're already constrained by NR4.0+ and a min. nr-launcher version too, right?

joepavitt avatar Jun 19 '24 12:06 joepavitt

Changes introduced in https://github.com/FlowFuse/CloudProject/pull/440 are scoped to the Nginx Ingress Controller. We are not installing any ingress controller as a part of our Helm chart. However, we do require an Nginx Ingress Controller installed for a self-hosted solution.

In my opinion, we should:

  • make the option configurable in the app or make an explicit requirement for the self-hosted infrastructure, that if it is running behind any kind of load balancer, headers should be passed correctly to the backend (especially x-forwarded-proto header)
  • update our documentation regarding ingress controller here , here and here

ppawlowski avatar Jun 19 '24 12:06 ppawlowski

  1. On by default for all FF - which will break for all self-hosted instances

It would only affect self-hosted instances where they have different domains for their FF app & instances

cstns avatar Jun 19 '24 13:06 cstns

I can confirm on staging the x-forwarded- headers are now being passed through. However something is still broken in the authentication when we enable secure cookies. I will need to debug further to see where it's falling down.

knolleary avatar Jun 19 '24 13:06 knolleary