tutor icon indicating copy to clipboard operation
tutor copied to clipboard

Preview mode breaks in Maple if PREVIEW_LMS_HOST is not a subdomain of LMS_HOST

Open fghaas opened this issue 3 years ago • 12 comments

Bug description

As of Maple, if PREVIEW_LMS_HOST is set to something other than a subdomain of LMS_HOST, preview mode breaks since it unexpectedly prompts the user for authentication.

How to reproduce

  • Deploy a Tutor environment on Maple, but instead of following the standard naming naming convention of preview.lms.example.com, set PREVIEW_LMS_HOST to something like lms-preview.example.com.
  • Open a course in Studio, such as the demo course.
  • Open a unit in that course and click Preview.
  • Observe that you see an authentication prompt.

Environment Tutor 13.0.2.

Additional context In Maple, as explained in the release notes, there's no Preview for the Learning MFE yet, and previews render in the legacy LMS.

However, we can also disable the Learning MFE, at which point we get the legacy learning experience throughout. This, I suppose, might be something that quite a few sites will want to opt to use, until they can offer a smooth course authoring experience where preview mode and the published course do look the same again.

Now, if I open a course unit in Studio and then hit the Preview button, I am prompted to authenticate. I don't think that that should be happening. (I should add that authentication also does not work in that scenario, so preview mode effectively isn't functional at all.)

I am guessing (please correct me if I'm wrong here) that unlike Studio which now uses OAuth2 authentication, preview mode is still relying on a shared session cookie. And in 7c157eccd5f0dba69b023e3dd64ed12ce6cbeaa4, Tutor followed the recommendation from the docs and changed the value of SESSION_COOKIE_DOMAIN from the domain shared between LMS_HOST and CMS_HOST, to just LMS_HOST.

That happens to work fine for preview mode as well, as long as PREVIEW_LMS_HOST is indeed a subdomain of LMS_HOST. When it isn't, however, things break.

I am guessing that instead of bringing back

"SESSION_COOKIE_DOMAIN": ".{{ LMS_HOST|common_domain(CMS_HOST) }}"

(which templates/apps/openedx/config/lms.env.json and cms.env.json were using prior to 7c157eccd5f0dba69b023e3dd64ed12ce6cbeaa4), we could instead use

"SESSION_COOKIE_DOMAIN": ".{{ LMS_HOST|common_domain(PREVIEW_LMS_HOST) }}"

... but this will of course mean that the session cookie will be exposed to "a potentially large number of subdomains", which is exactly what the Studio OAuth2 change is aiming to avoid. So I'm not sure if that's a good way forward.

@regisb, what are your thoughts on this? How wrong am I? :)

fghaas avatar Jan 04 '22 15:01 fghaas

Hey Florian, sorry about the delay replying to your issue.

I believe that this should be triaged as an upstream bug. In a nutshell, and to paraphrase your description: users should not have to re-login to access the preview site, even when the preview site domain is not a subdomain of the LMS.

I am not a big fan of modifying the session cookie domain just to address this issue. I believe that a proper resolution should include oAuth authentication between the preview site and the LMS.

Let's pull @timmc-edx into the conversation here, since he is the one who helped me resolved this other issue.

regisb avatar Jan 19 '22 10:01 regisb

Oof, right. We actually got bit by this on edx.org, and what we ended up doing was moving the preview site (preview.edx.org) to be under the LMS (preview.courses.edx.org). I guess we then missed the step of adding to the release notes. My apologies!

Setting the session cookie domain to be broader would certainly work, although disrecommended for the noted reasons. Moving the preview domain would probably be the fastest fix, if acceptable.

The idea of having the preview site be an OAuth client of the LMS is interesting, although I don't know if it would work. My understanding is that the preview site is the LMS, just addressed by a different domain, so I don't know if trying it would break the regular LMS.

Some previous internal discussion...

Kyle:

There is another more radical potential solution: Get rid of the preview. domain differentiation in favor of appending something like ?course_preview=1 to trigger the preview view. There’d be some cascading change associated with it, namely, ensuring that the new query param is passed between all the interested views, and passing a preview_enabled boolean down to all the functions that currently look at directly at the request hostname. But I think, in the end, it be a saner system.

I realize the scope of that ^ is bigger than we’re hoping the preview.courses.edx.org domain change will be. But just something to keep in mind if it seems like the domain change scope is ballooning and there doesn’t seem like there’s an easy way to keep both certs and cookies happy at the same time.

(It appears there was some temporary issue with certs in the move, which may or may not be a stumbling block for other deployments. I haven't looked at the details.)

Dave:

Yeah, the painful thing here is that it’s the whole experience, including things like user masquerading, checking problem answers, etc. We’d have to make sure everything is consistent across the board in handling that parameter, or we’ll mix published and preview experiences.

timmc-edx avatar Jan 19 '22 15:01 timmc-edx

Hmm... @kdmccormick @ormsbee how hard would it actually be to make this change? It looks like everything reads PREVIEW_LMS_BASE via in_preview_mode on the LMS and generates URLs via get_lms_link_for_item on the CMS. It seems like the hard part here (besides updating unit tests and such) would be figuring out how to convey a new preview_enabled course parameter while navigating in preview mode. Does that sound right?

timmc-edx avatar Jan 19 '22 18:01 timmc-edx

@timmc-edx Welp, edx-platform already shamelessly uses crum to treat the current request as a global variable, so I imagine we could backwards-compatibly extend in_preview_mode to something like:

import crum
...
def in_preview_mode():
    """
    Returns whether the user is in preview mode or not.
    """
    hostname = get_current_request_hostname()
    preview_lms_base = settings.FEATURES.get('PREVIEW_LMS_BASE', None)
    if hostname and preview_lms_base:
        if hostname.split(':')[0] == preview_lms_base.split(':')[0]:
            return True  # DEPRECATED: We're in preview mode, per the hostname.
    request = crum.get_current_request()
    if request:
        if request.GET.get('course_preview') == '1'
            return True  # We're in preview mode, per the query param.
    return False  # Not in preview mode.

allowing use of either the preview domain or a query param.

kdmccormick avatar Jan 19 '22 18:01 kdmccormick

It seems like the hard part here (besides updating unit tests and such) would be figuring out how to convey a new preview_enabled course parameter while navigating in preview mode. Does that sound right?

@timmc-edx I just realized what you meant by that -- that hard part would be making sure that we each preview-mode courseware page links out to other preview-mode courseware pages, not published-mode courseware pages. Yeah, I agree; that does sound like the hard part.

kdmccormick avatar Jan 19 '22 19:01 kdmccormick

The preview URL is not something that people typically make permanent links/bookmarks to, so I would advocate for moving it to a sub-domain as was done in the edX case as the lowest risk change.

I echo your concerns: modifying in_preview_mode to take a querystring param feels like the sort of change that would need to trickle out to a bunch of places and that we'd always be patching things to send that param when we find the gaps. It's not just navigation. For instance, if a problem is created or modified in the draft branch and we're viewing it in preview mode, do we also need to pass that querystring to the XBlock handler endpoint when someone clicks "Submit"?

FWIW, I don't really know much about the state of preview in the LMS right now, and it's possible that it's already broken in this scenario. But at a high level, that's the kind of concern I'd have–changing everything at the domain level at least means that this sort of mode-management is automatically present by default.

ormsbee avatar Jan 19 '22 19:01 ormsbee

@timmc-edx:

Oof, right. We actually got bit by this on edx.org, and what we ended up doing was moving the preview site (preview.edx.org) to be under the LMS (preview.courses.edx.org). I guess we then missed the step of adding to the release notes. My apologies!

I'd like to reiterate that Tutor already does this (in other words, configures the preview domain name correctly by default), and only breaks if the user chooses to override the preview domain so that it is not a subdomain of the LMS, in a departure from Tutor's defaults. If I understand correctly, then preview in Tutor isn't broken out of the gate; it just breaks unexpectedly when the user overrides PREVIEW_LMS_HOST.

I'd posit that if changing to

"SESSION_COOKIE_DOMAIN": ".{{ LMS_HOST|common_domain(PREVIEW_LMS_HOST) }}"

(like I suggested upthread) is considered undesirable, then Tutor could maybe simply flag an error if PREVIEW_LMS_HOST isn't a subdomain of LMS_HOST?

fghaas avatar Jan 19 '22 20:01 fghaas

Tutor could maybe simply flag an error if PREVIEW_LMS_HOST isn't a subdomain of LMS_HOST?

Since I expect it'll be some time before the preview situation is improved (by changing edx-platform to use OAuth, or switching to a querystring param, or whatever), I like @fghaas 's suggestion here to just alert the user when the preview domain isn't a subdomain of LMS's domain.

I don't know whether there's prior art for that sort of thing in Tutor; if not, we could simply raise an ImproperlyConfigured exception from edx-platform's lms/envs/production.py.

kdmccormick avatar Jan 20 '22 02:01 kdmccormick

Another option would be for Tutor to allow overriding SESSION_COOKIE_DOMAIN specifically, so that users who (a) absolutely don't want PREVIEW_LMS_HOST to be a subdomain of LMS_HOST, and (b) understand the risk of setting an overly broad cookie domain, and (c) want to go ahead anyway, could do so.

fghaas avatar Jan 20 '22 08:01 fghaas

Your analysis of the situation is 100% correct @fghaas. I just want to point out that it's possible to override the SESSION_COOKIE_DOMAIN setting with a minimal plugin.

I've been meaning for some time to perform some configuration checks as part of tutor config save. It would definitely make sense to verify that the preview domain name is correctly set.

regisb avatar Jan 20 '22 10:01 regisb

I'm not sure how we're able to override SESSION_COOKIE_DOMAIN using a plugin now, as I don't see a patch for that in the template file.

iamCristYe avatar Jan 22 '22 14:01 iamCristYe

The SESSION_COOKIE_DOMAIN should not be overwritten in the *.env.json files but in the Python settings, with the following "openedx-lms-common-settings" patch:

SESSION_COOKIE_DOMAIN = ".{{ LMS_HOST|common_domain(PREVIEW_LMS_HOST) }}"

regisb avatar Jan 27 '22 09:01 regisb

Hello there, I am coming here, becaues I think this issue is relevant not only for PREVIEW_LMS_HOST but also for mfe MFE_HOST[^1].

I am interested in this because I am builiding a spesfic plugin[^2], of which it would used cloudflare tunnel, the thing is cloudflare has a restriction of it's (proxy/free ssl) featuer that is you need to be on a premium plan so they can cover two level subdomain[^3], i.e. two.one.example.com.

Reading through the comments, it seems that setting the cookie domain to be the parent is discouraged, that being said, is possible to get more input on this?,as I intent to suggest doing that to mitigate the above caveta.

Checking my cookies in dev tools on edx.org, it seems edx.org is setting the domain .edx.org for at least a dozen of cookies including the one refered above[^1]. Secondaly I went through RFC for cookies[^4], And my conclusion is that it's discourged to run two untrusting services on same machine/host, and there always going to be a risk for cookies secuirty if DNS service is untrusted.

In any case I would link to this issue, and mention the cloudflare free plan and Open edX integration as a cavete. And thus leave to the opreator best judement, and of course your input is most welcomed.

[^1]: This because ALL mfes are using JWT cookie for auth service, which is set by the LMS also people have report it in the forum. [^2]: https://github.com/openedx/platform-roadmap/issues/169#issuecomment-1587442906
[^3]: See limitation of unverisal ssl (free plan). https://developers.cloudflare.com/ssl/edge-certificates/universal-ssl/limitations/ [^4]: IETF on Cookies Mainly section 8 https://datatracker.ietf.org/doc/html/rfc6265#section-8

ghassanmas avatar Jun 20 '23 15:06 ghassanmas

Here's what I suggest:

  1. Create a CONFIG_LOADED action in tutor.hooks.catalog.
  2. This action should be called at the end of tutor.config.load_full. It should only be passed deepcopy(config), and not config, such that users cannot modify the full config via this action.
  3. Add a callback that checks whether PREVIEW_HOST.endswith(¨."+ LMS_HOST) (same for MFE_HOST). If not, print a warning.

Who wants to open a PR?

regisb avatar Jun 27 '23 10:06 regisb

If no one has taken charge of this issue, I am fully prepared and capable of making sure it is taken care of.

CodeWithEmad avatar Aug 21 '23 14:08 CodeWithEmad

Please do @CodeWithEmad :)

regisb avatar Aug 22 '23 07:08 regisb

So sorry for the late response.

This action should be called at the end of tutor.config.load_full

@regisb Calling the action at the end of full load is not the best option since it's running twice on tutor config save.

image

once in: https://github.com/overhangio/tutor/blob/00c58e7a9ed143f24e9cac3bbac3287d891d9d3a/tutor/commands/config.py#L150 and again in: https://github.com/overhangio/tutor/blob/00c58e7a9ed143f24e9cac3bbac3287d891d9d3a/tutor/commands/config.py#L186 we can either change where the action is called or modify the commands.config.save function. your call.

also, checking the MFE_HOST being a subdomain of LMS_HOST should occur here or in mfe plugin?

CodeWithEmad avatar Sep 17 '23 11:09 CodeWithEmad

we can either change where the action is called

Yes, can we move it to config.load?

also, checking the MFE_HOST being a subdomain of LMS_HOST should occur here or in mfe plugin?

In the tutor-mfe plugin.

regisb avatar Sep 21 '23 10:09 regisb

can we move it to config.load?

Well, config.load is called in many places, and therefore, on each command like start, stop launch, etc. we get the warning. is this something we want? what about tutor.env.save? this is called mostly on the upgrades and at the end of the config.save command.

There are a couple of other things I'd like to address here:

  • example in the actions.py: https://github.com/overhangio/tutor/blob/c594817743a853a1718c88c728290e709e60752c/tutor/core/hooks/actions.py#L81 shouldn't we pass an int as the priority? something like this: @my_action.add(priority=10)

  • The default LMS_HOST value in the tutor/templates/config/defaults.yml is www.myopenedx.com. I know this will be changed on the first launch, but before that, on the first save, it will make other domains invalid like this: CMS_HOST: studio.www.myopenedx.com. should we consider dropping the www. part?

  • Currently, there is no domain validation check in place. This means that when running the command tutor config save -s LMS_HOST=NotValidAddress, no error or warning is raised. Is this intentional?

CodeWithEmad avatar Sep 23 '23 07:09 CodeWithEmad

Sorry, moving to config.load was a terrible idea. Instead, I suggest the following;

  1. Remove the line config_full = tutor_config.load_full(context.root). This variable is only used with --append.
  2. Modify the following part of the save command:
if append_vars:
    config_defaults = tutor_config.load_defaults() # this new function should be implemented
    for key, value in append_vars:
            if key not in config:
                config[key] = config.get(key, config_defaults.get(key, []))
                ...
  1. Implement the load_defaults function.

Does that make sense?

shouldn't we pass an int as the priority?

The priority argument is optional. In our case I don't think we need to specify one -- but maybe I'm wrong about that? Let's try to implement the MFE feature without the priority arg and see if it works.

should we consider dropping the www. part?

I'd rather avoid that.

Currently, there is no domain validation check in place. (...) Is this intentional?

No. Well, yes it's intentional, but it's not great :) I've been meaning to add a check in the past, but did not because there was no elegant way of doing it. After you've created the new action, we will be able to fix this. In particular, I'd like to avoid IP addresses, "localhost" and "mydomain:" values, which cause considerable confusion among users.

regisb avatar Sep 25 '23 09:09 regisb

I might be a little off on this one. let's discuss more on the PR.

CodeWithEmad avatar Sep 25 '23 12:09 CodeWithEmad