
Think about authorization system and ways to abide by rules of external systems (e.g. LMS and course membership)

Open LukasKalbertodt opened this issue 9 months ago • 1 comments

It is very common for a university using Tobira to also have an LMS. In many cases, the LMS stores data like course membership (which student is signed up for which courses and what permissions they have within each course). The visibility of videos is tied to that data, e.g. some videos are only visible to users in that course. There are sometimes other factors deciding visibility, like scheduled videos (only visible in a specific time period) or videos that can only be seen after answering some quiz questions.

The LMS can of course perform arbitrary logic to decide whether a user is allowed to see a video. How to make the Opencast file-serving abide by that decision is currently discussed here. And of course, we can also talk about arbitrary external systems, not just LMSs.

A common request is that Tobira copies that exact authorization scheme, i.e.: users can access a video if and only if the LMS would also have allowed that user to access it. Doing that is far from trivial!

What follows is some abstract rambling.


Encoding authorization logic in systems

As documented, Tobira does authorization exactly as Opencast does it: to check whether user U can perform action A on element E, we check whether the ACL for action A on element E overlaps the roles of user U (i.e. if there are any roles in common). Let's call this system the "role-overlap-system". The ACLs come from Opencast and the user roles from the Tobira auth integration (that the university provides).
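
To make the role-overlap-system concrete, here is a minimal sketch of the check. The function name, the dictionary layout and the example roles are assumptions made for illustration, not Tobira's actual code.

```python
def is_allowed(acl_roles: set[str], user_roles: set[str]) -> bool:
    """Role-overlap check: allowed iff the ACL and the user share a role."""
    return not acl_roles.isdisjoint(user_roles)


# Hypothetical ACL of one event: one set of roles per action.
event_acl = {
    "read": {"ROLE_COURSE_123_Learner", "ROLE_ADMIN"},
    "write": {"ROLE_ADMIN"},
}
# Hypothetical roles handed to Tobira by the auth integration.
student_roles = {"ROLE_USER", "ROLE_COURSE_123_Learner"}

can_read = is_allowed(event_acl["read"], student_roles)
can_write = is_allowed(event_acl["write"], student_roles)
```

Note that the check only ever looks at two sets: the roles on the element and the roles of the user. That property is what the rest of this issue relies on.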

Let's take a somewhat simple example: visibility based on course membership. For example, Stud.IP modifies Opencast events to add roles such as ROLE_COURSE_123_Learner for videos that all students from course 123 are allowed to watch, and ROLE_COURSE_123_Instructor for those that only instructors of that course are allowed to watch. It does that in order to use LTI (setting the correct user roles) and let Opencast do the authorization. In the case of Tobira, the auth integration could ask Stud.IP what courses a user is part of, and then pass the corresponding ROLE_COURSE* roles to Tobira. Tobira then just does the overlap check and everything works.
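
As a rough sketch of what such an auth integration might do: it turns the course memberships reported by the LMS into ROLE_COURSE* roles. The shape of the membership data and the function name are assumptions; how Stud.IP actually exposes memberships is not covered here.

```python
def roles_from_memberships(memberships: dict[str, str]) -> set[str]:
    """Map {course_id: role_in_course} to ROLE_COURSE_<id>_<role> roles."""
    return {
        f"ROLE_COURSE_{course_id}_{role}"
        for course_id, role in memberships.items()
    }


# Hypothetical answer of the LMS for one user.
memberships = {"123": "Learner", "456": "Instructor"}
user_roles = roles_from_memberships(memberships)

# A video restricted to instructors of course 123 stays hidden,
# one for learners of course 123 is visible.
instructor_only_acl = {"ROLE_COURSE_123_Instructor"}
learner_acl = {"ROLE_COURSE_123_Learner"}
sees_instructor_video = not instructor_only_acl.isdisjoint(user_roles)
sees_learner_video = not learner_acl.isdisjoint(user_roles)
```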

Note what we did here: we encoded the "course membership" logic inside the role-overlap-system. But of course that system is somewhat limited and not all possible logic can be represented by it. The example "visible after answering a quiz question" probably also counts as encodable, as answering the question could just give the user an additional role ROLE_CLEVER_GIRL and the video could just have that role in its ACL.

But the "only make video visible in a time period" example is different: the role-overlap-system is not powerful enough to encode that logic. Well, you might say: some system can just add/remove roles to the video's ACL at fixed points in time. Sure, there is sync delay here, but that's also the case for "course membership" and "question answered". However, the timing example is still distinct IMO:

  • It's truly a new factor at play here. The passing of time does not intrinsically change anything about user roles nor about the event.
  • This is also shown by the fact that the external system needs to remember all roles it removed, in order to re-add them to the ACL later.
  • Finally, if the external system goes down, this breaks down. In the course-membership and question-answered examples, if the external system (the LMS) goes down, the course-membership and question-answered state cannot change! Thus, Tobira just evaluating its role-overlap-system is still completely correct.

If you'd argue that the timing-example is encodable in the role-overlap-system, then in fact everything is. Because your external system can just evaluate arbitrary logic, give every user just a single unique role, and populate videos' ACLs by all user roles that are allowed to see it. And do that every second to also cover the time aspect.
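
The degenerate encoding described above can be sketched like this. Everything here is hypothetical (the per-user role naming, the `may_see` callback standing in for the external system's arbitrary logic); the point is only that the ACLs end up listing every allowed (user, video) pair and have to be recomputed whenever time passes.

```python
from typing import Callable


def materialize_acls(
    users: list[str],
    videos: list[str],
    may_see: Callable[[str, str, int], bool],
    now: int,
) -> dict[str, set[str]]:
    """Give each user one unique role and put it on every video they may see."""
    return {
        video: {f"ROLE_USER_{user}" for user in users if may_see(user, video, now)}
        for video in videos
    }


users = ["alice", "bob", "carol"]
videos = ["v1", "v2"]


# Arbitrary external logic: before t=10 only alice may watch, afterwards everyone.
def rule(user: str, video: str, now: int) -> bool:
    return now >= 10 or user == "alice"


acls_early = materialize_acls(users, videos, rule, now=0)
acls_late = materialize_acls(users, videos, rule, now=10)
total_entries_late = sum(len(roles) for roles in acls_late.values())
```

With the late rule, the ACLs hold |users| * |videos| entries, and the external system has to rerun `materialize_acls` constantly to keep them correct.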

Why Tobira needs a system that can be evaluated locally

Being disappointed in the limitations of Tobira's system, one might say: why doesn't Tobira ask the external system every time it needs to answer an authorization question? Via API for example? The LMS would get the user and video ID and just reply "yes" or "no".

Because of speed essentially. Tobira needs to answer authorization questions a lot. Just visit its start page and for every video in every series on that page, Tobira needs to know whether the current user is allowed to see it. Asking the LMS every time (even using batching) would cause lots of stress on the LMS system and would notably increase delay and decrease responsiveness of Tobira. But it gets worse! Two examples:

  • When searching in Tobira, we use MeiliSearch. Of course, only search results that the user is allowed to see are shown. So we again need to evaluate authorization for lots of videos, on every keystroke. But worse: we really want Meili to perform the authorization check for us! Our search page wants to show N items, so we request that many items from Meili. If we then filter for visibility in our backend code, we might end up with fewer than N items. So we have to request more from Meili again, potentially many, many times.
  • The same with "my videos": a paginated long list of all videos that the user has write access to. Here, too, we don't want to evaluate the visibility in the backend code, but in the database directly.
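
The pagination problem from the two examples above can be simulated in a few lines. This is a toy model, not Tobira code: `fetch_page` stands in for a request to the search engine or database, and the index contents are made up so that only every tenth document is visible to the user.

```python
def fetch_page(index: list[dict], offset: int, limit: int) -> list[dict]:
    """Stand-in for one request to the search engine (no visibility filter)."""
    return index[offset:offset + limit]


def visible_page_postfilter(
    index: list[dict], user_roles: set[str], n: int
) -> tuple[list[dict], int]:
    """Collect n visible items by filtering in the backend.

    Returns the items and the number of requests that were needed.
    """
    items: list[dict] = []
    offset = 0
    requests = 0
    while len(items) < n:
        batch = fetch_page(index, offset, n)
        requests += 1
        if not batch:
            break  # index exhausted
        items += [d for d in batch if not d["read_roles"].isdisjoint(user_roles)]
        offset += len(batch)
    return items[:n], requests


# 100 hypothetical documents, every tenth one readable by course 123.
index = [
    {"id": i, "read_roles": {"ROLE_COURSE_123_Learner"} if i % 10 == 0 else {"ROLE_OTHER"}}
    for i in range(100)
]
items, requests = visible_page_postfilter(index, {"ROLE_COURSE_123_Learner"}, n=5)
```

In this toy setup, filling a page of 5 items takes 9 requests. If the search engine applied the visibility filter itself, a single request would do.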

In both cases we manage to make Meili/Postgres perform the check for us by using filter functionality they offer. In both cases it is absolutely infeasible to contact an external service to answer the authorization question. Also note that we are limited by what Postgres/Meili can do! Postgres is pretty powerful but for Meili, we are already constructing a query with as many "OR" terms as user roles, which seems a bit hacky. Some logic is just not expressible in Meili's query language. Technically we are not restricted by Meili/Postgres as we can still perform the check in the backend if it's fast enough. That would still be undesirable and slower due to the pagination issues mentioned.
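
For illustration, building such an "OR" filter could look roughly like the following. The field name `read_roles` and the function are assumptions, not Tobira's actual implementation; the filter syntax (`attribute = "value"` terms joined by `OR`) follows Meilisearch's filter expression language.

```python
def meili_read_filter(user_roles: set[str], field: str = "read_roles") -> str:
    """Build a filter with one OR term per user role."""
    if not user_roles:
        # An empty filter string would mean "no restriction", so refuse it.
        raise ValueError("user must have at least one role")

    def quote(value: str) -> str:
        # Escape backslashes and double quotes inside the quoted value.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    # Sorted only to make the output deterministic.
    return " OR ".join(f"{field} = {quote(role)}" for role in sorted(user_roles))


filter_expr = meili_read_filter({"ROLE_USER", "ROLE_COURSE_123_Learner"})
```

On the Postgres side, the same overlap could be expressed with the array overlap operator `&&`, assuming the roles are stored in an array column.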

Also remember that from the very beginning, a requirement of Tobira was to keep working even when all other systems (except the file server) are down.

This should also make it clear why the solution we will come up with here won't work for Tobira.

So for Tobira to work fast and well, we need to be able to encode the authorization logic in a system that can ideally be evaluated by Meili and Postgres. Or at the very least locally in the backend. Further, it's probably useful to demand that the visibility question can be answered using only: data associated with the user in question, data associated with the resource (e.g. video) in question, and some global factors. I.e. it does not depend on other resources or other users. Speaking very loosely, we want to only store O(|users| + |resources|) and not O(|users| * |resources|).

Use a more powerful system?

As said above, we are somewhat limited by Postgres and Meili. But we could still offer a more powerful system, e.g. by adding the time component. We could just decide that a particular metadata field on events can be used to specify time periods in which the video is visible. While that would make our code more complicated, it's still possible to implement that. (Meili has no now(), but calling now() in backend code and passing it to Meili is absolutely good enough.)
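
A sketch of what the extended check could look like when evaluated locally. The `visible_from`/`visible_until` parameters are a hypothetical stand-in for whatever metadata field would be chosen; the current time is passed in as an argument, mirroring the idea of calling now() in the backend and handing the value on.

```python
from datetime import datetime, timezone
from typing import Optional


def is_visible(
    acl_roles: set[str],
    user_roles: set[str],
    visible_from: Optional[datetime],
    visible_until: Optional[datetime],
    now: datetime,
) -> bool:
    """Role overlap plus an optional visibility window [visible_from, visible_until)."""
    in_window = (visible_from is None or visible_from <= now) and (
        visible_until is None or now < visible_until
    )
    return in_window and not acl_roles.isdisjoint(user_roles)


acl = {"ROLE_COURSE_123_Learner"}
roles = {"ROLE_USER", "ROLE_COURSE_123_Learner"}
start = datetime(2023, 10, 1, tzinfo=timezone.utc)
end = datetime(2023, 11, 1, tzinfo=timezone.utc)

during = is_visible(acl, roles, start, end, now=datetime(2023, 10, 12, tzinfo=timezone.utc))
after = is_visible(acl, roles, start, end, now=datetime(2023, 11, 2, tzinfo=timezone.utc))
```

The check still only depends on data of the user, data of the resource and one global factor (the current time), so it stays within the constraints described in the previous section.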

Ideally (IMO), such a change should also happen in Opencast then, to keep Tobira's and Opencast's system the same.

Whether we should make the system more powerful depends on how many people are interested in it. At least the ETH mentioned time-based visibility a few times.

LukasKalbertodt avatar Oct 12 '23 17:10 LukasKalbertodt