Complete redo of permissions in Redash

Open arikfr opened this issue 5 years ago • 37 comments

This is a first draft describing this project. We invite interested parties to share their thoughts. As planning progresses, I will update the issue description and add relevant sub-issues for each step.


Up until now, access to data in Redash was governed by access to the data source which the data came from. It made sense as a good default, especially if we want to promote data democracy. However, it introduces many challenges in less trivial use cases, for example:

  • You want to share a high-level revenue KPI dashboard with everyone, but give access to the raw data only to a limited group of people.
  • You want to share a few dashboards with your investors without giving them access to all your data.

Today, you can share a single query using the embed link or a whole dashboard using the share link. This is currently limited to queries/dashboards without parameters, because running a query with parameters requires access to run any query on the underlying data source.

The goal of this project is to redo our permissions model and make parameter execution safe, so that users can share a query or dashboard with anyone they choose, internal or external.

This is a huge change in how Redash works and will be implemented in steps:

  1. Add support for "feature detection", to allow for the application to decide whether a specific object can be shared or not. We will start with disabling current sharing options (embed/shared link) when the object uses parameters.
  2. Add support for safe execution of parameters: #2904
  3. Add query results API that is referenced through the query itself.
  4. Add support for sharing queries/dashboards with other users, through the current permissions dialog.
  5. Add support for a public link/access to an object, which will replace the current share/embed features in queries/dashboards.
  6. Add an API to create shareable links for objects with some parameters predefined (to allow Embedded Analytics use cases).
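For illustration, step 1 ("feature detection") could be sketched roughly as below. This is only a minimal sketch and not Redash's actual code: the `Query` and `Dashboard` classes and both function names are hypothetical stand-ins, and the only rule encoded is the one stated in step 1 (sharing is disabled when the object uses parameters).

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    # Hypothetical stand-in for a Redash query; not the real model.
    id: int
    text: str
    parameters: list = field(default_factory=list)


@dataclass
class Dashboard:
    # Hypothetical stand-in: a dashboard is reduced to the queries behind its widgets.
    id: int
    queries: list = field(default_factory=list)


def can_share_query(query: Query) -> bool:
    """A query can be shared only if it declares no parameters."""
    return len(query.parameters) == 0


def can_share_dashboard(dashboard: Dashboard) -> bool:
    """A dashboard can be shared only if every query on it can be shared."""
    return all(can_share_query(q) for q in dashboard.queries)
```

Once parameters can be executed safely (step 2), a check like `can_share_query` could presumably be relaxed to reject only parameters that are still considered unsafe.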

Some things to consider:

  • What happens with an object that was shared already when you add a parameter?
  • Add a concept of workspaces/teams instead of groups? So anything you create in this workspace has some default permissions? Relevant: #2397.
  • Until now you had to create data sources of type Query Results, URL, CSV, etc. for the purpose of managing permissions. But once we have permissions decoupled from data sources, we can have all these data source types as built-in types, so you can use them to load data into Redash and then decide who you share it with. (related discussion)
  • #1909 is somewhat related here.

arikfr avatar Jan 15 '19 10:01 arikfr

I am interested. I have a problem with access to dashboards for viewing. I would like to have access to such dashboards and to such requests only from the assigned group of users.

I propose to introduce collections which may include dashboards, queries, tags, etc. These collections can be assigned to groups of users who see only them and nothing else.
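To make the idea concrete, here is a rough sketch of how collections assigned to groups might resolve to what a user can see. It is only an illustration under assumptions: the `Collection` class, its fields, and `visible_dashboards` are hypothetical names and do not exist in Redash.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical model: a named bundle of objects plus the groups it is assigned to.
    name: str
    dashboard_ids: set = field(default_factory=set)
    query_ids: set = field(default_factory=set)
    group_ids: set = field(default_factory=set)


def visible_dashboards(user_group_ids, collections):
    """Return the ids of dashboards in any collection assigned to one of the user's groups."""
    user_groups = set(user_group_ids)
    visible = set()
    for collection in collections:
        # A collection is visible if it shares at least one group with the user.
        if collection.group_ids & user_groups:
            visible |= collection.dashboard_ids
    return visible
```

A user whose groups match no collection would see nothing, which matches "only them and nothing else".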

ice2038 avatar Jan 15 '19 13:01 ice2038

Great! Today I need to publish some results/insights for specific users, who should not access the queries or data sources. Will it be possible to create groups with more restrictive permissions? What do you think about the granularity of roles/permissions? Thanks!

bnopacheco avatar Jan 16 '19 03:01 bnopacheco

Version 1. [Image: Access-Version-1]

ice2038 avatar Jan 17 '19 08:01 ice2038

Thanks, @ice2038. I'm planning to do a deeper dive into the planning of this in the coming days, and I will take your suggestion into account. I will post follow-ups here.

@pachecobruno if I understand your use cases correctly, they will be addressed by the work on this.

arikfr avatar Jan 20 '19 09:01 arikfr

Hi, do you have an ETA for when this will fix the issue "visualization with parameters could not be shared" (#2377)?

P.S. Just in case, here is an example of this error: https://redash-stage.mrshoebox.com/embed/query/3/visualization/7?api_key=0DgRNn12On9zM4MVkLXmxiAUsQ0tDhnHFlbsjnyl

and the same visualization, but with the parameter value hard-coded in the query: https://redash-stage.mrshoebox.com/embed/query/5/visualization/10?api_key=9kVjy030spYDgL1Eh77f1ckfr2h0TSK2FR1FF3wh

rredkovich avatar Feb 06 '19 09:02 rredkovich

@rredkovich this is something we're working on at the moment (have a look at the parameter safety PRs) so we hope it won't be too long before you could share visualizations with parameters.

rauchy avatar Feb 06 '19 12:02 rauchy

+1 For being able to use query_ and cached_query_ when the underlying query is parameterized.

Also, regarding permissions (aka policy management, authorizations):

I've recently been hunting around to find the best free / open source solutions, with a particular focus on Python implementations, but ones that support multiple languages, platforms, etc. What I found to be the most promising options were implemented in Go (except one), which I suppose makes sense.

Open Policy Agent seems to be coming into its own and has an impressive list of adopters. Authorizations happen over REST. Policies are written in Rego. OPA is a project supported by the Cloud Native Computing Foundation (CNCF).

Casbin appears to be another good option. Instead of authorizing over REST, Casbin provides an impressive list of SDKs for various languages. It also supports numerous policy models right out of the box (many variants of ACL, RBAC, plus ABAC, REST routes, Deny-Override, and Priority).

The OPA documentation explains how several of these can be implemented in Rego. My sense is that both systems are flexible and well thought out, with Casbin supporting numerous traditional and well-defined policy models, whereas OPA with Rego seems a bit more focused on providing a language and framework for authoring whatever policy model one wishes. Think Django vs. Pyramid.

Both have online editors (OPA, Casbin) where one can explore how policies are written and operate.

ORY Keto is another interesting option, and is a part of the ORY ecosystem. How authorizations take place is still not clear to me, but they only list a REST API for Keto (whereas they list both REST and SDK options for Hydra, a related component in their ecosystem).

Finally, if one wanted a solution primarily (only?) for Python, ziggurat_foundations is often mentioned and looks quite nice. It provides mixins for SQLAlchemy classes, which I find appealing, and I suspect that makes dealing with policies seem natural, pythonic, and easy.

Perhaps Redash could use one of these, or something similar, to more quickly, flexibly and thoroughly re-implement its permission management. Even if implementing from scratch turns out to be the preferred option, I think it'd be great to keep an eye toward offering, or at least being in a good position to offer, plugins in Redash that could authorize requests against these other solutions as well.

morsedl avatar May 10 '19 21:05 morsedl

As a data point, it would be useful for Redash to have a simple "This specific dashboard (eg one including salary details) should only be accessible to person A, B, and C".

It's possible to achieve this result (eventually) with the current approach through careful group planning.

But that's a problem when a customer has an extensive set of existing queries that would need re-doing using a different grouping model, just for one (sensitive) dashboard.

justinclift avatar Jul 03 '19 14:07 justinclift

Hi, I'm the author of Casbin. You can consider using Casbin, as it supports 8 languages, including Python, Go and JavaScript (Node.js). It supports the RBAC with domains/tenants model (which can be used to build something like an AWS cloud or GitHub).

As for PyCasbin (Python's implementation), it supports storing policy rules into databases via SQLAlchemy or Peewee, see the adapters. You don't need to handle the storage manually.

It's easy to use, as you can debug your model and policy settings in the online editor before putting them into production. It also supports distributed policy enforcement if you need it.

Let me know if you have any questions :)

hsluoyz avatar Jul 03 '19 15:07 hsluoyz

I'd like to know how this issue is progressing. Does anyone know?

piperck avatar Jul 17 '19 03:07 piperck

I really like @ice2038's proposal, it would be very useful in our use case.

It would also be nice for us to split edit permissions into query edit and results update permissions, so that a user could refresh existing queries without changing the underlying code or creating new ones.

gseva avatar Aug 13 '19 21:08 gseva

What's the status of this issue? I'm waiting to implement redash for our organization but need this enhancement (similar to what @ice2038 proposed) first.

lsmoker avatar Oct 30 '19 14:10 lsmoker

+1

simzen85 avatar Nov 19 '19 10:11 simzen85

This would be hugely valuable. Even just simple dashboard-level permissions would help a ton so that certain dashboards can be restricted to certain groups of users.

daniellangnet avatar Jan 02 '20 02:01 daniellangnet

+1

qmgeng avatar Jan 07 '20 02:01 qmgeng

+1

Vitorjardim avatar Jan 07 '20 16:01 Vitorjardim

@arikfr Any update with the progress of this issue? We're looking forward to the new & improved Redash user access controls / permissions.

gilbzs avatar Jan 20 '20 04:01 gilbzs

+1

TimothyZhang avatar Feb 11 '20 04:02 TimothyZhang

+1

mikkojirnexu avatar Feb 28 '20 09:02 mikkojirnexu

Like @gseva, we came across the need to separate edit permissions from refresh permissions. We need to be able to restrict some users from refreshing (slow) queries so they are forced to use the scheduled results. We also need a separate permission to let some users use dangerous parameters without giving them full edit permission. Obviously this has security implications, but we can decide internally, on a per-user basis, whether a user is likely to abuse it or even has the knowledge to.
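As a rough illustration of what such separated permissions could look like, here is a minimal sketch using Python's standard `enum.Flag`. The permission names (`VIEW`, `REFRESH`, `EDIT`, `UNSAFE_PARAMETERS`) and the `allowed` helper are hypothetical and are not taken from Redash.

```python
from enum import Flag, auto


class QueryPermission(Flag):
    # Hypothetical per-query permissions; each one can be granted independently.
    VIEW = auto()               # see the query and its cached/scheduled results
    REFRESH = auto()            # trigger a new execution
    EDIT = auto()               # change the query text
    UNSAFE_PARAMETERS = auto()  # run with free-text ("dangerous") parameters


def allowed(granted: QueryPermission, required: QueryPermission) -> bool:
    """True if every permission in `required` is present in `granted`."""
    return (granted & required) == required


# A user limited to scheduled results, and a trusted user who may refresh
# and use unsafe parameters but still cannot edit the query.
scheduled_only = QueryPermission.VIEW
trusted_viewer = (
    QueryPermission.VIEW | QueryPermission.REFRESH | QueryPermission.UNSAFE_PARAMETERS
)
```

With this shape, `allowed(scheduled_only, QueryPermission.REFRESH)` is false, so that user is limited to the scheduled results, while `trusted_viewer` can refresh and pass unsafe parameters without being able to edit.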

lolaslade avatar Feb 28 '20 16:02 lolaslade

What is the status of this?

Jensen3131 avatar May 14 '20 08:05 Jensen3131

We also very much need this, as we have sensitive data that cannot be shared across departments. For most of our reports, we want to show only totals without revealing the actual data. For example, we want to show the total number of email addresses without revealing the underlying email addresses. This feature will make this easy, rather than having to think about provisioning various tables/views and access points to the database.

In short, we want various users to be able to see the dashboard they are assigned, without the ability to access the underlying data.

corkub avatar Aug 04 '20 01:08 corkub

Since I use Redash, Power BI, and other BI tools in different projects, and I'm also a product owner at a tech company, I think we should look at the whole process and handle permissions at each step. Please see my proposal below.

[Image: redash_permissions_idea]

As for the data structure, I propose the following. This is only one version of the permission model. I'm neutral about the inverse model (permissions on objects attached to the user/role instead of to the object).

[Image: redash_permission_objects]

There will be a special role called "anonymous", to enable public, not logged-in users to view the dashboard.

There is a very good feature in Power BI that I borrowed here: Row Level Security. If you cache your query and always visualize data based on the cached query result (dataset), you can use a native Redash query to apply a pre-filter to your dataset, narrowing down the data each user can access. This way, you can overcome the fact that not every database supports native RLS in its engine.
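A minimal sketch of this kind of pre-filter, assuming the cached result is available as a list of row dictionaries and that each user carries a mapping from column name to the values they may see. The function name `apply_row_filter` and the `user_attrs` shape are hypothetical, chosen only for illustration.

```python
def apply_row_filter(rows, user_attrs, column):
    """Keep only the rows whose `column` value is among the user's allowed values.

    rows: list of dicts (assumed shape of a cached query result)
    user_attrs: dict mapping column name -> iterable of allowed values (hypothetical)
    """
    allowed_values = set(user_attrs.get(column, []))
    # A user with no entry for this column sees no rows (deny by default).
    return [row for row in rows if row.get(column) in allowed_values]


cached_rows = [
    {"department": "sales", "revenue": 100},
    {"department": "hr", "revenue": 20},
    {"department": "sales", "revenue": 50},
]
sales_user = {"department": ["sales"]}
sales_view = apply_row_filter(cached_rows, sales_user, "department")
```

Because the filter runs on the cached dataset rather than in the database, it works the same way regardless of whether the underlying engine has its own RLS.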

Also, on the topic of predefined permissions, we can create a permission_preset object to contain a list of roles or permissions. By doing this, we can also create a custom role by assigning individual permissions to that role. One thing I learned from other big players is inheritance, where we can create a sub-permission that inherits its parent's permissions and adds other specific permissions.
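One possible way to picture presets with inheritance, as a sketch only: each role names an optional parent and its own permissions, and the effective set is collected by walking up the chain. The role names, permission strings, and `effective_permissions` helper are all hypothetical.

```python
# Hypothetical presets: each role adds permissions on top of its parent's.
ROLES = {
    "viewer": {"parent": None, "permissions": {"view_dashboard"}},
    "analyst": {"parent": "viewer", "permissions": {"refresh_query"}},
    "editor": {"parent": "analyst", "permissions": {"edit_query"}},
}


def effective_permissions(role, roles=ROLES):
    """Collect a role's own permissions plus everything inherited from its ancestors."""
    permissions = set()
    seen = set()
    while role is not None:
        if role in seen:
            # Guard against a misconfigured inheritance cycle.
            raise ValueError(f"inheritance cycle at role {role!r}")
        seen.add(role)
        permissions |= roles[role]["permissions"]
        role = roles[role]["parent"]
    return permissions
```

A custom role is then just another entry in the mapping with whichever individual permissions and parent it needs.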

chulucninh09 avatar Aug 05 '20 08:08 chulucninh09

Hi @arikfr. We depend a lot on a fix for "View-only users cannot execute queries with parameters" (#1163), but it seems the task is not in focus. Is there any chance that work on it will start in the near future?

bodnari avatar Mar 23 '21 10:03 bodnari

Is this still being worked on? kind of looks dead..

jitendra-koodo avatar Dec 15 '21 18:12 jitendra-koodo

Not at all dead @jitendra-koodo, though it has been dormant for some time. I'm beginning the work on this project within the next week.

susodapop avatar Dec 15 '21 22:12 susodapop

@susodapop really? Redash not being dead would be amazing news. It's been very quiet these past 12+ months

daniellangnet avatar Dec 15 '21 22:12 daniellangnet

@daniellangnet not at all dead. V10.1 just released a couple weeks ago and we're planning out the big features for V11 now. Thanks for keeping an eye here. I know it was pretty quiet for awhile. 🙏

susodapop avatar Dec 15 '21 22:12 susodapop

@susodapop very glad to hear! Redash is still IMO the best SQL-focused BI tool there is. I've already submitted a couple of PRs (the custom viz has been reviewed & approved but never merged, so I assumed the open source part of this project is on ice). Let me know if there's any way we can be a supporter now that you're part of Databricks

daniellangnet avatar Dec 15 '21 23:12 daniellangnet

so I assumed the open source part of this project is on ice)

@daniellangnet I'm sorry this has been such a reasonable assumption in recent months. It comes down to limited resources while winding down the hosted offering. But we're back on at full steam for OSS development. Your PR is on my list to review (along with another ten in the next week or so). And of course we're setting up plans for larger projects (like permissions) in the new year.

susodapop avatar Dec 16 '21 01:12 susodapop