superset icon indicating copy to clipboard operation
superset copied to clipboard

[SIP] API Row Level Security: get rls by username or roles via API (tested on Apache Superset 2.1.0)

Open frlm opened this issue 2 years ago • 5 comments

SUMMARY

I have created the following SIP to give the client-side possibility to generate an embedding request by retrieving the row level securities defined on the UI page:

image

Providing either the username of the user or the role already defined on Apache Superset, API returns the list of RLS to be provided to body of endpoint /api/v1/security/guest_token/:

image

Otherwise, the client making the request must know in advance the settings provided at the user interface level. The following endpoints generate output considering the filter type (Basic / Regular) and the presence or absence of the Group Key. Right now a user without providing an rls but only the default value [ ], can display all the data within the dashboard, in my case I needed to manage the display of the type of data with respect to the user requesting the embedding. The output of request will be list of dictionaries with "clause" and "dataset" as keys, e.g:

[{"clause":"(province = 'AG')","dataset":25},{"clause":"(gender = 'boy')","dataset":2}]

If the different rls filters associated with the requested user or role have the same group key, the filters associated with the same dataset will be merged using the OR condition, e.g:

[{"clause":"(province = 'AG') OR (province = 'SI')","dataset":25}]

Proposed Change

image

I created new two endpoints:

  • /api/v1/security/get_rls_by_username/
  • /api/v1/security/get_rls_by_role/

that retrieve users, roles, tables and rls from Metadata tables (using the SQLAlchemy package), convert all filters of type Base to Regular and aggregate the filters with respect to the defined group key value. Obviously considering only the row level securities enabled on that specific user or role.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

BEFORE: no endpoints to retrieve RLS configs made by Row Level security settings page

AFTER: image

TESTING INSTRUCTIONS

Create an user with a specific role and related RLS, check if endpoints return correct output.

ADDITIONAL INFORMATION

  • [ ] Has associated issue:
  • [ ] Required feature flags:
  • [ ] Changes UI
  • [ ] Includes DB Migration (follow approval process in SIP-59)
    • [ ] Migration is atomic, supports rollback & is backwards-compatible
    • [ ] Confirm DB migration upgrade and downgrade tested
    • [ ] Runtime estimates and downtime expectations provided
  • [x] Introduces new feature or API
  • [ ] Removes existing feature or API

frlm avatar Sep 20 '23 18:09 frlm

Thank you for the PR @frlm. I'm not sure if you are aware but we have the following RLS endpoints. Are they not sufficient for your use case?

Screenshot 2023-09-20 at 17 27 17

michael-s-molina avatar Sep 20 '23 20:09 michael-s-molina

Thank you for the PR @frlm. I'm not sure if you are aware but we have the following RLS endpoints. Are they not sufficient for your use case?

Screenshot 2023-09-20 at 17 27 17

Hi Michael, thank you for your interest in my SIP request, I know there are new endpoints to pull RLS information via API, I had read them before implementing the two new endpoints included in this pull request. You will notice that I did not include them within the RowLevelSecurity class because they were not yet present in Apache Superset version 2.1.0. I needed this functionality in the version I am using in a specific production environment. However to explain the reasons why I think it is necessary to add these endpoints I will use /api/v1/rowlevelsecurity as an example. In particular if we read the json argument:

image

"result: [
    {
      "changed_on_delta_humanized": "string",
      "clause": "string",
      "description": "string",
      "filter_type": "regular",
      "group_key": "string",
      "id": "0",
      "name": "string",
      "roles": [
        {
          "id": "0",
          "name": "string"
        }
      ],
      "tables": [
        {
          "id": 0,
          "schema": "string",
          "table_name": "string"
        }
      ]

we can observe that there is all the information present in the RLS table plus the associated roles but during the embedding requests I observed the following shortcomings:

  • the endpoints already present allow you to extract information from the RLS table but do not help you extract the RLS for that specific username or role, generally the Client does not know which role the user is associated with.

  • If for that specific user or role there are several RLS filters of the Regular / Base type and with different group key values, it is necessary for the Client to carry out all the logical aggregation operations (AND, OR) to generate the list of dictionaries used later in the endpoint body /api/v1/security/guest_token/.


{
  "resources": [
    {
      "id": "string",
      "type": "dashboard"
    }
  ],
  "rls": [
    {
      "clause": "string",
      "dataset": 0
    }
  ],
  "user": {
    "first_name": "string",
    "last_name": "string",
    "username": "string"
  }
}

This leads to greater difficulty on the Client side in using the tool via API because it must understand and manage any updates on the Superset side on how the settings relating to Row Level Security are managed.

frlm avatar Sep 21 '23 08:09 frlm

You will notice that I did not include them within the RowLevelSecurity class because they were not yet present in Apache Superset version 2.1.0. I needed this functionality in the version I am using in a specific production environment.

In scenarios like this, the best approach is to contribute to the master version, to ensure the contribution will be available in subsequent releases, and wait for a minor version of your current release that includes the new feature. An alternative is to just implement this feature locally for your deployment but you'll need to support it when upgrading.

the endpoints already present allow you to extract information from the RLS table but do not help you extract the RLS for that specific username or role, generally the Client does not know which role the user is associated with.

If for that specific user or role there are several RLS filters of the Regular / Base type and with different group key values, it is necessary for the Client to carry out all the logical aggregation operations (AND, OR) to generate the list of dictionaries used later in the endpoint body /api/v1/security/guest_token/.

I'm pretty sure we can resolve these use cases by a combination of filters (q request parameter) and token augmentation if the required information is not there already. I'll ping some folks that know more about the embedded feature and can help you. @rusackas @lilykuang @betodealmeida @yousoph

michael-s-molina avatar Sep 21 '23 11:09 michael-s-molina

You are right, I should have released on the master branch. However, I only added two methods within the file /superset/security/api.py; there are no other changes. I await your feedback, the important thing is that filters of type Basic are transformed into Regular (otherwise they cannot be used) and also that OR aggregations are performed between different RLSs if the group key is the same.

frlm avatar Sep 21 '23 13:09 frlm

@frim this seems to have slid under the radar for quite some time. Sorry about that. If you want to rebase the PR, we'll see how it works in current Superset, and review it. Hopefully we can get this across the finish line, but in the meantime, I'll set this to Draft mode while it awaits rebase. Mark it as open for review when you think it's ready. Thanks!

rusackas avatar Aug 23 '24 18:08 rusackas

This PR hasn't seen any activity/events in roughly seven months. I'll reluctantly close it, but would love to see it reopened/revisited if anyone here has interest or bandwidth.

rusackas avatar Mar 05 '25 21:03 rusackas