hyades icon indicating copy to clipboard operation
hyades copied to clipboard

Project-level RBAC

Open mfrystacky opened this issue 11 months ago • 35 comments

Current Behavior

Roles currently do not exist in Hyades/Dependency Track. The existing permission model composes of individual permissions aggregated at a global level in the jwt claim and teams as a mode for having a select set of permissions against a set of projects for a set of users.

Proposed Behavior

Hi friends, We are looking into adding roles into the alpine framework and into Hyades as an expansion and enrichment of the permission system. I don't think anyone else is working on this or the permission system so we shouldn't be clobbering anyone's work in progress, but let me know if that isn't the case. We have a new team, and they are exploring how to best approach this without making major revisions to the existing permission model, so we will be updating this as we are moving along and creating PR's of the incremental work moving forward.

Checklist

mfrystacky avatar Jan 22 '25 18:01 mfrystacky

Hey @mfrystacky, I think adding the concept of roles makes sense. I think teams were somewhat meant to implement RBAC-like capabilities, but of course this model doesn't scale that well if you're having loads of teams.

I don't think anyone else is working on this or the permission system so we shouldn't be clobbering anyone's work in progress, but let me know if that isn't the case.

There is https://github.com/stevespringett/Alpine/pull/674 which aims to make it possible to assign permissions to individual API keys. But I doubt your proposed changes will have any overlap there, so you should be good.

We have a new team, and they are exploring how to best approach this without making major revisions to the existing permission model, so we will be updating this as we are moving along and creating PR's of the incremental work moving forward.

Awesome! Let us know if you need anything!

nscuro avatar Jan 24 '25 09:01 nscuro

@nscuro For additional context, I wanted to lay out the ultimate goal we are trying to accomplish. I can create a separate issue if that would be more appropriate.

  • Use a GitLab instance as OIDC provider (already working)
  • Get the user's group memberships within GitLab and corresponding effective access levels (roles such as owner, maintainer, developer) for each of these groups using the GitLab API
    • This requires a combination of info from the /oauth/token and /oauth/userinfo endpoints that is currently not captured by the OidcAuthenticationService, and some of which is specific to GitLab
  • Add a DT project for each GitLab group (if one doesn't exist already)
  • Add a DT team mapping the OIDC group to user's effective access/role within GitLab, e.g. <gitlab project path>-maintainer (the sort of RBAC functionality we're looking for)

We are kicking around the idea of creating a GitLab integration within Hyades. Here is what we have so far (link), just don't pay too much attention to the current code within org.dependencytrack.integrations.gitlab. That was mostly me learning Java and getting familiar with the QueryManager, etc. Does this use case seem like something that an integration can fulfill?

One issue we're running into is that the privacy of OidcAuthenticationService and its members make it difficult to extend it without copy-pasting a bunch of code. Is exposing some of those private classes/interfaces as public something we could consider, or would it break the design philosophy?

/cc @pkwiatkowski1 @ashearin @Strakeln @lmphil @jmayer-lm @EphraimEM

jhoward-lm avatar Jan 28 '25 15:01 jhoward-lm

Get the user's group memberships within GitLab and corresponding effective access levels (roles such as owner, maintainer, developer) for each of these groups using the GitLab API

  • This requires a combination of info from the /oauth/token and /oauth/userinfo endpoints that is currently not captured by the OidcAuthenticationService, and some of which is specific to GitLab

[...]

One issue we're running into is that the privacy of OidcAuthenticationService and its members make it difficult to extend it without copy-pasting a bunch of code. Is exposing some of those private classes/interfaces as public something we could consider, or would it break the design philosophy?

Depending on what exactly you need from OidcAuthenticationService, I would recommend to look into ServiceLoader. You could define a generic "customizer" or "claims extractor" interface in Alpine, and have OidcAuthenticationService discover implementations of it using ServiceLoader. Dependency-Track can then provide implementations of the interface to achieve the desired outcome. This keeps main logic in Dependency-Track.

Alpine has something similar for customization of the MeterRegistry already:

  • https://github.com/stevespringett/Alpine/blob/38cdda1eea6e1bed45ae7e26e305338cd5ca75be/alpine-common/src/main/java/alpine/common/metrics/Metrics.java#L48-L52
  • https://github.com/DependencyTrack/hyades-apiserver/blob/main/src/main/java/org/dependencytrack/observability/MeterRegistryCustomizer.java
  • https://github.com/DependencyTrack/hyades-apiserver/blob/main/src/main/resources/META-INF/services/alpine.common.metrics.MeterRegistryCustomizer

The main chunk of work is coming up with a good interface contract that perhaps allows for more than just GitLab to integrated in a similar fashion. Do you think this could work for your case?

Does this use case seem like something that an integration can fulfill?

An asynchronous task (or workflow in the future) should be able to achieve this.

However, IIUC, you need the users' GitLab access token when performing the synchronization? In that case, what you could do is:

  1. Implement a customizer as mentioned above to acquire the additional data you need (i.e. role).
  2. In that customizer, kick off a GitLabSyncEvent.
    • Extend GitLabSyncEvent to take the user's GitLab access token.
    • Use DataEncryption to encrypt the token if the event has to be transmitted over the wire.
  3. Perform the synchronization logic in the task that handles GitLabSyncEvents.

For completeness, we also have a bare-bones plugin mechanism you could look into.

Example for pluggable file storage:

  • https://github.com/DependencyTrack/hyades-apiserver/tree/workflow-v2/src/main/java/org/dependencytrack/storage
  • https://github.com/DependencyTrack/hyades-apiserver/blob/202e453293c18b3b767591307215e83e6b2ef43c/src/main/java/org/dependencytrack/resources/v1/BomResource.java#L582-L584

The intention is to eventually publish the plugin API to Maven Central, and load plugin JARs from the file system. But this could be expedited if you find it useful in this context. I figure this would be a viable option if your desired solution ends up being too specific to LM.

nscuro avatar Jan 28 '25 16:01 nscuro

@nscuro Great info, thank you! We're exploring the customizer/claims extractor route now

jhoward-lm avatar Jan 28 '25 18:01 jhoward-lm

The main chunk of work is coming up with a good interface contract that perhaps allows for more than just GitLab to integrated in a similar fashion. Do you think this could work for your case?

@nscuro Do you have any recommendations for what the type parameter to Consumer<T> should be? My initial thought was ClaimsSet but I'm not sure if that's correct

@FunctionalInterface
public interface OidcAuthenticationServiceCustomizer extends Consumer<ClaimsSet> {

}

or maybe just the AuthenticationService interface?

@FunctionalInterface
public interface OidcAuthenticationServiceCustomizer extends Consumer<AuthenticationService> {

}

Thanks for all the help and suggestions!

jhoward-lm avatar Jan 28 '25 20:01 jhoward-lm

@jhoward-lm That really depends on what specifically you want to achieve. You are not limited to Consumer either. To give you a very rough idea:

interface OidcAuthenticationCustomizer {

    // Could be necessary if you have special needs for mapping claims
    // to an OidcProfile (i.e. to extract team membership).
    //
    // https://github.com/stevespringett/Alpine/blob/38cdda1eea6e1bed45ae7e26e305338cd5ca75be/alpine-server/src/main/java/alpine/server/auth/OidcAuthenticationService.java#L129-L136
    OidcProfile createProfile(ClaimsSet claimsSet);
    
    // This decision currently drives whether only the ID token,
    // or additionally the /userinfo endpoint is used to authenticate.
    //
    // The default implementation short-circuits if the profile is
    // considered complete based on the ID token alone.
    // https://github.com/stevespringett/Alpine/blob/master/alpine-server/src/main/java/alpine/server/auth/OidcAuthenticationService.java#L210-L214
    //
    // If you need to ALWAYS call the /userinfo endpoint,
    // you obviously need to modify the decision logic.
    bool isProfileComplete(OidcProfile profile, boolean teamSyncEnabled);

    // If the service used both the ID token and the /userinfo endpoint
    // to gather profile information, you need to decide what data to keep.
    //
    // https://github.com/stevespringett/Alpine/blob/38cdda1eea6e1bed45ae7e26e305338cd5ca75be/alpine-server/src/main/java/alpine/server/auth/OidcAuthenticationService.java#L216-L223
    OidcProfile mergeProfiles(OidcProfile left, OidcProfile right);

    // Invoked by OidcAuthenticationService when the authentication succeeded.
    // Here is where your main logic could be. Since the user is authenticated
    // at this point, it's safe to kick off a reconciliation job for them in the background.
    //
    // You could pass in the ID and access token as well.
    void onAuthenticationSuccess(OidcProfile profile);

    // Just for demonstration, might not be needed.
    void onAuthenticationFailure(OidcProfile profile, Exception cause);

    // For conflict resolution, see below.
    int priority();
    
}

You could replace OidcProfile in the above with the raw ClaimsSet as well. See what works best. If you go with OidcProfile, you may need to extend that class to hold additional, generic values, e.g. via Map<String, Object>.

To discover and load such a customizer, you could do it as follows. Note that there can be multiple implementations, and you'll have to choose only one. Again, this is just a rough sketch of how you could do it:

// in OidcAuthenticationService#authenticate

var customizer = ServiceLoader.load(OidcAuthenticationCustomizer.class).stream()
    .map(ServiceLoader.Provider::get)
    .filter(customizer -> {
        // Here's where you could further narrow down which implementation to use.
        // i.e. if the authentication is performed with a GitLab IdP, only choose
        // customizers meant to be used with GitLab.
    })
    // If you have multiple implementations still, order them by priority,
    // and select the first of them. For example, a default implementation provided
    // by Alpine should have a lower prio than a specialized one provided by DT.
    .sorted(Comparator.comparingInt(OidcAuthenticationCustomizer::priority))
    .findFirst()
    .orElseThrow(IllegalStateException::new);

// ...

if (customizer.isProfileComplete(idTokenProfile, teamSyncEnabled)) {

// ...

Alternatively, add a Config.Key that takes the fully qualified class name of the customizer. You then do:

String customizerClassName = Config.getInstance().getProperty(Config.AlpineKey.FOO);
var customizer = ServiceLoader.load(OidcAuthenticationCustomizer.class).stream()
    .filter(provider.type().getName().equals(customizerClassName))
    .map(ServiceLoader.Provider::get)
    .findFirst()
    .orElseThrow(IllegalStateException::new);

Users can then configure this via env vars, i.e. FOO=org.dependencytrack.auth.GitLabOidcAuthenticationCustomizer.

nscuro avatar Jan 29 '25 11:01 nscuro

@nscuro This is way above and beyond the level of assistance we were hoping for, thank you so much!

If going with OidcProfile, its current access for the class is package-private and its members are either explicitly private or package-private. What are your thoughts on setting the access of the class to public and its getters/setters to either public or protected? That is in addition to adding the Map<String, Object> you suggested.

Going to explore this avenue now, thanks again!

jhoward-lm avatar Jan 29 '25 14:01 jhoward-lm

@jhoward-lm Feel free to modify visibility or other internals as necessary.

nscuro avatar Jan 29 '25 14:01 nscuro

@nscuro When you get a chance, would you mind taking a look at the changes I have so far? This more or less covers what we've talked about, but I wanted to get it in front of you early in case I misunderstood anything or it looks like there are too many touch points in the code. I tried to keep the formatter in my editor from taking liberties but there might be some whitespace changes in there; let me know if I need to revert.

A couple of highlights:

  • The methods removed from OidcAuthenticationService were moved into DefaultOidcAuthenticationCustomizer
  • OidcAuthenticationService.authenticateInternal became DefaultOidcAuthenticationCustomizer.onAuthenticationSuccess

Thanks!

jhoward-lm avatar Jan 29 '25 22:01 jhoward-lm

@nscuro Or I could open a draft PR if you prefer

jhoward-lm avatar Jan 30 '25 14:01 jhoward-lm

I did not have the chance to look at it yet, I'll try to give you feedback later today.

nscuro avatar Jan 30 '25 14:01 nscuro

@jhoward-lm This looks quite promising already. But do you need to customize large chunks of the authentication logic itself?

Otherwise I'd suggest to not move authenticateInternal to the customizer. Same for autoProvision.

Wouldn't it suffice to simply invoke customizer.onAuthenticationSuccess here (and other locations where authentication completed successfully)?

Edit: Writing this made me realize that commenting on a PR would make this easier. If you could raise one for that purpose, that would be great.

nscuro avatar Jan 30 '25 16:01 nscuro

@nscuro Those are good suggestions, I'll try to rework. Opened a draft PR with these changes. Thanks for all the help!

jhoward-lm avatar Jan 30 '25 16:01 jhoward-lm

@nscuro returning to the core task of adding roles as a way of enhancing access control to allow for project scoped permissions and avoiding globally aggregated permissions for a user. We've drawn up some preliminary diagrams to illustrate our proposed changes and would love some feedback on them.

Proposed Tables

ERD

edit: adjust layout of proposed tables to enhance readability edit2: Correct Typos in ERD

ashearin avatar Feb 17 '25 19:02 ashearin

@ashearin The model makes sense.

Few questions:

  • Do I understand it correctly that this is in addition to the existing PROJECT_ACCESS_TEAMS model, and this is solely for a more fine-grained way to assign permissions?
  • How do we compute the "effective permissions" on a project-level?
    • i.e., what if a user has PORTFOLIO_MANAGEMENT permissions globally, but only VIEW_PORTFOLIO for a given project?
  • How would you want this behave WRT project hierarchies?
    • i.e., should roles be inherited from parent projects?
    • See also https://github.com/DependencyTrack/hyades/issues/1674.

nscuro avatar Feb 18 '25 15:02 nscuro

Do I understand it correctly that this is in addition to the existing PROJECT_ACCESS_TEAMS model, and this is solely for a more fine-grained way to assign permissions?

Correct.

How do we compute the "effective permissions" on a project-level? i.e., what if a user has PORTFOLIO_MANAGEMENT permissions globally, but only VIEW_PORTFOLIO for a given project?

I'm sure Allen will have his own thoughts on it, but just my opinion here. My expectation for permissions behavior would probably be something like "closest scope wins", or "most explicit wins". i.e. in order of descending precedence:

  • User permissions
  • Team permissions
  • Role permissions

Then permissions are merged into an effective access for a given project. I would need to see how the permission levels in DT are organized and inherited, i.e. ACCESS_MANAGEMENT automatically implies VIEW_PORTFOLIO, etc.

How would you want this behave WRT project hierarchies? i.e., should roles be inherited from parent projects?

How is portfolio access control permission inheritance handled currently? My first inclination would be to align with that.

Of course, this whole effort is in support of mapping GitLab project roles to Dependency Track projects and scoped permissions, and in their model, permissions at the group level are inherited for all projects, and the highest access level of the permissions tree for the project wins.

For example, if a user is assigned Maintainer on the root group, then Owner on a subgroup, then Developer on a project within the subgroup, I believe the effective access would be Owner.

jhoward-lm avatar Feb 18 '25 16:02 jhoward-lm

I would need to see how the permission levels in DT are organized and inherited, i.e. ACCESS_MANAGEMENT automatically implies VIEW_PORTFOLIO, etc.

At the moment, ACCESS_MANAGEMENT bypasses ACL checks. Logic being that with it, you can assign any permission to yourself anyway. In a way it acts as a flag to identify admins, if you will.

This check for ACCESS_MANAGEMENT is hardcoded, which is sub-optimal. Having some generic way of defining "implies" relationship would be awesome, but might qualify as scope creep here.

How is portfolio access control permission inheritance handled currently?

Historically it did not take hierarchy into consideration, because it was too expensive to compute for many projects at once, e.g. when listing all projects. A technical limitation and a known feature gap.

I am currently in the process of changing that:

  • https://github.com/DependencyTrack/hyades/issues/1674
  • https://github.com/DependencyTrack/hyades-apiserver/pull/1060

nscuro avatar Feb 18 '25 16:02 nscuro

@nscuro I opened a Draft PR with our initial pass at implementing the diagrams we linked. Currently only adds the role and mapped role classes with needed relational tables and a migration file update.

We're still working on associated changes needed to enforce permissions scoped to the Project level, but wanted to get some feedback as we progress.

ashearin avatar Feb 24 '25 18:02 ashearin

I think the biggest problem ATM that needs solving is: How do we make access checks perform well?

What's problematic about the model proposed in https://github.com/DependencyTrack/hyades/issues/1632#issuecomment-2663960532, is that it must be computed in-memory, and involves many joins. That is doable for cases where users access a specific project, but not when we need to filter all projects down to those accessible by the user.

As discussed recently, another challenge is the disconnect between PROJECT_ACCESS_TEAMS and the new roles concept. A user could have the Developer role on a project, but no PROJECT_ACCESS_TEAMS entry - granting them "edit" permissions while being unable to even access the project.

Since we're kind of aiming to provide a GitLab-like experience, I looked at how GitLab handles this. Perhaps it helps us with discussions.

  • GitLab has a hardcoded set of roles ("access levels"). Access levels map to integers, which allows for database-level ordering. Some more general info on the access model can be found here.
  • It maintains a project_authorizations table where users are associated with projects they're allowed to access, including their access level. It is similar to DT's PROJECT_ACCESS_TEAMS table.
  • When performing a CRUD action on projects, GitLab provides a minimum access level required to perform this action. For example here, it retrieves all projects the authenticated user has MAINTAINER access level on.
  • The check above leads us to this method in its Project model:
    scope :visible_to_user_and_access_level, ->(user, access_level) { where(id: user.authorized_projects.where('project_authorizations.access_level >= ?', access_level).select(:id).reorder(nil)) }
    
    (Note how the access_level condition can make use of >=, meaning it will also pass for higher access levels such as OWNER or ADMIN)
  • Admins can define custom roles, however those roles must refer to a built-in access level. This means that access checks as shown above will always continue to work.
    • GitHub is doing the same, where custom roles must inherent from any of the built-ins, too.

In DT terms:

  • Entry in PROJECT_ACCESS_TEAMS + VIEW_PORTFOLIO permission corresponds to GitLab's GUEST role.
  • Entry in PROJECT_ACCESS_TEAMS + PORTFOLIO_MANAGEMENT permission (roughly) corresponds to GitLab's DEVELOPER role.
  • etc.

I really like the concept of predefined, sortable access levels. Storing the access level in the join table makes baseline access checks super efficient. I feel like this is a model we could take inspiration from.

WDYT?

nscuro avatar Mar 15 '25 18:03 nscuro

Seems like an interesting solution!

What's problematic about the model proposed in https://github.com/DependencyTrack/hyades/issues/1632#issuecomment-2663960532, is that it must be computed in-memory, and involves many joins. That is doable for cases where users access a specific project, but not when we need to filter all projects down to those accessible by the user.

Would careful/thoughtful design leveraging indexes or materialized views mitigate this somewhat?

As discussed recently, another challenge is the disconnect between PROJECT_ACCESS_TEAMS and the new roles concept. A user could have the Developer role on a project, but no PROJECT_ACCESS_TEAMS entry - granting them "edit" permissions while being unable to even access the project.

To solve for this, just for testing locally currently, in UserResource.validateCredentials I threw in a query to check if a user has a role on any project; if so, it adds VIEW_PORTFOLIO to the JWT permissions key. Not sure if that's a good long term solution, but is it worth considering? As far as I could tell, subsequent permissions checks went through the database after initial authorization, but I could be mistaken on that.

Entry in PROJECT_ACCESS_TEAMS + VIEW_PORTFOLIO permission corresponds to GitLab's GUEST role.

Does this mean VIEW_PORTFOLIO assigned to the user directly? Would this be as a replacement to the Role construct?

Tangentially related question: for permissions like PORTFOLIO_MANAGEMENT that are broken down into PORTFOLIO_MANAGEMENT_READ, PORTFOLIO_MANAGEMENT_UPDATE, etc, should that be separated/normalized into distinct concepts such as resource and access level? i.e. a table like RESOURCE containing entries like PORTFOLIO, VULNERABILITY, and so on, and a table like ACCESS_LEVEL that would just contain CREATE, READ, UPDATE, and DELETE? It would mean more joins but I believe the current PERMISSION table isn't considered normalized otherwise (I could be mistaken on that as well).

jhoward-lm avatar Mar 15 '25 18:03 jhoward-lm

Would careful/thoughtful design leveraging indexes mitigate this somewhat?

I mean yes, but thoughtful design is also somewhat of a wildcard 😅. IMO the benchmark for this should be listing of projects a user has access to. If this works, and performs well, all more granular checks will be trivial.

I tried my hand here: https://sqlfiddle.com/postgresql/online-compiler?id=855308a7-063a-4b42-afc4-ed9e2d1e48f7

For completeness, here's what I came up with using a correlated subquery:

-- What projects can Alice see?
select name as alices_projects
  from project
 where exists (
   select 1
     from managedusers_project_access_roles as mpar
    inner join project_access_roles as par
       on par.id = mpar.project_access_roles_id
    inner join roles_permissions as rp
       on rp.role_id = par.role_id
    inner join permission as p
       on p.id = rp.permission_id
    where p.name = 'PORTFOLIO_MANAGEMENT_READ'
      and par.project_id = project.id
      and mpar.manageduser_id = 1
);

And here's a variant without subquery:

-- What projects can Bob see?
select project.name as bobs_projects
  from project
 inner join project_access_roles as par
    on par.project_id = project.id
 inner join managedusers_project_access_roles as mpar
    on mpar.project_access_roles_id = par.id
 inner join roles_permissions as rp
    on rp.role_id = par.role_id
 inner join permission as p
    on p.id = rp.permission_id
 where p.name = 'PORTFOLIO_MANAGEMENT_READ'
   and mpar.manageduser_id = 2;

In both cases we have 3-4 joins for the access check alone.

  • The former variant can short-circuit, but is executed for every project row.
  • The latter variant could produce duplicate rows if a user has multiple roles with the requested permission.

Obviously testing with lots more data is necessary to gauge whether the performance penalty is too high, but many joins are a risk factor.

Does this mean VIEW_PORTFOLIO assigned to the user directly? Would this be as a replacement to the Role construct?

Sorry, I was just trying to roughly outline how the two approaches relate. Did not intend to give an implementation suggestion. As it stands, it would not replace the role concept entirely.

for permissions like PORTFOLIO_MANAGEMENT that are broken down into PORTFOLIO_MANAGEMENT_READ, PORTFOLIO_MANAGEMENT_UPDATE, etc, should that be separated/normalized into distinct concepts such as resource and access level?

I would not normalize just for normalization's sake. If there is no need for more attributes on resources, I'd rather use the current form and avoid more joins. If we wanted a model that enables us to query for "has user at least READ access on resource PORTFOLIO", we could achieve the same in a denormalized PERMISSION table:

id name resource access_level
1 PORTFOLIO_MANAGEMENT_READ PORTFOLIO READ
2 VULNERABILITY_MANAGEMENT_UPDATE VULNERABILITY UPDATE

access_level could even be an enum, allowing for ordering. So the first example query could become:

select name as alices_projects
  from project
 where exists (
   select 1
     from managedusers_project_access_roles as mpar
    inner join project_access_roles as par
       on par.id = mpar.project_access_roles_id
    inner join roles_permissions as rp
       on rp.role_id = par.role_id
    inner join permission as p
       on p.id = rp.permission_id
    where p.resource = 'PORTFOLIO'
      and p.access_level >= 'READ'
      and par.project_id = project.id
      and mpar.manageduser_id = 1
);

nscuro avatar Mar 15 '25 20:03 nscuro

Just saw you also mentioned materialized views. The reservation I have for them is that we'd need to refresh the entire view every time permissions, roles, role assignments etc. change. MVs don't support partial/incremental refreshes yet.

But we can use triggers and denormalization if needed. We have a pending change that does something along those lines to make project hierarchies faster to query: https://github.com/DependencyTrack/hyades/blob/issue-1699/docs/architecture/decisions/005-materialize-project-hierarchies.md

nscuro avatar Mar 15 '25 20:03 nscuro

One thing I've been attempting is to flatten the proposed tables so that the PROJECT_ACCESS_ROLES abstraction table is removed and each user type table becomes something like

CREATE TABLE IF NOT EXISTS "LDAPUSERS_PROJECTS_ROLES" (
    "LDAPUSER_ID" bigint NOT NULL,
    "ROLE_ID" bigint NOT NULL, 
    "PROJECT_ID" bigint NOT NULL
);

-- foreign key, index, constraint definitions

instead of

CREATE TABLE public."PROJECT_ACCESS_ROLES" (
    "ID" bigint NOT NULL,
    "PROJECT_ID" bigint NOT NULL,
    "ROLE_ID" bigint NOT NULL
);

-- foreign key, index, constraint definitions

CREATE TABLE public."LDAPUSERS_PROJECTS_ROLES" (
    "LDAPUSER_ID" bigint NOT NULL,
    "PROJECT_ACCESS_ROLE_ID" bigint NOT NULL
);

-- foreign key, index, constraint definitions

although I struggled to figure out the correct DataNucleus annotations and got errors during initialization/migration. It would remove one join per query if nothing else.

If I can get it working, would you prefer that implementation? While still exploring your suggestions above in parallel of course.

jhoward-lm avatar Mar 15 '25 21:03 jhoward-lm

Eliminating a join like this would be great. We'd then have something akin to GitLab's project_authorizations table.

You could even merge all the user tables like so:

create table user_project_role (
  id bigint primary key
, ldapuser_id bigint references ldapuser(id)
, manageduser_id bigint references manageduser(id)
, oidcuser_id bigint references oidcuser(id)
, project_id bigint not null references project(id)
, role_id int bigint not null references role(id)
);

A check constraint can ensure that only ldapuser_id or manageduser_id or oidcuser_id are set.

although I struggled to figure out the correct DataNucleus annotations and got errors during initialization/migration.

If you like I can have a look if you have code pushed somewhere or share the errors you're getting.

nscuro avatar Mar 15 '25 21:03 nscuro

I'm actually not seeing the error that I was previously seeing, so looks like it's a viable option! I'll make those changes in the next couple of days and get it pushed for you to review

jhoward-lm avatar Mar 15 '25 22:03 jhoward-lm

Few more thoughts:

  • We should add validation to prevent system-level permissions from being added to roles. Since roles are bound to projects, letting users assign the SYSTEM_CONFIGURATION permission or similar would be confusing.
  • We should rename "Role" to "ProjectRole", to make their purpose / limitation more obvious.
  • We could effectively eliminate the PROJECT_ACCESS_TEAMS table if we allowed project roles to be assigned to teams as well.
    • This would be similar to GitHub (and I assume GitLab, too): Image
    • IDs of teams the authenticated user is a member of are already calculated for the existing ACL logic, and can be used as query parameter.
  • We could investigate pre-computing effective project permissions for every user, for each project. We would do this based on the proposed data model, not instead of it.
    • create table user_project_effective_permissions (
        ldapuser_id bigint references ldapuser(id)
      , manageduser_id bigint references manageduser(id)
      , oidcuser_id bigint references oidcuser(id)
      , project_id bigint not null references project(id)
      , permission_id bigint not null references permission(id)
      , permission_name text not null
      );
      
      -- What projects can Alice see?
      select name as alices_projects
       from project
      where exists (
        select 1
          from user_project_effective_permissions
         where permssion_name = 'PORTFOLIO_MANAGEMENT_READ'
           and project_id = project.id
           and manageduser_id = 1
      );
      
      -- Because effective permissions are de-duplicated,
      -- we can use a simple join without causing duplicate project records.
      select name as alices_projects
        from project
       inner join user_project_effective_permissions as upep
          on upep.project_id = project.id
       where upep.manageduser_id = 1
         and upep.permssion_name = 'PORTFOLIO_MANAGEMENT_READ';
      
    • Assumption: Users, teams, roles, and permissions are modified a lot less frequently than permissions are being read.
    • Avoids the joins with the role and permission tables.
    • Update this table whenever a role is assigned to a user or team.
    • We can experiment with this model by making user_project_effective_permissions a view first.

nscuro avatar Mar 17 '25 12:03 nscuro

During the last maintainer's meeting, @stevespringett mentioned the option of us "ingesting" Alpine, which would give us more flexibility when it comes to changing the underlying model. We could do this as part of #1710 (see https://github.com/DependencyTrack/hyades/issues/1710#issuecomment-2730044410).

nscuro avatar Mar 17 '25 16:03 nscuro

@nscuro This is my attempt at flattening the proposed tables as mentioned above (commit link). Mind taking a look and letting me know what you think?

jhoward-lm avatar Mar 17 '25 17:03 jhoward-lm

For this

create table user_project_effective_permissions (
  ldapuser_id bigint references ldapuser(id)
, manageduser_id bigint references manageduser(id)
, oidcuser_id bigint references oidcuser(id)
, project_id bigint not null references project(id)
, permission_id bigint not null references permission(id)
, permission_name text not null
);

I'm assuming there is the previously mentioned check constraint that one of ldapuser_id, manageduser_id, or oidcuser_id must be set?

Do you think it would be a performance hit to leverage interfaces so that these could be consolidated into a single text column such as user_principal_id? I believe they are represented as "<class name>:<FK ID>", e.g. "alpine.model.OidcUser:22"

jhoward-lm avatar Mar 17 '25 17:03 jhoward-lm

I'm assuming there is the previously mentioned check constraint that one of ldapuser_id, manageduser_id, or oidcuser_id must be set?

Yep. Naive way to do it:

check (
  (ldapuser_id is not null)::int
    + (manageduser_id is not null)::int
    + (oidcuser_id is not null)::int = 1
)

Do you think it would be a performance hit to leverage interfaces so that these could be consolidated into a single text column such as user_principal_id? I believe they are represented as "<class name>:<FK ID>", e.g. "alpine.model.OidcUser:22"

Admittedly I never used that particular feature, but what immediately comes to mind is that this column format voids the ability to use FK constraints. Although in the paragraph below the section you linked, it says:

The default mapping strategy for interface fields and collections of interfaces is to have separate FK column(s) for each possible implementation of the interface.

nscuro avatar Mar 17 '25 18:03 nscuro