oso icon indicating copy to clipboard operation
oso copied to clipboard

Oso 0.26 - Issue with entities appearing multiple times in a relation

Open lafrech opened this issue 2 years ago • 6 comments

Hi.

I'm still having issues migrating to 0.26.

I modified my code to use

from polar.data.adapter.sqlalchemy_adapter import SqlAlchemyAdapter

It won't work with 0.25 (#1520) so I can't tell for sure if the difference comes from the use of SqlAlchemyAdapter or from 0.25 vs. 0.26.


Apparently there has been an important rework in the way the SQL queries are built. I can see that 0.26 generates simpler queries where 0.25 uses subqueries.

A first consequence of this is that when building upon a query generated with authorized_query I had issues if I added joins with tables that were already used inside the authorization logic, therefore already used in the query.

This would take the form of

sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateAlias) ERROR: table name "xxx" specified more than once

I solved this by aliasing the tables I join.

Basically,

    authorized_query(user, "read", Team).join(UserTeamAssocTable)

becomes

    authorized_query(user, "read", Team).join(sqla.orm.aliased(UserTeamAssocTable))

I'm not proficient with SQLAlchemy but I'm thinking perhaps Oso could alias tables in the first place so that the user could use classes directly without bothering. It is nice (in fact, fantastic) that Oso does the SQL stuff for us but having to alias tables used in the auth logic kind of makes the abstraction leak. To be on the safe side, I can alias all tables in joins appended to Oso-generated queries, but it is a pity.


Now I'm facing another issue that might be related.

I have a model in which User relates to both A and B, and A and B themselves are related. I define permissions allowing the user to read AxB association table. The rule is that if user has read permission on both A and B, they can read AxB association.

I'm getting this error:

polar.exceptions.OperationalError: Invalid state: Type `User` occurs more than once as the target of a relation

I printed a trace but it is a bit hard for me to unravel. Anyway, my guess is that user appears twice because it is involved in both userxA and userxB relations.

Here's a trace line before the error:

[oso][info]     resource: _this matches TimeseriesClusterGroupByCampaign{} and _this.campaign matches Campaign{} and _ubc_447 in _this.campaign.users_by_campaigns and <User Chuck <[email protected]>, admin: True, active: True> TYPE `UserActor` = _ubc_447.user and _this.timeseries_cluster_group matches TimeseriesClusterGroup{} and _tscgbu_461 in _this.timeseries_cluster_group.timeseries_cluster_groups_by_users and <User Chuck <[email protected]>, admin: True, active: True> TYPE `UserActor` = _tscgbu_461.user

I'm thinking it could be related to the first issue because in both cases an entity appears twice. In the first case, the second appearance appears in my code and I get an SQLAlchemy error, while here it all happens in Oso and I get a polar error.

I didn't take the time to build a minimal reproducible example and I hope I'm (making sense and I'm) giving enough details.

I don't mind linking to the project if it helps, it is public anyway, but I don't expect someone to be willing to dive into it.

Anything else I could provide?

Thanks.

lafrech avatar Feb 17 '22 17:02 lafrech

Hey, thanks for the super detailed issue. The table alias problem is one I hadn't thought about before but totally makes sense.

The other one is a limitation of the new data filtering right now. It can only join to each table once. We want to expand that in the future but that's how it is right now. That error is a little confusing, why is it trying to join to User more than once? If instead of relations from resources to users you put all those relations on the user to resources it might be possible to work around this.

saolsen avatar Feb 17 '22 19:02 saolsen

The table alias problem is one I hadn't thought about before but totally makes sense.

Good. Well, if you can improve this, it's great. The user experience would be better. Meanwhile, we can live with it. We just need to alias the join we append to the query.

That error is a little confusing, why is it trying to join to User more than once? If instead of relations from resources to users you put all those relations on the user to resources it might be possible to work around this.

I'm not sure I understand that part, and I'd surely appreciate a hint.

The polar file is here.

Relevant part would be:

# User is a resource and an actor, so I define UserActor for the actor
actor UserActor {}

resource User {
    permissions = ["create", "read", "update", "delete", "set_admin", "set_active"];
    roles = ["self"];

    "read" if "self";
    "update" if "self";
}
has_role(_user: UserActor{id: id}, "self", _user: User{id: id});

# -----------------------------------------------------------------------------------------

# Campaign table
resource Campaign {
    permissions = ["create", "read", "update", "delete"];
    roles = ["c_member"];

    "read" if "c_member";
}
has_role(user: UserActor, "c_member", campaign: Campaign) if
    ubc in campaign.users_by_campaigns and
    user = ubc.user;

# User x Campaign assoc table: user can read its own associations
resource UserByCampaign {
    permissions = ["create", "read", "update", "delete"];
    roles = ["ubc_owner"];

    "read" if "ubc_owner";
}
has_role(user: UserActor, "ubc_owner", ubc: UserByCampaign) if
    user = ubc.user;

# -----------------------------------------------------------------------------------------

# TSCG table
resource TimeseriesClusterGroup {
    permissions = ["create", "read", "update", "delete"];
    roles = ["tscg_member"];

    "read" if "tscg_member";
}
has_role(user: UserActor, "tscg_member", tg: TimeseriesClusterGroup) if
    tscgbu in tg.timeseries_cluster_groups_by_users and
    user = tscgbu.user;

# User x TSCG assoc table: user can read its own associations
resource TimeseriesClusterGroupByUser {
    permissions = ["create", "read", "update", "delete"];
    roles = ["tscgbu_owner"];

    "read" if "tscgbu_owner";
}
has_role(user: UserActor, "tscgbu_owner", tscgbu: TimeseriesClusterGroupByUser) if
    user = tscgbu.user;

# -----------------------------------------------------------------------------------------

# TSCG x Campaign assoc table: user can read if associated with both Campaign and TSCG
resource TimeseriesClusterGroupByCampaign {
    permissions = ["create", "read", "update", "delete"];
}
has_permission(user: UserActor, "read", tgbc: TimeseriesClusterGroupByCampaign) if
    has_role(user, "c_member", tgbc.campaign) and
    has_role(user, "tscg_member", tgbc.timeseries_cluster_group);

The admin may associate users and campaigns, campaigns and TSCG, and users and TSCG.

The user may read Campaign x TSCG associations if he is a member of (i.e. associated with) both the campaign and the TSCG. This is how I specify the rules. And I believe this is why the query builder joins twice on user.

Maybe the way I build the polar file is wrong. It is mostly based on trial and error + extensive unit tests.

If instead of relations from resources to users you put all those relations on the user to resources it might be possible to work around this.

Oh, you mean the Relations in register_class should be in User class rather than Campaign, UserByCampaign and all? I could try that, see how it goes.

Thanks.

lafrech avatar Feb 17 '22 23:02 lafrech

Oh, you mean the Relations in register_class should be in User class rather than Campaign, UserByCampaign and all? I could try that, see how it goes.

Yeah I believe this is what @saolsen was getting at — that you could invert the relations so that you're going from, e.g., UserActor -> TimeseriesClusterGroupByUser instead of from TimeseriesClusterGroupByUser -> UserActor. Were you able to make any progress on this issue?

gj avatar Feb 25 '22 16:02 gj

Hi. Thanks for the help on this.

I just changed

has_role(user: User, "c_member", campaign: Campaign) if
    ubc in campaign.users_by_campaigns and
    user = ubc.user;

into

has_role(user: User, "c_member", campaign: Campaign) if
    ubc in user.users_by_campaigns and
    campaign.id = ubc.campaign_id;

and the test now passes.

Good. Yet, I have the feeling that I'm getting away with it this time but the same issue might strike again anytime and it won't be obvious why it doesn't work and what relation will work, if any. As @saolsen said, this is a workaround.

In other words, I consider the problem still exists, but the workaround at least allows me to move on. This and #1536 currently prevent me from moving to 0.26 / SqlAlchemyAdapter and I don't want to waste your time reporting issues with the deprecated query builder, so I'm kinda stuck if I have Oso related issues in my app dev branch that I can't test against 0.26.

lafrech avatar Feb 27 '22 23:02 lafrech

As I expected, I just got hit by this again.

My model has this kind of loop / diamond relation (those loops always make me cringe but that how it is):

           A1 - A2                                         
          /       \                                        
    U - UG         AB                                      
          \       /                                        
           B1 - B2                                         

User groups associate users with both A1 and B1 objects, which defines user access to A2 and B2 respectively. The permissions on AB association table depend on both permissions on A2 and B2 and I'm getting the dreaded

polar.exceptions.OperationalError: Invalid state: Type `UserGroup` occurs more than once as the target of a relation

Nothing new and surprising, I'm afraid. This time, I could get away with it because a business rule implies that read permissions on A involves read permissions on B so I could skip the check on B, but this is very specific and I'm afraid as the model grows, this will become a real blocker.

The other one is a limitation of the new data filtering right now. It can only join to each table once. We want to expand that in the future but that's how it is right now.

I assume the case above is pretty common in real-life models so this must be a major drawback, isn't it?

I guess there are technical reasons for this current limitation and I totally understand Oso is young and you can't get the whole thing done right away, but I'm curious about your roadmap, more specifically, is this a high-priority feature you aim at supporting soon or is it a low-priority improvement that might be achieved some day eventually?

Or is this actually a corner case I could avoid with a better model or a better use of Oso?

lafrech avatar Mar 25 '22 16:03 lafrech

It is a pretty common thing for people to run into so is a big drawback of the current data filtering. We're focused on some other things right now so I don't really have a good answer for when we'll get to it.

saolsen avatar Apr 15 '22 20:04 saolsen