
Support relationship property mapping from iterable dicts in one-to-many scenarios

Open jychp opened this issue 7 months ago • 5 comments

Problem

The recent addition of the one_to_many flag in relationships has improved the usability of the data model for request ingestion. However, it doesn't fully address a specific use case:

Connecting an entity A to multiple entities B while also setting a property on the relationship.

Example

In its raw Cypher query, the PagerDuty module creates a relationship like:

(u:PagerDutyUser)-[r:MEMBER_OF]->(t:PagerDutyTeam)

At the same time, it assigns a value to r.role using a variable.

Currently, this behavior isn't possible with the data model.


Limitation

Right now, the target of a PropertyRef with one_to_many=True must be a list of strings or ints. But what we need is support for a list of dicts.
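
To make the gap concrete, here is a minimal sketch of the two payload shapes. The field names (`member_ids`, `members`, `role`) and the sample values are illustrative assumptions, not taken from the PagerDuty API or from cartography. The helper `flatten_members` is also hypothetical; it shows the kind of pre-processing needed today to get the role next to both ids.

```python
# Illustrative data only: field names and values are assumptions.

# Shape that one_to_many handles today: a flat list of target ids.
team_supported = {"id": "T1", "member_ids": ["U1", "U2"]}

# Shape this issue asks for: each target carries its own relationship data.
team_wanted = {
    "id": "T1",
    "members": [
        {"id": "U1", "role": "manager"},
        {"id": "U2", "role": "responder"},
    ],
}


def flatten_members(team: dict) -> list[dict]:
    """Emit one row per (team, member) pair so the role sits next to both ids."""
    return [
        {"team_id": team["id"], "user_id": m["id"], "role": m["role"]}
        for m in team["members"]
    ]


rows = flatten_members(team_wanted)
```

With the flat list there is simply nowhere to put `role`, which is why a list of dicts is needed.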

Desired Use Case

@dataclass(frozen=True)
class PagerDutyTeamToUserProperties(CartographyRelProperties):
    lastupdated: PropertyRef = PropertyRef("lastupdated", set_in_kwargs=True)
    role: PropertyRef = PropertyRef("member.role")

# (:PagerDutyUser)-[:MEMBER_OF]->(:PagerDutyTeam)
@dataclass(frozen=True)
class PagerDutyTeamToUserRel(CartographyRelSchema):
    target_node_label: str = "PagerDutyUser"
    target_node_matcher: TargetNodeMatcher = make_target_node_matcher(
        {
            "id": PropertyRef(
                "member.id",
                one_to_many=True,
                iterate_dict=("members", "member"),
            )
        }
    )
    direction: LinkDirection = LinkDirection.INWARD
    rel_label: str = "MEMBER_OF"
    properties: PagerDutyTeamToUserProperties = PagerDutyTeamToUserProperties()

The syntax probably needs refinement, but the core idea is to introduce a flag or attribute that tells the data model:

“For this relationship, iterate over a list of dicts, and expose each dict element as a variable (e.g., member).”

This would give us the flexibility needed to map both nodes and relationship properties in one pass.


Technical Considerations

Supporting this feature would likely require handling node matchers differently depending on whether a PropertyRef uses this option:

  • Without the option: Regular MATCH clause.
  • With the option: The MATCH clause needs to be nested within an UNWIND.
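
A rough sketch of the query shape the second bullet describes, written as a small Python helper that returns a Cypher string. This is not the output of cartography's query builder: `build_nested_unwind_query`, the `$DictList` parameter name, and the `item.members` layout are all hypothetical and only mirror the PagerDuty example above.

```python
def build_nested_unwind_query(
    source_label: str,
    target_label: str,
    rel_label: str,
    list_field: str,
    alias: str,
) -> str:
    """Return a hypothetical Cypher query that unwinds a list of dicts per row."""
    return "\n".join([
        "UNWIND $DictList AS item",
        f"MATCH (src:{source_label} {{id: item.id}})",
        # Inner UNWIND: expose each dict of the list under the chosen alias.
        f"UNWIND item.{list_field} AS {alias}",
        f"MATCH (tgt:{target_label} {{id: {alias}.id}})",
        f"MERGE (tgt)-[r:{rel_label}]->(src)",
        f"SET r.role = {alias}.role, r.lastupdated = $lastupdated",
    ])


query = build_nested_unwind_query(
    "PagerDutyTeam", "PagerDutyUser", "MEMBER_OF", "members", "member"
)
print(query)
```

Because the target MATCH sits inside the inner UNWIND, the alias (`member` here) is in scope when the relationship properties are set, which is what makes `member.role` resolvable.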

jychp avatar May 27 '25 07:05 jychp

cc @achantavy

jychp avatar May 27 '25 07:05 jychp

Linked to https://github.com/cartography-cncf/cartography/pull/1606

jychp avatar Jun 02 '25 08:06 jychp

Having the same issue when trying to implement Infisical.

Infisical is even trickier because roles can be custom, so we cannot create a dedicated relationship for each role as is done for GitHub.

jychp avatar Jun 16 '25 22:06 jychp

For Infisical, can you share how you would model it if we did support one-to-many rel-properties - vs. - how you would model it today under the current limitations?

achantavy avatar Jun 16 '25 23:06 achantavy

Option 1 – Legacy approach

This is how things are currently done in some modules (e.g., PagerDuty):

(NodeA)-[r:REL]-(NodeB)
# with r.role = str

This pattern doesn’t support one-to-many ingestion, as described earlier.


Option 2 – Multiple relationships

This approach is used in modules like GitHub and Tailscale:

graph LR
U(User) -- ADMIN_OF --> G(Group)
U -- MEMBER_OF --> G
U -- X_OF --> G

This supports one-to-many ingestion, but comes with trade-offs:

  • Requires transformation logic
  • Can’t handle custom roles (since each relationship must be predefined in the schema — GitHub, for instance, has custom roles we can’t currently support)
  • Adds complexity to the graph
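
As a sketch of the transformation logic this option implies, the snippet below splits members into one id list per predefined role. The mapping `ROLE_TO_FIELD`, the helper `split_members_by_role`, and the sample data are assumptions for illustration, not code from the GitHub or Tailscale modules.

```python
# Hypothetical fixed mapping: every role needs a relationship defined up front.
ROLE_TO_FIELD = {"admin": "admin_ids", "member": "member_ids"}


def split_members_by_role(group: dict) -> tuple[dict, list[dict]]:
    """Build one id list per known role; also return members with no mapping."""
    row = {"id": group["id"], **{field: [] for field in ROLE_TO_FIELD.values()}}
    unmapped = []
    for member in group["members"]:
        field = ROLE_TO_FIELD.get(member["role"])
        if field is None:
            unmapped.append(member)  # custom role: no relationship to attach it to
        else:
            row[field].append(member["id"])
    return row, unmapped


group = {
    "id": "G1",
    "members": [
        {"id": "U1", "role": "admin"},
        {"id": "U2", "role": "member"},
        {"id": "U3", "role": "release-captain"},  # made-up custom role
    ],
}
row, unmapped = split_members_by_role(group)
```

Each list in `row` can feed a one_to_many matcher, but the member with the custom role ends up in `unmapped` and is lost unless the schema gains a new relationship.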

Option 3 – Intermediate node

An alternative is to introduce an intermediate Membership node:

graph LR  
U(User) -- HAS --> M{{Membership}}
M -- WITH --> R(Role)
M -- IN --> G(Group)

After some consideration, I believe Option 3 is significantly better. While it introduces some additional complexity, it brings several important benefits:

  • Custom roles are supported out of the box
  • The pattern is reusable across modules, which will simplify future IAM/ACL processing
  • It can be extended to support IndirectMembership, with automatic resolution during analysis
  • It can later be adapted for DynamicMembership, which would be especially useful for modules like Cloudflare Zero Trust
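
A minimal sketch of how nested membership data might be turned into rows for a Membership node. The helper `build_membership_rows`, the field names, and in particular the synthetic `id` format are assumptions; nothing here is an existing cartography schema.

```python
def build_membership_rows(group: dict) -> list[dict]:
    """Emit one Membership row per (user, group, role) triple."""
    return [
        {
            # Synthetic id (assumption): no real API object backs a membership.
            "id": f"{member['id']}|{group['id']}|{member['role']}",
            "user_id": member["id"],
            "group_id": group["id"],
            "role": member["role"],
        }
        for member in group["members"]
    ]


group = {
    "id": "G1",
    "members": [
        {"id": "U1", "role": "admin"},
        {"id": "U2", "role": "release-captain"},  # made-up custom role
    ],
}
memberships = build_membership_rows(group)
```

Every field in a row is a scalar, so the HAS, IN, and WITH edges could each be matched on a plain field, and a custom role is just another value rather than a new relationship type.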

Note: The following examples are just conceptual ideas and would need to be discussed in separate issues or design proposals.

Example: Indirect membership extension (future possibility)

graph LR
U(User) -- HAS --> M{{Membership}}
M -- WITH --> R(Role)
M -- IN --> G(Group)
G -- HAS --> M2{{Membership}}
M2 -- WITH --> R2(OtherRole)
M2 -- IN --> PG(ParentGroup)
U -- HAS --> IM(IndirectMembership)
IM -- WITH --> R2
IM -- IN --> PG

Example: Handling dynamic or conditional membership

graph LR
U(User) -- LOCATED_IN --> L(France)
DM{{DynamicMembership}} -- IF --> L
DM -- WITH --> R(Role)
DM -- IN --> G(Group)
U -- MATCH --> DM

jychp avatar Jun 17 '25 07:06 jychp

I understand the direction of option 3, but it's a pretty big departure from how cartography is currently modeled, where each node (for the most part) corresponds to an id from a real API object. I'll need a bit of time to think about it, but I'm eager to hear others' thoughts.

I am building an abstraction around the resource-permissions AWS sync so that we can generalize attaching two nodes together; this could potentially help with the scenario you're describing here. More to come soon.

achantavy avatar Jul 02 '25 00:07 achantavy

To be clear, I only see this being used for permissions-related modeling, so if you are working on something, I'll be happy to see it before choosing another approach.

jychp avatar Jul 02 '25 11:07 jychp
