[Performance Issue] Generic relationship does not honor type when querying

Open abccbaandy opened this issue 1 year ago • 3 comments

I have the following nodes:

@Data
@Node
public abstract class BaseNode {
    @Id
    @GeneratedValue
    private UUID id;
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
public class Child extends BaseNode{
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString(callSuper = true)
public class Parent1 extends BaseNode{
    @Relationship(type = "Parent1_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<BaseNode>> parent1Relationships;
}

@EqualsAndHashCode(callSuper = true)
@Node
@Data
@NoArgsConstructor
@ToString(callSuper = true)
public class Parent2 extends BaseNode {
    @Relationship(type = "Parent2_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<BaseNode>> parent2Relationships;
}

When I load Parent1 with findAll:

        List<Parent1> all = parent1Repository.findAll();

The log shows

MATCH (parent1:`Parent1`:`BaseNode`) WITH collect(elementId(parent1)) AS __sn__ RETURN __sn__
MATCH (parent1:`Parent1`:`BaseNode`) OPTIONAL MATCH (parent1)-[__sr__:`Parent1_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(parent1)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (baseNode:`BaseNode`) WHERE elementId(baseNode) IN $__ids__ OPTIONAL MATCH (baseNode)-[__sr__:`Parent2_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(baseNode)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (baseNode:`BaseNode`) WHERE elementId(baseNode) IN $__ids__ OPTIONAL MATCH (baseNode)-[__sr__:`Parent1_CONTAIN`]->(__srn__:`BaseNode`) WITH collect(elementId(baseNode)) AS __sn__, collect(elementId(__srn__)) AS __srn__, collect(elementId(__sr__)) AS __sr__ RETURN __sn__, __srn__, __sr__
MATCH (rootNodeIds:`Parent1`) WHERE elementId(rootNodeIds) IN $rootNodeIds WITH collect(rootNodeIds) AS n OPTIONAL MATCH ()-[relationshipIds]-() WHERE elementId(relationshipIds) IN $relationshipIds WITH n, collect(DISTINCT relationshipIds) AS __sr__ OPTIONAL MATCH (relatedNodeIds) WHERE elementId(relatedNodeIds) IN $relatedNodeIds WITH n, __sr__ AS __sr__, collect(DISTINCT relatedNodeIds) AS __srn__ UNWIND n AS rootNodeIds WITH rootNodeIds AS parent1, __sr__, __srn__ RETURN parent1 AS __sn__, __sr__, __srn__

The query log shows that it searches for Parent2_CONTAIN, but it shouldn't, because Parent2_CONTAIN is not defined on the Parent1 node. In a real case, if I have 10 node classes extending the base node, it will end up querying the relationships of all 10. I think this is a performance issue.

Changing the target type to Child still shows this issue:

    @Relationship(type = "Parent1_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<Child>> parent1Relationships;

Also, I think this issue is related to https://github.com/spring-projects/spring-data-neo4j/issues/2933

abccbaandy avatar Aug 20 '24 09:08 abccbaandy

The parent1Repository will query for the :Parent1_CONTAIN relationships defined by parent1Relationships in Parent1. Those relationships target BaseNode. The next iteration will then query for :Parent2_CONTAIN and :Parent1_CONTAIN, which are defined in Parent2 and Parent1 as the implementations of BaseNode. This behaviour is expected.

Looking at the other issue and the Child-typed relationship, the same behaviour applies because the type definition is derived from the class declaration public class BaseDbRelationship<T extends BaseNode> itself, which is again "just" a BaseNode.
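
The fan-out described above can be sketched with a small plain-Java model. This is only an illustration under simplifying assumptions, not SDN's actual implementation: the class RelationshipFanOut, the maps DECLARED and SUBTYPES, and the method queriedTypes are hypothetical names invented for this sketch. It walks from a root class to its declared relationship targets and then to every class extending each target, collecting the relationship types that would end up in the generated queries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the traversal; none of these names exist in SDN.
public class RelationshipFanOut {

    // Class name -> (relationship type -> declared target class), mirroring this issue's entities.
    static final Map<String, Map<String, String>> DECLARED = Map.of(
            "Parent1", Map.of("Parent1_CONTAIN", "BaseNode"),
            "Parent2", Map.of("Parent2_CONTAIN", "BaseNode"));

    // Class name -> classes that extend it.
    static final Map<String, List<String>> SUBTYPES = Map.of(
            "BaseNode", List.of("Parent1", "Parent2", "Child"));

    // Collects, in discovery order, every relationship type reachable from the root class.
    public static List<String> queriedTypes(String root) {
        Set<String> types = new LinkedHashSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            if (!visited.add(current)) {
                continue; // already expanded, avoids looping Parent1 -> BaseNode -> Parent1
            }
            for (Map.Entry<String, String> rel : DECLARED.getOrDefault(current, Map.of()).entrySet()) {
                types.add(rel.getKey());
                pending.add(rel.getValue()); // follow the declared target class
            }
            // A node matched as `current` may really be any class extending it.
            pending.addAll(SUBTYPES.getOrDefault(current, List.of()));
        }
        return new ArrayList<>(types);
    }

    public static void main(String[] args) {
        System.out.println(queriedTypes("Parent1")); // [Parent1_CONTAIN, Parent2_CONTAIN]
    }
}
```

In this model, a target that resolves to BaseNode pulls in the relationships of every class extending BaseNode, which matches the extra Parent2_CONTAIN query in the log.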

meistermeier avatar Aug 20 '24 19:08 meistermeier

I guessed this was the behavior, but I think the design is wrong. When I query Parent1, SDN should search for the Parent1_CONTAIN relationship only.

But I think the root cause is that SDN treats BaseNode as a node. In my domain, I don't really use BaseNode; it is just a base class for sharing common fields and for polymorphism. In the real case, I have many children with the same relationship type:

(:Parent1)-[r:Parent1_CONTAIN]->(:Child1)
(:Parent1)-[r:Parent1_CONTAIN]->(:Child2)

So I have to use a base class to get all of them:

    @Relationship(type = "Parent1_CONTAIN", direction = Relationship.Direction.OUTGOING)
    private List<BaseRelationship<BaseNode>> parent1Relationships;

@meistermeier Is there any other way to achieve this requirement without the base class issue?

abccbaandy avatar Aug 21 '24 02:08 abccbaandy

Any update? @michael-simons

abccbaandy avatar Nov 13 '24 16:11 abccbaandy