pylint icon indicating copy to clipboard operation
pylint copied to clipboard

[pyreverse] Fix duplicate arrows when class attribute is assigned more than once

Open pranav-data opened this issue 7 months ago • 9 comments

Type of Changes

Type
:bug: Bug fix
:sparkles: New feature
:hammer: Refactoring
:scroll: Docs

Description

Postprocessing relationships and classes to find replication across relationship types / attributes.

Output for example:

classDiagram
  class A {
    var : int
  }
  class B {
    a_obj
    func()
  }
  A --* B : a_obj

Closes #9267 Closes: #8046 Fixes closed issue: #8189

pranav-data avatar Apr 09 '25 20:04 pranav-data

Codecov Report

Attention: Patch coverage is 76.92308% with 9 lines in your changes missing coverage. Please review.

Project coverage is 95.85%. Comparing base (6faf5b4) to head (608bbc0). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pylint/pyreverse/diadefslib.py 76.92% 9 Missing :warning:

:x: Your patch check has failed because the patch coverage (76.92%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #10333      +/-   ##
==========================================
- Coverage   95.90%   95.85%   -0.05%     
==========================================
  Files         176      176              
  Lines       19108    19145      +37     
==========================================
+ Hits        18325    18352      +27     
- Misses        783      793      +10     
Files with missing lines Coverage Δ
pylint/pyreverse/diadefslib.py 94.47% <76.92%> (-4.84%) :arrow_down:

... and 1 file with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Apr 09 '25 20:04 codecov[bot]

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit cdcdbbf3d1845de2a30d6054f328db04a20710bc

github-actions[bot] avatar Apr 09 '25 21:04 github-actions[bot]

Hi @pranav-data, thanks for the PR! 🙂 Can you remove the additions to the .gitignore and the pyenv.cfg file? These are custom to your own environment and should not be added to the repo.

DudeNr33 avatar Apr 10 '25 18:04 DudeNr33

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 0d89cc973d975e342d6149f03f72ac7a878c88d8

github-actions[bot] avatar Apr 10 '25 18:04 github-actions[bot]

@DudeNr33 I have removed the files and pushed.

pranav-data avatar Apr 10 '25 22:04 pranav-data

I have been noticing duplicate class definitions captured, in a project I am working on, but unable to reproduce it in a simpler example for a unit test. The parts where code coverage is lacking is addressing that issue.

pranav-data avatar Apr 10 '25 22:04 pranav-data

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 608bbc0985a6e70bcc39f858af94a234beb4458f

github-actions[bot] avatar Apr 12 '25 12:04 github-actions[bot]

Removing the duplicate from diagrams.py could clean up the classes but messes up the relationships. From my debugging, it appears that the duplicate originates from how astroid parses the repository for dependencies, where one copy of the class is required to establish the relationship.

Breaking it down to 2 separate fixes will fix one while breaking the other.

pranav-data avatar Apr 14 '25 16:04 pranav-data

I do not see yet how moving the logic to diagrams.py would be a problem, as all the post-processing happens on instance attributes of the ClassDiagram objects.

I tried it locally and the test still passes, expand to see the diff of my changes:

diff --git a/pylint/pyreverse/diadefslib.py b/pylint/pyreverse/diadefslib.py
index f91a89eca..d3b045437 100644
--- a/pylint/pyreverse/diadefslib.py
+++ b/pylint/pyreverse/diadefslib.py
@@ -68,10 +68,7 @@ class DiaDefGenerator:
         leaf_nodes = [
             module
             for module in self.args
-            if not any(
-                other != module and other.startswith(module + ".")
-                for other in self.args
-            )
+            if not any(other != module and other.startswith(module + ".") for other in self.args)
         ]
         return leaf_nodes
 
@@ -120,12 +117,8 @@ class DiaDefGenerator:
         absolute_depth = name.count(".")
 
         # Retrieve the base depth to compare against
-        relative_depth = next(
-            (v for k, v in self.args_depths.items() if name.startswith(k)), None
-        )
-        return relative_depth is not None and bool(
-            (absolute_depth - relative_depth) <= self.config.max_depth
-        )
+        relative_depth = next((v for k, v in self.args_depths.items() if name.startswith(k)), None)
+        return relative_depth is not None and bool((absolute_depth - relative_depth) <= self.config.max_depth)
 
     def show_node(self, node: nodes.ClassDef) -> bool:
         """Determine if node should be shown based on config."""
@@ -143,9 +136,7 @@ class DiaDefGenerator:
         self.linker.visit(node)
         self.classdiagram.add_object(self.get_title(node), node)
 
-    def get_ancestors(
-        self, node: nodes.ClassDef, level: int
-    ) -> Generator[nodes.ClassDef]:
+    def get_ancestors(self, node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
         """Return ancestor nodes of a class node."""
         if level == 0:
             return
@@ -154,9 +145,7 @@ class DiaDefGenerator:
                 continue
             yield ancestor
 
-    def get_associated(
-        self, klass_node: nodes.ClassDef, level: int
-    ) -> Generator[nodes.ClassDef]:
+    def get_associated(self, klass_node: nodes.ClassDef, level: int) -> Generator[nodes.ClassDef]:
         """Return associated nodes of a class node."""
         if level == 0:
             return
@@ -170,9 +159,7 @@ class DiaDefGenerator:
                     continue
                 yield node
 
-    def extract_classes(
-        self, klass_node: nodes.ClassDef, anc_level: int, association_level: int
-    ) -> None:
+    def extract_classes(self, klass_node: nodes.ClassDef, anc_level: int, association_level: int) -> None:
         """Extract recursively classes related to klass_node."""
         if self.classdiagram.has_node(klass_node) or not self.show_node(klass_node):
             return
@@ -203,9 +190,7 @@ class DefaultDiadefGenerator(LocalsVisitor, DiaDefGenerator):
         """
         mode = self.config.mode
         if len(node.modules) > 1:
-            self.pkgdiagram: PackageDiagram | None = PackageDiagram(
-                f"packages {node.name}", mode
-            )
+            self.pkgdiagram: PackageDiagram | None = PackageDiagram(f"packages {node.name}", mode)
         else:
             self.pkgdiagram = None
         self.classdiagram = ClassDiagram(f"classes {node.name}", mode)
@@ -273,99 +258,6 @@ class DiadefsHandler:
         self.config = config
         self.args = args
 
-    def _get_object_name(self, obj: Any) -> str:
-        """Get object name safely handling both title attributes and strings.
-
-        :param obj: Object to get name from
-        :return: Object name/title
-        :rtype: str
-        """
-        return obj.title if hasattr(obj, "title") else str(obj)
-
-    def _process_relationship(
-        self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any
-    ) -> None:
-        """Process a single relationship for deduplication.
-
-        :param relationship: Relationship to process
-        :param unique_classes: Dict of unique classes
-        :param obj: Current object being processed
-        """
-        if relationship.from_object == obj:
-            relationship.from_object = unique_classes[obj.node.qname()]
-        if relationship.to_object == obj:
-            relationship.to_object = unique_classes[obj.node.qname()]
-
-    def _process_class_relationships(
-        self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
-    ) -> None:
-        """Merge relationships for a class.
-
-        :param diagram: Current diagram
-        :param obj: Object whose relationships to process
-        :param unique_classes: Dict of unique classes
-        """
-        for rel_type in ("specialization", "association", "aggregation"):
-            for rel in diagram.get_relationships(rel_type):
-                self._process_relationship(rel, unique_classes, obj)
-
-    def deduplicate_classes(self, diagrams: list[ClassDiagram]) -> list[ClassDiagram]:
-        """Remove duplicate classes from diagrams."""
-        for diagram in diagrams:
-            # Track unique classes by qualified name
-            unique_classes: dict[str, Any] = {}
-            duplicate_classes: Any = set()
-
-            # First pass - identify duplicates
-            for obj in diagram.objects:
-                qname = obj.node.qname()
-                if qname in unique_classes:
-                    duplicate_classes.add(obj)
-                    self._process_class_relationships(diagram, obj, unique_classes)
-                else:
-                    unique_classes[qname] = obj
-
-            # Second pass - filter out duplicates
-            diagram.objects = [
-                obj for obj in diagram.objects if obj not in duplicate_classes
-            ]
-
-        return diagrams
-
-    def _process_relationship_type(
-        self,
-        rel_list: list[Relationship],
-        seen: set[tuple[str, str, str, Any | None]],
-        unique_rels: dict[str, list[Relationship]],
-        rel_name: str,
-    ) -> None:
-        """Process a list of relationships of a single type.
-
-        :param rel_list: List of relationships to process
-        :param seen: Set of seen relationships
-        :param unique_rels: Dict to store unique relationships
-        :param rel_name: Name of relationship type
-        """
-        for rel in rel_list:
-            key = (
-                self._get_object_name(rel.from_object),
-                self._get_object_name(rel.to_object),
-                type(rel).__name__,
-                getattr(rel, "name", None),
-            )
-            if key not in seen:
-                seen.add(key)
-                unique_rels[rel_name].append(rel)
-
-    def deduplicate_relationships(self, diagram: ClassDiagram) -> None:
-        """Remove duplicate relationships between objects."""
-        seen: set[tuple[str, str, str, Any | None]] = set()
-        unique_rels: dict[str, list[Relationship]] = defaultdict(list)
-        for rel_name, rel_list in diagram.relationships.items():
-            self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
-
-        diagram.relationships = dict(unique_rels)
-
     def get_diadefs(self, project: Project, linker: Linker) -> list[ClassDiagram]:
         """Get the diagram's configuration data.
 
@@ -386,5 +278,4 @@ class DiadefsHandler:
             diagrams = DefaultDiadefGenerator(linker, self).visit(project)
         for diagram in diagrams:
             diagram.extract_relationships()
-            self.deduplicate_relationships(diagram)
-        return self.deduplicate_classes(diagrams)
+        return diagrams
diff --git a/pylint/pyreverse/diagrams.py b/pylint/pyreverse/diagrams.py
index 6db056300..73ec71ba0 100644
--- a/pylint/pyreverse/diagrams.py
+++ b/pylint/pyreverse/diagrams.py
@@ -6,6 +6,7 @@
 
 from __future__ import annotations
 
+from collections import defaultdict
 from collections.abc import Iterable
 from typing import Any
 
@@ -45,9 +46,7 @@ class DiagramEntity(Figure):
 
     default_shape = ""
 
-    def __init__(
-        self, title: str = "No name", node: nodes.NodeNG | None = None
-    ) -> None:
+    def __init__(self, title: str = "No name", node: nodes.NodeNG | None = None) -> None:
         super().__init__()
         self.title = title
         self.node: nodes.NodeNG = node or nodes.NodeNG(
@@ -109,9 +108,7 @@ class ClassDiagram(Figure, FilterMixIn):
         rel = Relationship(from_object, to_object, relation_type, name)
         self.relationships.setdefault(relation_type, []).append(rel)
 
-    def get_relationship(
-        self, from_object: DiagramEntity, relation_type: str
-    ) -> Relationship:
+    def get_relationship(self, from_object: DiagramEntity, relation_type: str) -> Relationship:
         """Return a relationship or None."""
         for rel in self.relationships.get(relation_type, ()):
             if rel.from_object is from_object:
@@ -126,14 +123,11 @@ class ClassDiagram(Figure, FilterMixIn):
         properties = {
             local_name: local_node
             for local_name, local_node in node.items()
-            if isinstance(local_node, nodes.FunctionDef)
-            and decorated_with_property(local_node)
+            if isinstance(local_node, nodes.FunctionDef) and decorated_with_property(local_node)
         }
 
         # Add instance attributes to properties
-        for attr_name, attr_type in list(node.locals_type.items()) + list(
-            node.instance_attrs_type.items()
-        ):
+        for attr_name, attr_type in list(node.locals_type.items()) + list(node.instance_attrs_type.items()):
             if attr_name not in properties:
                 properties[attr_name] = attr_type
 
@@ -142,9 +136,7 @@ class ClassDiagram(Figure, FilterMixIn):
                 continue
 
             # Handle property methods differently to correctly extract return type
-            if isinstance(
-                associated_nodes, nodes.FunctionDef
-            ) and decorated_with_property(associated_nodes):
+            if isinstance(associated_nodes, nodes.FunctionDef) and decorated_with_property(associated_nodes):
                 if associated_nodes.returns:
                     type_annotation = get_annotation_label(associated_nodes.returns)
                     node_name = f"{node_name} : {type_annotation}"
@@ -184,9 +176,7 @@ class ClassDiagram(Figure, FilterMixIn):
             if isinstance(node, astroid.Instance):
                 node = node._proxied
             if (
-                isinstance(
-                    node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp)
-                )
+                isinstance(node, (nodes.ClassDef, nodes.Name, nodes.Subscript, nodes.BinOp))
                 and hasattr(node, "name")
                 and not self.has_node(node)
             ):
@@ -194,11 +184,7 @@ class ClassDiagram(Figure, FilterMixIn):
                     node_name = node.name
                     names.append(node_name)
         # sorted to get predictable (hence testable) results
-        return sorted(
-            name
-            for name in names
-            if all(name not in other or name == other for other in names)
-        )
+        return sorted(name for name in names if all(name not in other or name == other for other in names))
 
     def has_node(self, node: nodes.NodeNG) -> bool:
         """Return true if the given node is included in the diagram."""
@@ -237,9 +223,7 @@ class ClassDiagram(Figure, FilterMixIn):
             # associations & aggregations links
             for name, values in list(node.aggregations_type.items()):
                 for value in values:
-                    self.assign_association_relationship(
-                        value, obj, name, "aggregation"
-                    )
+                    self.assign_association_relationship(value, obj, name, "aggregation")
 
             associations = node.associations_type.copy()
 
@@ -249,9 +233,9 @@ class ClassDiagram(Figure, FilterMixIn):
 
             for name, values in associations.items():
                 for value in values:
-                    self.assign_association_relationship(
-                        value, obj, name, "association"
-                    )
+                    self.assign_association_relationship(value, obj, name, "association")
+        self.deduplicate_relationships()
+        self.deduplicate_classes()
 
     def assign_association_relationship(
         self, value: astroid.NodeNG, obj: ClassEntity, name: str, type_relationship: str
@@ -266,6 +250,92 @@ class ClassDiagram(Figure, FilterMixIn):
         except KeyError:
             return
 
+    def _get_object_name(self, obj: Any) -> str:
+        """Get object name safely handling both title attributes and strings.
+
+        :param obj: Object to get name from
+        :return: Object name/title
+        :rtype: str
+        """
+        return obj.title if hasattr(obj, "title") else str(obj)
+
+    def _process_relationship(self, relationship: Relationship, unique_classes: dict[str, Any], obj: Any) -> None:
+        """Process a single relationship for deduplication.
+
+        :param relationship: Relationship to process
+        :param unique_classes: Dict of unique classes
+        :param obj: Current object being processed
+        """
+        if relationship.from_object == obj:
+            relationship.from_object = unique_classes[obj.node.qname()]
+        if relationship.to_object == obj:
+            relationship.to_object = unique_classes[obj.node.qname()]
+
+    def _process_class_relationships(
+        self, diagram: ClassDiagram, obj: Any, unique_classes: dict[str, Any]
+    ) -> None:
+        """Merge relationships for a class.
+
+        :param diagram: Current diagram
+        :param obj: Object whose relationships to process
+        :param unique_classes: Dict of unique classes
+        """
+        for rel_type in ("specialization", "association", "aggregation"):
+            for rel in diagram.get_relationships(rel_type):
+                self._process_relationship(rel, unique_classes, obj)
+
+    def deduplicate_classes(self) -> None:
+        """Remove duplicate classes from diagrams."""
+        # Track unique classes by qualified name
+        unique_classes: dict[str, Any] = {}
+        duplicate_classes: Any = set()
+
+        # First pass - identify duplicates
+        for obj in self.objects:
+            qname = obj.node.qname()
+            if qname in unique_classes:
+                duplicate_classes.add(obj)
+                self._process_class_relationships(self, obj, unique_classes)
+            else:
+                unique_classes[qname] = obj
+
+        # Second pass - filter out duplicates
+        self.objects = [obj for obj in self.objects if obj not in duplicate_classes]
+
+    def _process_relationship_type(
+        self,
+        rel_list: list[Relationship],
+        seen: set[tuple[str, str, str, Any | None]],
+        unique_rels: dict[str, list[Relationship]],
+        rel_name: str,
+    ) -> None:
+        """Process a list of relationships of a single type.
+
+        :param rel_list: List of relationships to process
+        :param seen: Set of seen relationships
+        :param unique_rels: Dict to store unique relationships
+        :param rel_name: Name of relationship type
+        """
+        for rel in rel_list:
+            key = (
+                self._get_object_name(rel.from_object),
+                self._get_object_name(rel.to_object),
+                type(rel).__name__,
+                getattr(rel, "name", None),
+            )
+            if key not in seen:
+                seen.add(key)
+                unique_rels[rel_name].append(rel)
+
+    def deduplicate_relationships(self) -> None:
+        """Remove duplicate relationships between objects."""
+        seen: set[tuple[str, str, str, Any | None]] = set()
+        unique_rels: dict[str, list[Relationship]] = defaultdict(list)
+        for rel_name, rel_list in self.relationships.items():
+            self._process_relationship_type(rel_list, seen, unique_rels, rel_name)
+
+        self.relationships = dict(unique_rels)
+
 
 class PackageDiagram(ClassDiagram):
     """Package diagram handling."""

There is also no problem when removing the call to deduplicate_classes.

Maybe there was some misunderstanding, so can you please check the proposed diff and comment if your concerns about moving the logic and splitting the MR are still valid?

DudeNr33 avatar Apr 14 '25 18:04 DudeNr33