pydeps icon indicating copy to clipboard operation
pydeps copied to clipboard

Duplicate source entries got generated that caused incorrect call graph relationship.

Open happyhebaby opened this issue 2 years ago • 4 comments

Greetings again.

I have a test project that aim to unit test the tool integration, test and source files all live in one directory, a simplified version: inheritance_copy: init.py animal.py dog.py test_dog.py

source code

animal.py:
class Animal:
    def __init__(self, name, species):
        self.name = name
        self.species = species
    
    def make_sound(self):
        return "The animal makes a sound."

dog.py:
import animal

class Dog(animal.Animal):
    def __init__(self, name, breed):
        super().__init__(name, "Canis lupus familiaris")
        self.breed = breed
    
    def make_sound(self):
        return "Woof!"
     
test_dog.py:
import dog

def test_dog():
	a_dog = dog.Dog("Max", "Labrador Retriever")
	assert a_dog.make_sound() == "Woof!"

When I run cmd under the root inheritance_copy:

pydeps --show-deps --no-output --deps-output b.json --include-missing .

It generated the following json:

{
    "__main__": {
        "bacon": 0,
        "imports": [
            "inheritance_copy",
            "inheritance_copy.animal",
            "inheritance_copy.dog",
            "inheritance_copy.test_dog"
        ],
        "name": "__main__",
        "path": null
    },
    "animal": {
        "bacon": 2,
        "imported_by": [
            "inheritance_copy.dog"
        ],
        "name": "animal",
        "path": null
    },
    "dog": {
        "bacon": 2,
        "imported_by": [
            "inheritance_copy.test_dog"
        ],
        "name": "dog",
        "path": null
    },
    "inheritance_copy": {
        "bacon": 1,
        "imported_by": [
            "__main__"
        ],
        "name": "inheritance_copy",
        "path": "<mylocalpath>/inheritance_copy/__init__.py"
    },
    "inheritance_copy.animal": {
        "bacon": 1,
        "imported_by": [
            "__main__"
        ],
        "name": "inheritance_copy.animal",
        "path": "<mylocalpath>inheritance_copy/animal.py"
    },
    "inheritance_copy.dog": {
        "bacon": 1,
        "imported_by": [
            "__main__"
        ],
        "imports": [
            "animal"
        ],
        "name": "inheritance_copy.dog",
        "path": "<mylocalpath>/inheritance_copy/dog.py"
    },
    "inheritance_copy.test_dog": {
        "bacon": 1,
        "imported_by": [
            "__main__"
        ],
        "imports": [
            "dog"
        ],
        "name": "inheritance_copy.test_dog",
        "path": "<mylocalpath>/inheritance_copy/test_dog.py"
    }
}

It seems to me the above relationship has two issues: It generated two duplicate entry pairs such as dog and inheritance_copy.dog; More seriously it referred the wrong entry(from a pair) in the relationship, one example: inheritance_copy.test_dog imports dog, which has a null path, it should imports 'inheritance_copy.dog, which has the correct source path. BTW if I move all the test files to subdir test` then the duplicate pairs will not be generated and the call graph is correct.

happyhebaby avatar Apr 01 '23 01:04 happyhebaby

I tested further, even in a case when I have the above structure where non-test source lives in root, test source lives in test sub folder, it reproed the same bug.

happyhebaby avatar Apr 01 '23 02:04 happyhebaby

@thebjorn Just wonder if you have time to look at this issue? It brings wrong result to our processing.

Also another slightly variation of this issue: If I run the following cmd for this dir: https://github.com/pytest-dev/pytest/tree/main/src

pydeps --show-deps --no-output --include-missing -LERROR .

It generate two different entries with different import info but pointing to the same file, this also brought incorrect result to our processing:

First entry:
"src._pytest._io": {
    "bacon": 1,
    "imported_by": [
      "__main__"
    ],
    "imports": [
      "src._pytest._io.terminalwriter"
    ],
    "name": "src._pytest._io",
    "path": "<mylocalpath>/pytest/src/_pytest/_io/__init__.py"
  }
  
Second entry:
  "_pytest._io": {
    "bacon": 1,
    "imported_by": [
      "__main__",
      "_pytest._code.code",
      "_pytest.assertion.rewrite",
      "_pytest.assertion.util",
      "_pytest.cacheprovider",
      "_pytest.compat",
      "_pytest.config",
      "_pytest.config.argparsing",
      "_pytest.doctest",
      "_pytest.fixtures",
      "_pytest.logging",
      "_pytest.python",
      "_pytest.reports",
      "_pytest.setuponly",
      "_pytest.terminal"
    ],
    "imports": [
      "_pytest._io.terminalwriter"
    ],
    "name": "_pytest._io",
    "path": "<mylocalpath>/pytest/src/_pytest/_io/__init__.py"
  }

happyhebaby avatar Apr 06 '23 03:04 happyhebaby

Hi @happyhebaby ! I'm on Easter vacation right now so I don't have much bandwidth (physically and mentally ;-) )

I see the files have different names, _pytest._io and src._pytest._io, and looking at the directory I can understand why (src is not a "code-root").

You'll probably get what you want with

pydeps --show-deps --no-output --include-missing -LERROR pytest

when standing in the src directory.

I can see if there is a way to merge nodes based on path, but I'm not sure I'll get to it until next week...

thebjorn avatar Apr 08 '23 21:04 thebjorn

@thebjorn I got the above result when running from src directory. Preferably if the tool could generate complete call graph recursively for all current dir and sub dir by default, but it doesn't do so at present. so we have to process all folders that have source files individually and then merge the result.

happyhebaby avatar Apr 09 '23 17:04 happyhebaby