pydeps

Show only internal dependencies

Open isaac-blanc opened this issue 6 months ago • 6 comments

Question

How do I show only the internal dependencies in my project?

Example

Say that my project file structure looks like this:

package/
|-- subpackage_1/
|   |-- module_1.py
|   |-- module_2.py
|-- subpackage_2/
|   |-- module_3.py
|-- main.py

I would like a diagram that shows that main.py depends on module_1.py, which in turn depends on module_2.py. However, I do not want the diagram to show that module_2.py depends on numpy and pandas, since those are external dependencies. In fact, I don't want to see numpy and pandas anywhere on the diagram.

Similar

#217 mentions the -x and -xx flags for excluding dependencies. Maybe I can exclude all dependencies in my .venv folder?

Thanks in advance!

isaac-blanc avatar Jun 19 '25 15:06 isaac-blanc

The easiest way is probably to use --max-bacon=1, which only includes modules one hop away (pydeps creates a dummy module that includes all files under the target directory, so all your files are one hop away).

thebjorn avatar Jun 22 '25 18:06 thebjorn

Thank you for the reply, and sorry for my own slow response.

Your solution

I applied your solution to the demo project attached to this message, which matches the structure I described above. I used pydeps .\main.py --max-bacon=1. It produced the following diagram. This shows that main.py depends on module_1 and module_3, but does not show that module_1 depends on module_2. I think this is because module_2 is more than 1 hop away from main.py although I'm wary that this contradicts what you say about the "dummy" module. Have I misunderstood?

testing_pydeps.zip

Image

I can of course resolve this by using pydeps .\main.py --max-bacon=2. The result is correct... for now.

Image

But now suppose module_3 depends on numpy. The diagram then shows that external dependency, even though it is not my code.

Image

It strikes me that distinguishing between internal and external dependencies by number of hops is a bad choice, because external dependencies can be any number of hops away. However, I think I may have misunderstood your reply.
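To make the concern concrete, here is a minimal sketch of hop-limited reachability on a toy graph. The graph is hand-written to mirror the example in this thread (it is not pydeps output), and `within_hops` is a hypothetical helper that counts hops from main.py; how pydeps itself computes its bacon number may differ.

```python
from collections import deque

# Toy dependency graph mirroring the example project (assumed, not pydeps output).
GRAPH = {
    "main": ["subpackage_1.module_1", "subpackage_2.module_3"],
    "subpackage_1.module_1": ["subpackage_1.module_2"],
    "subpackage_1.module_2": [],
    "subpackage_2.module_3": ["numpy"],
    "numpy": [],
}


def within_hops(graph, start, max_hops):
    """Return the set of modules reachable from start in at most max_hops imports."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for dep in graph.get(node, []):
            if dep not in distance:
                distance[dep] = distance[node] + 1
                queue.append(dep)
    return set(distance)


one_hop = within_hops(GRAPH, "main", 1)
two_hops = within_hops(GRAPH, "main", 2)
```

In this toy graph, a limit of 1 drops module_2 (internal), while a limit of 2 admits module_2 and numpy together, so no single hop limit keeps exactly the internal modules.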

Alternative solution

This is the workaround I am currently using. I would be curious to know your thoughts.

pydeps .\main.py --only main subpackage_1 subpackage_2 gives the following result, which is correct. It is a little tedious to have to list each subpackage for --only, but it is certainly less tedious than listing each external dependency for --exclude

Image

Huh?

I thought this would be a standard usage of pydeps, and therefore straightforward. Have I structured my project weirdly? Or am I mistaken in thinking that other people would like to graph the internal structure of their code like this? Not meaning to complain; just wondering if I have missed something basic?

Extra question

It seems a little redundant for the diagram to show both subpackages and modules as dependencies. Is there a way to show just the modules on the diagram? I tried using clusters as shown below, but the subpackages are still shown as nodes, as well as clusters. Specifically, I ran pydeps .\main.py --only main subpackage_1 subpackage_2 --cluster --max-cluster-size=100

Image

isaac-blanc avatar Jul 02 '25 23:07 isaac-blanc

I'll be AFK for a little while dealing with a death in the family. I'll get back with you once I've had some time to process.

thebjorn avatar Jul 04 '25 03:07 thebjorn

Very sorry to hear that. Take your time; no rush at all.

isaac-blanc avatar Jul 04 '25 17:07 isaac-blanc

Did you get a chance to look at this?

isaac-blanc avatar Nov 10 '25 10:11 isaac-blanc

Hi Isaac, thanks for reminding me.

There is nothing currently in pydeps that will let you filter (include/exclude) based on the directory path of the module. Python makes little distinction between internal/external code, and for a small subset of modules this might not be possible, but I'll be happy to merge a PR that implements such functionality for cases such as this (under --include-path and --exclude-path), or even a more specific --internal-only flag (this does seem like an interesting use-case).
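As a rough illustration of what such path-based filtering could look like, here is a sketch of an internal/external check on a module's file path. Everything in it is an assumption for illustration: `is_internal` is a hypothetical helper, the excluded directory names are a guess at common layouts, and none of this is existing pydeps code.

```python
from pathlib import Path

# Directory names treated as "external" even when they sit inside the project
# root (an assumption; adjust for your own layout).
EXTERNAL_DIRS = {".venv", "venv", "site-packages"}


def is_internal(module_file, project_root):
    """Return True if module_file lives under project_root, outside any virtualenv."""
    path = Path(module_file).resolve()
    root = Path(project_root).resolve()
    try:
        relative = path.relative_to(root)
    except ValueError:
        return False  # the file is not under the project root at all
    return not EXTERNAL_DIRS.intersection(relative.parts)
```

Built-in and some extension modules have no file path at all, which is one reason a check like this cannot classify every module.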

Looking at your particular example there are a couple of ways that may make pydeps usage more manageable...

Let's say your package is located in the "code" directory. If you add an __init__.py file to the package:

code/
|-- package/
|   |-- subpackage_1/
|   |   |-- module_1.py
|   |   |-- module_2.py
|   |-- subpackage_2/
|   |   |-- module_3.py
|   |-- main.py
|   |-- __init__.py

You should be able to, from the code directory, do:

$../code> pydeps package --only=package

(This works because the module path that Python sees always starts with "package." for everything in that directory.)
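The idea of filtering on a dotted-name prefix can be sketched as follows. This is only an illustration of the principle: `keep_only` is a hypothetical helper, and the exact matching rule pydeps applies for --only may differ (for example, in how it treats names that merely share leading characters).

```python
def keep_only(module_names, prefixes):
    """Keep names equal to a prefix or nested under it (e.g. 'package.main')."""
    return [
        name
        for name in module_names
        if any(name == p or name.startswith(p + ".") for p in prefixes)
    ]


names = [
    "package",
    "package.main",
    "package.subpackage_1.module_1",
    "numpy",
    "pandas",
]
internal = keep_only(names, ["package"])
```

Because every module under the package directory shares the same top-level name, a single prefix is enough and the subpackages no longer need to be listed one by one.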

The default clustering rules should be more usable too:

$../code> pydeps package --cluster

I think that should collapse all clusters, but based on your comment above there might be a magic number 4 in play (i.e. clusters with <= 4 modules are not collapsed). You can force it to always collapse with

$../code> pydeps package --cluster --min-cluster-size=1

If you just want to automate the collection of --only parameters, you could write a .bat file. It is a bit more tedious on Windows, but maybe something like:

@echo off
setlocal enabledelayedexpansion

rem ----- Collect module names -----
set "mods="

for /d %%D in (*) do (
    if exist "%%D\__init__.py" (
        if defined mods (
            set "mods=!mods! %%D"
        ) else (
            set "mods=%%D"
        )
    )
)

rem ----- Run pydeps -----
echo Running: pydeps main.py --only %mods%
pydeps main.py --only %mods%

endlocal
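For a cross-platform variant, the same idea can be sketched in Python. Like the batch file above, it only picks up subdirectories that contain an `__init__.py`, and it passes the names to --only separated by spaces as in the earlier workaround. The helper names `collect_packages` and `build_command` are hypothetical, and the script assumes pydeps is on the PATH.

```python
from pathlib import Path


def collect_packages(root):
    """Return names of immediate subdirectories of root that contain __init__.py."""
    return sorted(
        entry.name
        for entry in Path(root).iterdir()
        if entry.is_dir() and (entry / "__init__.py").exists()
    )


def build_command(root, target="main.py"):
    """Build the pydeps invocation as an argument list (nothing is executed here)."""
    return ["pydeps", target, "--only", *collect_packages(root)]
```

To run it from the project directory, something like `subprocess.run(build_command("."), check=True)` should work once `subprocess` is imported.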

thebjorn avatar Nov 18 '25 12:11 thebjorn