
Add ability to list sizes of dependencies

Open notexactlyawe opened this issue 1 year ago • 2 comments

Describe the feature

Searching for duplicate issues, I found that #108 was closed for lack of a specific use case.

I would like to use pipdeptree to help optimize the size of a pyinstaller executable. It's currently great for seeing what dependencies are present in a project, but it doesn't show the sizes of those dependencies.

Size, to me, is anything that helps me determine the "cost" of a dependency in my final executable. I would like to use it to answer questions like, "Which dependency should I try to remove first to get the size of my executable down?".

A potential metric for this might be wheel size. pip seems to know about this at install time:

...
Collecting virtualenv>=20.25 (from tox->-r .\requirements-dev.txt (line 6))
  Using cached <some pypi>/virtualenv-20.26.3-py3-none-any.whl (5.7 MB)
...

notexactlyawe, Jul 08 '24 15:07

This has been requested before and I too believe it would be useful.

One approach I see right now is we can:

  1. Try using the RECORD file in the package metadata to determine the package size (see here for details)
  2. If we fail to find it in the metadata directory (since it's optional), try to recurse the package's source directory and sum the file sizes
  3. If we fail to find the package's source directory, we can't determine the package size so consider the size to be "Unknown"

kemzeb, Jul 11 '24 15:07
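
The steps in the comment above could be sketched roughly as follows. This is only an illustration, not pipdeptree's implementation: the function name estimate_installed_size is hypothetical, and it relies on the standard library's importlib.metadata, whose Distribution.files is populated from RECORD when that file is present. Step 2 is only approximated here: a file whose size is missing from the metadata is measured on disk, but no directory walk is attempted when the file list itself is unavailable.

```python
import os
from importlib.metadata import PackageNotFoundError, distribution
from typing import Optional


def estimate_installed_size(dist_name: str) -> Optional[int]:
    """Return the installed size in bytes, or None when it is unknown.

    Hypothetical helper, not part of pipdeptree.
    """
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None  # not installed: size is unknown

    files = dist.files  # None when no file list (e.g. RECORD) is available
    if files is None:
        return None  # step 3: report the size as unknown

    total = 0
    for entry in files:
        if entry.size is not None:
            # Step 1: trust the size recorded in the metadata.
            total += entry.size
        else:
            # The metadata may omit a size (RECORD does so for itself),
            # so fall back to measuring the file on disk.
            try:
                total += os.path.getsize(entry.locate())
            except OSError:
                pass  # file is missing or unreadable; skip it
    return total
```

A caller could render a None result as "Unknown", matching step 3.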

I was playing around and was able to get the size of a package using the following approach:

# in src/pipdeptree/_models/package.py

...
import importlib.util
import os
from typing import Optional

import pkg_resources  # deprecated in recent setuptools releases

def get_directory_size(directory) -> int:
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(directory):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size


def get_package_location(package_name) -> Optional[str]:
    try:
        # Try to get the package location using pkg_resources
        package = pkg_resources.get_distribution(package_name)
        return package.location
    except pkg_resources.DistributionNotFound:
        pass

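    # Try to get the package location using importlib
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        spec = None
    if spec is not None and spec.origin:
        # spec.origin is a file path (e.g. <site-packages>/<pkg>/__init__.py),
        # so return the directory that contains the package or module,
        # matching what pkg_resources reports as the location
        origin_dir = os.path.dirname(spec.origin)
        is_package = spec.submodule_search_locations is not None
        return os.path.dirname(origin_dir) if is_package else origin_dir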

    # If all else fails, return None
    return None

...


class Package(ABC):
    """Abstract class for wrappers around objects that pip returns."""

    UNKNOWN_LICENSE_STR = "(Unknown license)"

    def __init__(self, project_name: str) -> None:
        self.project_name = project_name
        self.key = canonicalize_name(project_name)
        self.location = get_package_location(self.key)
        self.size = get_directory_size(f"{self.location}/{self.key}")

This sets self.size to the total size in bytes of the package. I don't know how you would approach this, and I'm not sure how best to render it. Another problem I would love to see solved: what is the total size of this package and all of its dependent children? I couldn't figure out how to calculate that.

kylepollina, Aug 18 '24 05:08
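
On the open question of the combined size of a package and all of its dependent children, one possible approach is to walk the dependency graph and count each package once, so that a dependency shared by several parents is not double counted. The sketch below is an assumption-laden illustration: total_size_with_dependencies, sizes, and requires are hypothetical names, and the two mappings stand in for whatever pipdeptree would provide (per-package sizes in bytes and each package's direct dependencies).

```python
from typing import Dict, Iterable, Set


def total_size_with_dependencies(
    root: str,
    sizes: Dict[str, int],
    requires: Dict[str, Iterable[str]],
) -> int:
    """Sum the size of `root` and every package reachable from it.

    Hypothetical helper: `sizes` maps package name to bytes and
    `requires` maps package name to its direct dependencies.
    """
    seen: Set[str] = set()
    stack = [root]
    total = 0
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already counted (shared or cyclic dependency)
        seen.add(name)
        total += sizes.get(name, 0)  # unknown sizes count as 0 here
        stack.extend(requires.get(name, ()))
    return total


# Example with made-up numbers: "shared" is needed by both "a" and "b".
example_sizes = {"app": 10, "a": 5, "b": 7, "shared": 100}
example_requires = {"app": ["a", "b"], "a": ["shared"], "b": ["shared"]}
example_total = total_size_with_dependencies("app", example_sizes, example_requires)
```

Treating an unknown size as 0 is a simplification; a real implementation might instead report that the total is a lower bound.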