pipdeptree
Add ability to list sizes of dependencies
Describe the feature
Searching for duplicate issues, I found that #108 was closed for lack of a specific use case.
I would like to use pipdeptree to help optimize the size of a pyinstaller executable. It's currently great for seeing what dependencies are present in a project, but it doesn't show the sizes of those dependencies.
Size, to me, is anything that helps me determine the "cost" of a dependency in my final executable. I would like to use it to answer questions like, "Which dependency should I try to remove first to get the size of my executable down?".
A potential metric for this might be wheel size. pip seems to know about this at install time:
...
Collecting virtualenv>=20.25 (from tox->-r .\requirements-dev.txt (line 6))
Using cached <some pypi>/virtualenv-20.26.3-py3-none-any.whl (5.7 MB)
...
This has been requested before and I too believe it would be useful.
One approach I see right now is we can:
- Try using the RECORD file in the package metadata to determine the package size (see here for details)
- If we fail to find it in the metadata directory (since it's optional), try to recurse the package's source directory and snatch the file sizes
- If we fail to find the package's source directory, we can't determine the package size so consider the size to be "Unknown"
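The three steps above could be sketched roughly as follows. This is only a sketch under assumptions, not how pipdeptree does or would implement it: it uses the standard library's importlib.metadata, all function names are hypothetical, and the fallback assumes the metadata contains a top_level.txt naming the source directories (which is itself optional).

```python
import os
from importlib.metadata import Distribution
from typing import Optional


def size_from_record(dist: Distribution) -> Optional[int]:
    """Sum the sizes listed in RECORD, or None if no sizes are recorded."""
    files = dist.files  # None when the metadata has no file listing
    if not files:
        return None
    # Some entries (e.g. RECORD itself) carry no size, so skip those.
    sizes = [f.size for f in files if f.size is not None]
    return sum(sizes) if sizes else None


def size_from_source_dirs(dist: Distribution) -> Optional[int]:
    """Fallback: walk the top-level packages named in top_level.txt."""
    top_level = dist.read_text("top_level.txt")  # optional file, may be None
    if not top_level:
        return None
    total, found = 0, False
    for name in top_level.split():
        path = str(dist.locate_file(name))
        if os.path.isdir(path):
            found = True
            for dirpath, _dirnames, filenames in os.walk(path):
                total += sum(
                    os.path.getsize(os.path.join(dirpath, f)) for f in filenames
                )
        elif os.path.isfile(path + ".py"):  # single-module distribution
            found = True
            total += os.path.getsize(path + ".py")
    return total if found else None


def package_size(dist: Distribution) -> Optional[int]:
    """Try RECORD first, then the source directories; None means unknown."""
    size = size_from_record(dist)
    if size is None:
        size = size_from_source_dirs(dist)
    return size


def render_size(size: Optional[int]) -> str:
    """Render a size for display, using "Unknown" when it is not available."""
    return "Unknown" if size is None else f"{size} bytes"
```

For an installed package this could be called as render_size(package_size(importlib.metadata.distribution("virtualenv"))). Note that the directory walk also counts files such as __pycache__ contents, so the two methods will not always agree.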
I was playing around and was able to grab the size for the package using the following approach
# in src/pipdeptree/_models/package.py
...
import importlib.util  # importing the submodule is required for find_spec
import os
from typing import Optional

import pkg_resources


def get_directory_size(directory: str) -> int:
    """Return the total size in bytes of all files under directory."""
    total_size = 0
    for dirpath, _dirnames, filenames in os.walk(directory):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size


def get_package_location(package_name: str) -> Optional[str]:
    try:
        # Try to get the package location using pkg_resources
        package = pkg_resources.get_distribution(package_name)
        return package.location
    except pkg_resources.DistributionNotFound:
        pass
    # Try to get the package location using importlib
    # NOTE: spec.origin is the path of the module file itself
    # (e.g. .../pkg/__init__.py), not the directory that contains the package
    spec = importlib.util.find_spec(package_name)
    if spec is not None:
        return spec.origin
    # If all else fails, return None
    return None
...
class Package(ABC):
    """Abstract class for wrappers around objects that pip returns."""

    UNKNOWN_LICENSE_STR = "(Unknown license)"

    def __init__(self, project_name: str) -> None:
        self.project_name = project_name
        self.key = canonicalize_name(project_name)
        self.location = get_package_location(self.key)
        # Size in bytes, or None when the location could not be determined
        self.size = (
            get_directory_size(os.path.join(self.location, self.key))
            if self.location is not None
            else None
        )
This sets self.size to the total size of the package in bytes. I don't know how you would approach this, and I'm not sure how best to render it. Another problem I would love to see solved: what is the total size of this package and all of its dependent children? I couldn't figure out how to calculate that.
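On the question of a cumulative size, one possible way to compute it is a graph traversal that counts each package once, so that a dependency shared by several children is not added twice. The sketch below is an assumption-laden illustration: total_size and its arguments are hypothetical names, the dependency graph and per-package sizes are passed in as plain dicts rather than taken from pipdeptree's own models, and the numbers in the example are made up.

```python
from typing import Dict, Iterable, Optional, Set


def total_size(
    root: str,
    children: Dict[str, Iterable[str]],
    sizes: Dict[str, Optional[int]],
) -> int:
    """Sum the size of root and every package reachable from it.

    Each package is counted once, even if several packages depend on it
    or the graph contains a cycle. Unknown sizes (None or missing) count as 0.
    """
    seen: Set[str] = set()
    stack = [root]
    total = 0
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        total += sizes.get(name) or 0
        stack.extend(children.get(name, ()))
    return total


if __name__ == "__main__":
    # Hypothetical graph and sizes, for illustration only.
    children = {"tox": ["virtualenv", "filelock"], "virtualenv": ["filelock"]}
    sizes = {"tox": 100, "virtualenv": 200, "filelock": 50}
    print(total_size("tox", children, sizes))  # 350: filelock is counted once
```

Whether a shared dependency should be attributed in full to every parent or split between them is a rendering decision. This sketch reports the full reachable size per root, so the totals of sibling packages can overlap.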