[Feature]: Show aggregate information about package dependencies

Open lostmsu opened this issue 1 year ago • 5 comments

Related Problem

The well-known problem of NPM package bloat.

The Elevator Pitch

When I am looking for packages, the download size of the package itself is not enough information to make an informed decision.

It would be nice to have the following information on the main package page:

  • total size of all dependencies
  • number of transitive dependencies

Additional Context and Details

No response

lostmsu avatar Jul 22 '24 17:07 lostmsu

I like this idea. One issue with calculating these metrics (total size on disk, number of nodes in the graph) is that transitive dependencies of a package vary per target framework.

Said another way, these numbers will be different if you're restoring for .NET Framework 4.7.2 vs. .NET 8.0. So we'd either need to pick one target framework to show this number for, or show a useful variety of them.

Generally, NuGet.org has avoided simulating package restore operations (i.e. exploring the transitive dependencies) and just looks at each package individually. We'd need some careful controls and handling of edge cases. For example, it's possible to have a package on NuGet.org whose transitive dependency is NOT on NuGet.org: either not yet, due to the timing of package publishing, or never, because of a publisher error or by design (it lives on some other feed).

For "total size of all dependencies", this could have several interpretations. Do you mean the cumulative size of the .nupkg per transitive dependency? Or the extracted size on disk after restore? Or perhaps just the package assets that end up in your bin directory?

joelverhagen avatar Jul 25 '24 19:07 joelverhagen

Either the size of the .nupkg as-is, or its extracted size.

lostmsu avatar Jul 25 '24 21:07 lostmsu

One problem I see is that the graph is not deterministic: it depends on the available transitive dependencies and the TFM version, and those are always in motion. Sometimes new versions are published and old ones are unlisted. Also, the same dependencies are shared across many direct dependencies, so it's hard to give an accurate number for size.

erdembayar avatar Jul 25 '24 21:07 erdembayar

@lostmsu Are npm's results deterministic? Do they change over time?

erdembayar avatar Jul 25 '24 21:07 erdembayar

I am not sure what the best approach for handling TFMs is, but I just want to remind everyone that perfect is the enemy of good: the numbers do not have to be exact. So I'd propose starting with:

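// Pkg, NuPkgFile, TFMs and Dependencies are placeholders, not real NuGet.org types.
long Size(Pkg pkg) => pkg.NuPkgFile.Size;

// Collects the transitive dependencies of pkg across all of its TFMs.
HashSet<Pkg> Dependencies(Pkg pkg, HashSet<Pkg>? seen = null) {
  seen ??= new HashSet<Pkg>();
  foreach (var tfm in pkg.TFMs) {
    foreach (Pkg dep in pkg.Dependencies[tfm])
      // Add returns false for a package that was already visited,
      // so shared dependencies are walked and counted only once.
      if (seen.Add(dep)) Dependencies(dep, seen);
  }
  return seen;
}

long TotalSize(Pkg pkg)
  => Size(pkg) + Dependencies(pkg).Select(Size).Sum();

long DepCount(Pkg pkg) => Dependencies(pkg).Count;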

Calculate this only once, when the package version is uploaded, and don't update it later.

UPD: one thing missing above is kicking older versions of a package out of the deps set when a newer version is requested.
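
The version handling mentioned in the update could look roughly like the sketch below. It is for illustration only: `Pkg`, `DependencyStats`, and the use of `System.Version` are made-up stand-ins, not NuGet.org types. Keeping the highest requested version per package id is a deliberate simplification, not NuGet's actual restore resolution rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model, not a real NuGet.org type: one package version with its
// .nupkg size and its direct dependencies grouped by target framework moniker.
public sealed record Pkg(
    string Id,
    Version Version,
    long NuPkgSize,
    IReadOnlyDictionary<string, IReadOnlyList<Pkg>> DependenciesByTfm);

public static class DependencyStats
{
    // Walks the dependency groups of every TFM and keeps only the highest
    // version seen for each package id (ids compared case-insensitively).
    // Known inaccuracy: dependencies reached only through an older version
    // that later gets replaced are still kept, so the result can over-count.
    public static IReadOnlyDictionary<string, Pkg> Dependencies(Pkg root)
    {
        var newest = new Dictionary<string, Pkg>(StringComparer.OrdinalIgnoreCase);
        var visited = new HashSet<(string Id, Version Version)>();
        Visit(root, newest, visited);
        return newest;
    }

    private static void Visit(
        Pkg pkg,
        Dictionary<string, Pkg> newest,
        HashSet<(string Id, Version Version)> visited)
    {
        foreach (var dep in pkg.DependenciesByTfm.Values.SelectMany(group => group))
        {
            // Skip package versions that were already walked.
            if (!visited.Add((dep.Id.ToLowerInvariant(), dep.Version))) continue;

            // Replace the stored entry when a newer version is requested.
            if (!newest.TryGetValue(dep.Id, out var current) || dep.Version > current.Version)
                newest[dep.Id] = dep;

            Visit(dep, newest, visited);
        }
    }

    // Size of the package itself plus one .nupkg per distinct dependency id.
    public static long TotalSize(Pkg root) =>
        root.NuPkgSize + Dependencies(root).Values.Sum(d => d.NuPkgSize);

    public static int DepCount(Pkg root) => Dependencies(root).Count;
}
```

With this shape, a package that pulls in the same dependency id at two versions through different TFMs contributes that dependency once, at the newer version's size.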

lostmsu avatar Jul 25 '24 21:07 lostmsu