Rewrite PNPM analyzer to use (generated) lockfiles
I'm writing this to document an idea I've already discussed with @fviernau.
I believe that a major bottleneck of (initial) analyzer runs on Node projects in general is the fact that we need to run (p)npm install, which downloads all binary artifacts. However, recent (P)NPM versions learned the --lockfile-only option, which at least saves the file I/O of writing the node_modules directory, and ideally also avoids the download of binary artifacts.
While many lockfiles have the problem that they do not provide information about the hierarchy of dependencies, the PNPM lockfile format version 9 does provide that information.
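To illustrate why the hierarchy is recoverable, here is a minimal Python sketch (not ORT code) that walks a dependency tree from an already-parsed lockfile. It assumes the version 9 layout as I understand it: an `importers` section listing each project's direct dependencies, and a `snapshots` section keyed by `name@version` that maps each package to its own dependencies. The `LOCKFILE` dict is a hand-written stand-in for a parsed `pnpm-lock.yaml`, and `resolve` / `project_tree` are hypothetical helper names. Workspace links (`link:` versions), dev and optional dependencies are ignored here.

```python
# Hypothetical, hand-written stand-in for a parsed pnpm-lock.yaml (lockfileVersion 9).
LOCKFILE = {
    "lockfileVersion": "9.0",
    "importers": {
        ".": {
            "dependencies": {
                "express": {"specifier": "^4.0.0", "version": "4.18.2"},
            },
        },
    },
    "snapshots": {
        "express@4.18.2": {"dependencies": {"accepts": "1.3.8"}},
        "accepts@1.3.8": {"dependencies": {"mime-types": "2.1.35"}},
        "mime-types@2.1.35": {},
    },
}


def resolve(lockfile, name, version, seen=frozenset()):
    """Recursively build the subtree for one package from the snapshots section."""
    key = f"{name}@{version}"
    if key in seen:
        # Guard against dependency cycles instead of recursing forever.
        return {"id": key, "cyclic": True, "dependencies": []}
    snapshot = lockfile.get("snapshots", {}).get(key) or {}
    children = [
        resolve(lockfile, child_name, child_version, seen | {key})
        for child_name, child_version in sorted((snapshot.get("dependencies") or {}).items())
    ]
    return {"id": key, "dependencies": children}


def project_tree(lockfile, importer="."):
    """Return the dependency trees of one importer's direct dependencies."""
    direct = lockfile["importers"][importer].get("dependencies") or {}
    return [resolve(lockfile, name, entry["version"]) for name, entry in sorted(direct.items())]


if __name__ == "__main__":
    import json
    print(json.dumps(project_tree(LOCKFILE), indent=2))
```
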
So the idea is to rewrite the PNPM analyzer to parse the lockfile instead of running pnpm list, which has several problems. If no lockfile is present, a temporary one would be created on-the-fly via pnpm install --lockfile-only, and processing would then continue as usual. The latter still needs to make the usual (p)npm info / view calls to enrich package metadata with information missing from the lockfile.
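The "use the existing lockfile, otherwise generate a temporary one" step could look roughly like the Python sketch below. This is only an illustration under assumptions: `ensure_lockfile` is a hypothetical helper, and copying just `package.json` into a scratch directory is a simplification (a real implementation would probably also need `.npmrc`, `pnpm-workspace.yaml` and similar files). The `run` parameter exists only so that the command runner can be swapped out for testing.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def ensure_lockfile(project_dir, run=subprocess.run):
    """Return the path to a pnpm lockfile, generating a temporary one if needed."""
    project_dir = Path(project_dir)
    existing = project_dir / "pnpm-lock.yaml"
    if existing.is_file():
        # A committed lockfile is used as-is.
        return existing

    # Assumption: work on a scratch copy so the project directory stays untouched.
    work_dir = Path(tempfile.mkdtemp(prefix="pnpm-lockfile-"))
    shutil.copy(project_dir / "package.json", work_dir / "package.json")

    # Resolve dependencies and write pnpm-lock.yaml without creating node_modules.
    run(["pnpm", "install", "--lockfile-only"], cwd=work_dir, check=True)
    return work_dir / "pnpm-lock.yaml"
```

The caller would be responsible for deleting the scratch directory once the lockfile has been parsed.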
I just discovered https://lfx.rushstack.io/, BTW, which looks like a useful tool to analyze / debug PNPM lockfile issues.
I think the pnpm info calls should also be parallelized, akin to what was done with npm here: https://github.com/oss-review-toolkit/ort/pull/10059
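As a rough illustration of the parallelization idea (in Python, and not necessarily how the linked PR does it), the metadata lookups could be fanned out over a thread pool, since each call mostly waits on a subprocess and the network. `pnpm_info` and `fetch_all` are hypothetical names, and passing `--json` to `pnpm info` is an assumption based on how `npm view` behaves.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pnpm_info(package_id):
    """Fetch registry metadata for one 'name@version' id (assumes --json is supported)."""
    result = subprocess.run(
        ["pnpm", "info", package_id, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def fetch_all(package_ids, fetch=pnpm_info, max_workers=8):
    """Look up metadata for all ids concurrently, querying each distinct id only once."""
    unique_ids = sorted(set(package_ids))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs ids with their metadata.
        return dict(zip(unique_ids, pool.map(fetch, unique_ids)))
```

Deduplicating the ids first matters because the same package version typically appears many times in a dependency tree.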