osv-scanner icon indicating copy to clipboard operation
osv-scanner copied to clipboard

fix: ensure that npm dependencies retain their "production" grouping

Open G-Rath opened this issue 1 year ago • 4 comments

This resolves an inconsistency in the scanner output for npm packages that appear in the same tree multiple times but in different groups; this happens because the table outputter deduplicates on groups on the assumption that a package can only appear in a single group which is incorrect for the NPM ecosystem.

To address this, I've introduced an internal map type that ensures groups are merged when a package is added, with the twist that if either instance of a package being merged is in no groups then that is the result of the merge because implicitly that means an instance of the package is in the "production" group which takes priority over the other groups.

This is something that should most likely be improved on the future, but right now this fix should be good enough to ship since afaik it doesn't impact any other ecosystem and groups are something of a PoC given we can't resolve that information richly enough across all ecosystems yet.

I think this technically could impact pnpm as well, but none of our fixtures gave a good indicator and the latest lockfile version doesn't have that information anymore so frankly I'm not worrying about it at this point.

Resolves #924

G-Rath avatar Apr 23 '24 19:04 G-Rath

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 65.99%. Comparing base (e35bd05) to head (c132a55).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #939      +/-   ##
==========================================
+ Coverage   65.94%   65.99%   +0.05%     
==========================================
  Files         159      159              
  Lines       12758    12778      +20     
==========================================
+ Hits         8413     8433      +20     
  Misses       3885     3885              
  Partials      460      460              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar May 01 '24 00:05 codecov-commenter

Looking a bit into this, there's another awkward edge case with optional prod dependencies & non-optional dev dependencies (devOptional) that we might want to consider.

dev, optional, devOptional: If the package is strictly part of the devDependencies tree, then dev will be true. If it is strictly part of the optionalDependencies tree, then optional will be set. If it is both a dev dependency and an optional dependency of a non-dev dependency, then devOptional will be set. (An optional dependency of a dev dependency will have both dev and optional set.)

https://docs.npmjs.com/cli/v9/configuring-npm/package-lock-json#packages

Our groups parser doesn't seem to get this correct at the moment: https://github.com/google/osv-scanner/blob/97029011c8adaf14f03fa66519836063204bf474/pkg/lockfile/parse-npm-lock.go#L125-L137

So I guess we ought to differentiate between "optional in prod, required in dev" (devOptional) and "not in prod, optional in dev" (dev && optional) somehow, and also apply that to the deduplication logic.

This whole thing kind of comes to some thoughts I've been having on counting vulnerabilities and how or whether we should be deduplicating vulnerabilities in an npm lockfile in the first place. I feel like it'd generally be clearer to have one entry per vulnerability ID, with a complete list of instances of affected packages. Maybe that's a thing to add to the V2 Wishlist.

michaelkedar avatar May 02 '24 00:05 michaelkedar

I don't think I disagree with that but frankly it makes my head spin a bit 😅

Personally I feel like it's probably safest to at least start by just treating groups as part of the uniqueness of a package - I'm pretty sure for most ecosystems that won't change anything because you can only have instance of a package regardless of group, and for ecosystems like NPM it'd mean you would just end up with some more packages in the output.

Then we can work on how to make the output smarter without worrying about the current version of the cli giving false negatives in the meantime

G-Rath avatar May 02 '24 01:05 G-Rath

note to self: consider exploring if an empty string could be useful here - i.e. if a package is not optional or dev, then its ""

(though thinking about it, I think that could just lead back to this same solution - I think either way, its probably correct more as it needs to be handled within the extractor rather than at the outputter level right now)

G-Rath avatar May 02 '24 03:05 G-Rath