osv-scanner icon indicating copy to clipboard operation
osv-scanner copied to clipboard

feat: replace `requirementsenhanceable` extractor with transitive enricher

Open G-Rath opened this issue 1 month ago • 11 comments

On the surface this was pretty much a drop-in replacement which is really exciting, but there are a couple of quirks that we may or may not want to deal with before landing this.

The first one is very obvious which is that our "found packages" count is very different - it's technically correct, but I think ideally it would be good if we could express that we found x more packages via the enricher.

The second one is more interesting and I suspect one we'll probably ignore for the time being, but right now scalibr implicitly adds extractors that are required by enrichers as part of doing the scan which is at odds with our (experimental) abilities to enable and disable plugins - for now I've made it so that the enricher is only enabled if the requirements extractor is enabled, since we don't have that many enrichers with plugin requirements right now.

Resolves #2289

G-Rath avatar Oct 22 '25 01:10 G-Rath

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 67.80%. Comparing base (12ad8d7) to head (c59dec1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2294      +/-   ##
==========================================
- Coverage   67.81%   67.80%   -0.01%     
==========================================
  Files         172      171       -1     
  Lines       13026    13005      -21     
==========================================
- Hits         8833     8818      -15     
+ Misses       3505     3502       -3     
+ Partials      688      685       -3     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov-commenter avatar Oct 22 '25 02:10 codecov-commenter

@cuixq fwiw it seems like the enricher might be flakier than the existing extractor? it doesn't seem to be too common and might be a third-party thing, but afaik we've never had any flakey issues with the existing extractor.

See this run for an example:

--- FAIL: TestCommand (0.00s)
    --- FAIL: TestCommand/requirements.txt_can_have_all_kinds_of_names (6.94s)
        command_test.go:331: 
            - Snapshot - 2
            + Received + 2
            
            @@ -6,8 +6,8 @@
            
              Scanned <rootdir>/testdata/locks-requirements/requirements.txt file and found 3 packages
              Scanned <rootdir>/testdata/locks-requirements/the_requirements_for_test.txt file and found 1 package
              Scanned <rootdir>/testdata/locks-requirements/unresolvable-requirements.txt file and found 3 packages
            - failed resolution: no file can be used for parsing requirements for package flask-cors version 1.0
            - failed to parse metadata for file Flask-Cors-1.0.tar.gz: sdist: dependencies in setup.py, not in PKG-INFO
            + failed resolution: resolution impossible:
            + no candidates at all for: pytz ">=2011k"
              Total 15 packages affected by 53 known vulnerabilities (2 Critical, 20 High, 28 Medium, 0 Low, 3 Unknown) from 1 ecosystem.
              53 vulnerabilities can be fixed.
              ↵
            
            at __snapshots__\command_test.snap:1349

G-Rath avatar Oct 23 '25 21:10 G-Rath

this error seems not like a flaky error: both resolution failure should be deterministic if the artifacts don't change - not sure if related to the artifacts from registry. and it only failed for windows :thinking:

maybe don't worry too much about it now - see if this is going to happen again.

cuixq avatar Oct 23 '25 22:10 cuixq

yeah at this point I think we should just keep an eye on it, but fwiw its not just a Windows thing.

G-Rath avatar Oct 23 '25 22:10 G-Rath

Huh reran this and it failed again immediately.

another-rex avatar Oct 29 '25 05:10 another-rex

@cuixq Is there some retry logic that's missing in the osv-scalibr impl?

another-rex avatar Oct 29 '25 05:10 another-rex

I missed the comments from last week - I will take a look later.

cuixq avatar Nov 02 '25 23:11 cuixq

I think this is probably due to the order of packages is not deterministic after grouping them into a map: https://github.com/google/osv-scalibr/blob/main/enricher/transitivedependency/requirements/requirements.go#L86

this should be easy to fix since we already record the index in the map values.

cuixq avatar Nov 12 '25 23:11 cuixq

however I think pip's backtracking dependency resolution may not be affected by the order of dependencies, but this is a thing to follow up with the resolver instead of scalibr.

For now I will make the change to keep the order of the packages as they are extracted and see if this helps with the flaky test results.

cuixq avatar Nov 12 '25 23:11 cuixq

@G-Rath sorry I merged in the wrong snapshot you may need to re-generate them and push again.

cuixq avatar Nov 12 '25 23:11 cuixq

https://github.com/google/osv-scalibr/pull/1532 is merged so I hope the flaky test should be fixed.

We probably still want to wait until https://github.com/google/osv-scanner/pull/2328 is in so we can update osv-scalibr further.

cuixq avatar Nov 13 '25 00:11 cuixq

@G-Rath could you help to merge the main branch add see if https://github.com/google/osv-scalibr/pull/1532 fixes the flaky test?

cuixq avatar Nov 16 '25 23:11 cuixq

Rerunning again just to make sure, otherwise good to merge.

another-rex avatar Nov 17 '25 01:11 another-rex