ScanCode.io: Analyze 1000 source and binary packages with d2d pipeline

Open pombredanne opened this issue 10 months ago • 2 comments

I would like to analyze roughly 1000 source and binary packages with the d2d pipeline to evaluate how it performs. Packages to consider could include xz-utils/liblzma, Apache htrace, and a slice of popular packages.

We should write a report on this to integrate into the documentation and publish as a blog post, including any CVEs or discrepancies detected in the process.

Running 1000 packages will need either:

  • to be scripted, based on a list of "to" and "from" inputs, calling the ScanCode.io (SCIO) CLI to run them all

  • or we first complete the integration in PurlDB, add all these packages there, and call the PurlDB API instead

    • https://github.com/nexB/purldb/issues/373
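The scripted option could look roughly like the sketch below. It only builds the command lines and prints them, so nothing is executed. The CSV column names (`name`, `from_url`, `to_url`) are hypothetical, and the pipeline name `map_deploy_to_develop` and the `scanpipe create-project` options are assumptions to verify against the installed ScanCode.io version.

```python
import csv
import io
import shlex

# Assumption: name of the d2d pipeline; check what your ScanCode.io version provides.
PIPELINE = "map_deploy_to_develop"


def build_command(project_name, from_url, to_url, pipeline=PIPELINE):
    """Return the argv list for one d2d project (built, not executed)."""
    return [
        "scanpipe", "create-project", project_name,
        # The "#from" / "#to" fragments tag each input as source or deployed code.
        "--input-url", f"{from_url}#from",
        "--input-url", f"{to_url}#to",
        "--pipeline", pipeline,
        "--execute",
    ]


def commands_from_csv(text):
    """Build one command per CSV row; columns are name, from_url, to_url (hypothetical)."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        build_command(row["name"], row["from_url"], row["to_url"])
        for row in reader
    ]


# A one-row sample list; a real run would read a file with ~1000 rows.
SAMPLE = (
    "name,from_url,to_url\n"
    "htrace-core-4.0.0-incubating,"
    "https://repo1.maven.org/maven2/org/apache/htrace/htrace-core/4.0.0-incubating/htrace-core-4.0.0-incubating-sources.jar,"
    "https://repo1.maven.org/maven2/org/apache/htrace/htrace-core/4.0.0-incubating/htrace-core-4.0.0-incubating.jar\n"
)

if __name__ == "__main__":
    for cmd in commands_from_csv(SAMPLE):
        # Print a shell-safe command line for review before running anything.
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review the list; each argv list could then be passed to `subprocess.run` once the list looks right.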

pombredanne, Apr 04 '24 08:04

Some ideas:

Debian Packages

  • TBD

Popular packages in GitHub

Some packages in the news

  • redis
  • log4j

JavaScript on npm:

  • a version of the sqlite3 driver (https://www.npmjs.com/package/sqlite3), covering both the ELF binaries and the JS code
  • some JS packages with minified code (TBD)

Uberjars:

  • htrace versions such as:
    • https://repo1.maven.org/maven2/org/apache/htrace/htrace-core/4.0.0-incubating/htrace-core-4.0.0-incubating-sources.jar#from
    • https://repo1.maven.org/maven2/org/apache/htrace/htrace-core/4.0.0-incubating/htrace-core-4.0.0-incubating.jar#to
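For Maven packages, the "from" and "to" URLs could be derived from the package coordinates instead of being listed by hand. The sketch below assumes the standard Maven Central layout seen in the two htrace URLs above and assumes a `-sources.jar` is published for the version; the helper names are hypothetical.

```python
BASE = "https://repo1.maven.org/maven2"


def maven_jar_url(group, artifact, version, classifier=None):
    """Build a Maven Central JAR URL from coordinates (standard repository layout)."""
    group_path = group.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    return f"{BASE}/{group_path}/{artifact}/{version}/{artifact}-{version}{suffix}.jar"


def d2d_inputs(group, artifact, version):
    """Return the (from, to) input URLs: sources JAR tagged #from, binary JAR tagged #to."""
    return (
        maven_jar_url(group, artifact, version, "sources") + "#from",
        maven_jar_url(group, artifact, version) + "#to",
    )


if __name__ == "__main__":
    for url in d2d_inputs("org.apache.htrace", "htrace-core", "4.0.0-incubating"):
        print(url)
```

This does not check that the JARs actually exist; a real run would need to skip versions without a published sources JAR.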

pombredanne, Apr 11 '24 15:04

I will start a spreadsheet with the list of packages.

pombredanne, Apr 25 '24 16:04