
DJC: Design an enhanced DejaCode Package model to identify source code relationships

Open · DennisClark opened this issue 1 year ago · 1 comment

The working idea here is to come up with the best way to identify cross-package relationships, especially to be able to get to (1) the source code and (2) more complete copyright+license data, which usually comes from the source code.

We could start by displaying the values of `contains_source_code`, `source_packages`, `code_view_url`, and `vcs_url` in the "Detected Package" section of the Scan tab whenever a value is available. The ScanCode package model already supports these source code relationships (the same fields are also in PurlDB):

  • the `contains_source_code` boolean flag tells whether the package itself contains source code: https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L452
    
  • the `source_packages` field is a list of Package URLs for related source packages that may exist for this package: https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L457
    
  • the `code_view_url` and `vcs_url` provide reference URLs to view or fetch actual source code https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L414
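
As a rough illustration of the "display when a value is available" idea, here is a minimal Python sketch. It assumes the detected package arrives as a plain dict keyed by the four field names above; the helper name `source_relationship_fields` and the sample values are hypothetical, not existing DejaCode or ScanCode code.

```python
# Field names taken from the ScanCode package model, in display order.
SOURCE_FIELDS = (
    "contains_source_code",
    "source_packages",
    "code_view_url",
    "vcs_url",
)


def source_relationship_fields(package_data):
    """Return only the source-related fields that carry a value.

    Hypothetical helper: `package_data` is assumed to be a dict such as
    the detected package data shown in the Scan tab.
    """
    available = {}
    for name in SOURCE_FIELDS:
        value = package_data.get(name)
        # Skip missing or empty values, but keep an explicit False
        # for the boolean flag since it is still meaningful.
        if value is None or value == "" or value == []:
            continue
        available[name] = value
    return available


# Made-up sample data for illustration only.
detected = {
    "purl": "pkg:pypi/example@1.0.0",
    "contains_source_code": False,
    "source_packages": ["pkg:pypi/example@1.0.0?extension=tar.gz"],
    "code_view_url": None,
    "vcs_url": "git+https://example.com/example.git",
}

for field, value in source_relationship_fields(detected).items():
    print(f"{field}: {value}")
```

Whether an explicit `False` for `contains_source_code` should be displayed or hidden is a design choice; the sketch keeps it because it tells the user that the source lives in another package.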
    

Now that we have standardized on PURL as the package identifier, we should be able to pursue this DejaCode improvement using package-set values via integration with the PurlDB.

DennisClark · Feb 05 '24 18:02

See https://docs.google.com/document/d/1fs9W27A0aT-RDnNs_I3HKvbA6JCaDwGa/edit?usp=sharing&ouid=117241222429542576816&rtpof=true&sd=true

DennisClark · May 09 '25 20:05