DJC: Design an enhanced DejaCode Package model to identify source code relationships
The goal is to find the best way to identify cross-package relationships, in particular to reach (1) the source code and (2) more complete copyright and license data, which usually comes from the source code.
We could start by displaying the values for `contains_source_code`, `source_packages`, `code_view_url`, and `vcs_url` in the "Detected Package" section of the Scan tab (when a value is available). The ScanCode package model supports these source code relationships (which are also in PurlDB):
- the `contains_source_code` boolean flag tells whether the package itself contains source code: https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L452
- the `source_packages` field is a list of Package URLs that may exist for this package: https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L457
- the `code_view_url` and `vcs_url` fields provide reference URLs to view or fetch the actual source code: https://github.com/nexB/scancode-toolkit/blob/0465269543eb338086c10bdeb1e81d3013522b4d/src/packagedcode/models.py#L414
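
As a rough illustration of the "display when a value is available" idea, here is a minimal Python sketch. It assumes the detected package arrives as a plain dict keyed by the four ScanCode field names listed above. The helper name `source_code_rows`, the display labels, and the sample data are hypothetical and not part of DejaCode or ScanCode.

```python
from typing import Any

# Field names come from the ScanCode package model referenced above.
# The labels and this helper are illustrative assumptions, not DejaCode code.
SOURCE_FIELDS = (
    ("contains_source_code", "Contains source code"),
    ("source_packages", "Source packages"),
    ("code_view_url", "Code view URL"),
    ("vcs_url", "VCS URL"),
)


def source_code_rows(package_data: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (label, value) rows for the source-related fields that have a value."""
    rows = []
    for field, label in SOURCE_FIELDS:
        value = package_data.get(field)
        # Skip missing or empty values, but keep an explicit False:
        # "does not contain source code" is still useful information.
        if value is None or (not isinstance(value, bool) and not value):
            continue
        if isinstance(value, bool):
            value = "Yes" if value else "No"
        elif isinstance(value, (list, tuple)):
            value = ", ".join(str(item) for item in value)
        rows.append((label, str(value)))
    return rows


# Made-up sample resembling the "Detected Package" data of a scan.
detected_package = {
    "purl": "pkg:deb/debian/zlib1g@1.2.13",
    "contains_source_code": False,
    "source_packages": ["pkg:deb/debian/zlib@1.2.13?arch=source"],
    "code_view_url": None,
    "vcs_url": "",
}

for label, value in source_code_rows(detected_package):
    print(f"{label}: {value}")
```

Keeping an explicit `False` for `contains_source_code` while hiding empty URLs and lists is a design choice in this sketch; the actual Scan tab may prefer to hide it as well.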
Now that we have standardized on PURL as the package identifier, we should be able to pursue this DejaCode improvement using package-set values via integration with the PurlDB.
See https://docs.google.com/document/d/1fs9W27A0aT-RDnNs_I3HKvbA6JCaDwGa/edit?usp=sharing&ouid=117241222429542576816&rtpof=true&sd=true