[FEATURE REQUEST] Ingest Github dependency information using GraphQL API

Open heryxpc opened this issue 1 year ago • 0 comments

Title: Ingest dependency information as Dependency nodes using the GitHub GraphQL API

Description: Currently, custom code captures PythonLibrary dependencies (https://github.com/lyft/cartography/blob/master/cartography/intel/github/repos.py#L523). It is not generic: it only supports Python setup.cfg and requirements.txt files and does not cover dependencies from other ecosystems.

If the dependency graph is enabled (see https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/configuring-the-dependency-graph), GitHub automatically parses the project's dependency manifest and lock files to generate a comprehensive graph.

This information is available via GitHub's GraphQL API, for example:

query {
  repository(owner: "lyft", name: "cartography") {
    dependencyGraphManifests(first: 10) {
      nodes {
        dependencies(first: 10) {
          nodes {
            packageName
            requirements
          }
        }
      }
    }
  }
}
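
To make the shape of the response concrete, here is a minimal Python sketch of sending that query and flattening the result into one row per dependency. It is an illustration under assumptions, not cartography code: `fetch_dependency_manifests` and `flatten_dependencies` are hypothetical helper names, the `filename` field is added to the query so each row can record which manifest it came from, and the preview `Accept` header is an assumption that should be checked against the current GitHub docs. Pagination and error handling are left out.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Same query as above, parameterised, plus `filename` on each manifest.
DEPENDENCY_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    dependencyGraphManifests(first: 10) {
      nodes {
        filename
        dependencies(first: 10) {
          nodes {
            packageName
            requirements
          }
        }
      }
    }
  }
}
"""


def fetch_dependency_manifests(token, owner, name):
    """POST the query to the GraphQL endpoint and return the decoded JSON body."""
    payload = json.dumps(
        {"query": DEPENDENCY_QUERY, "variables": {"owner": owner, "name": name}}
    ).encode("utf-8")
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Assumption: the dependency graph fields may still require this
            # preview media type; verify against the GitHub documentation.
            "Accept": "application/vnd.github.hawkgirl-preview+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def flatten_dependencies(response):
    """Turn the nested GraphQL response into a flat list of dependency rows."""
    repo = (response.get("data") or {}).get("repository") or {}
    manifests = (repo.get("dependencyGraphManifests") or {}).get("nodes") or []
    rows = []
    for manifest in manifests:
        deps = (manifest.get("dependencies") or {}).get("nodes") or []
        for dep in deps:
            rows.append(
                {
                    "manifest": manifest.get("filename"),
                    "name": dep["packageName"],
                    "requirements": dep.get("requirements"),
                }
            )
    return rows
```

Keeping the flattening step separate from the HTTP call means it can be exercised against a canned response without a token or network access.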

The idea is to build a generic module that ingests a node named Dependency during GitHub ingestion, surfacing at least direct dependencies. This would support software composition analysis and help identify supply chain risks in any ecosystem GitHub supports.
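
As a rough sketch of what the ingestion step could look like, the Python below builds a Cypher statement and its parameters for attaching Dependency nodes to a repository. Everything beyond the `Dependency` label is an assumption made for illustration: the helper name `build_dependency_ingest`, the input row shape (dicts with `name`, `requirements` and `manifest` keys), the `REQUIRES` relationship, matching `GitHubRepository` on `id`, and using the lower-cased package name as the node id. A real module would probably want the ecosystem in that id to avoid cross-ecosystem name collisions.

```python
# Hypothetical Cypher for the proposed Dependency node; the labels, property
# names and relationship type are assumptions, not an agreed schema.
INGEST_QUERY = """
UNWIND $dependencies AS dep
MERGE (d:Dependency {id: dep.id})
ON CREATE SET d.firstseen = timestamp()
SET d.name = dep.name, d.lastupdated = $update_tag
WITH d, dep
MATCH (r:GitHubRepository {id: $repo_url})
MERGE (r)-[rel:REQUIRES]->(d)
SET rel.requirements = dep.requirements,
    rel.manifest = dep.manifest,
    rel.lastupdated = $update_tag
"""


def build_dependency_ingest(repo_url, rows, update_tag):
    """Return (query, params) ready to hand to a Neo4j session's run()."""
    dependencies = []
    seen = set()
    for row in rows:
        dep_id = row["name"].lower()
        key = (dep_id, row.get("manifest"))
        if key in seen:
            # Skip exact repeats of the same package within one manifest.
            continue
        seen.add(key)
        dependencies.append(
            {
                "id": dep_id,
                "name": row["name"],
                "requirements": row.get("requirements"),
                "manifest": row.get("manifest"),
            }
        )
    params = {
        "repo_url": repo_url,
        "update_tag": update_tag,
        "dependencies": dependencies,
    }
    return INGEST_QUERY, params
```

Returning the query and parameters rather than executing them keeps the sketch free of a database dependency; with the official Neo4j Python driver the pair could be passed to `session.run(query, params)`.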

[optional Relevant Links:]
https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/about-supply-chain-security
https://docs.github.com/en/code-security/supply-chain-security/understanding-your-software-supply-chain/about-the-dependency-graph

heryxpc, Aug 28 '24 12:08