
Add a "libraries" or "dependencies" field to the machine-learning-model component

Open bardenstein opened this issue 2 years ago • 2 comments

Proposal: add a list of the libraries/dependencies that a given ML model relies on. This captures one of the most important pieces of information about a model: which primary ML library it relies on (and where to look for the source code for that model).

Name: "libraries" or "dependencies"
Type: Array. Each entry would contain, at a minimum, a "name" field (e.g. "name": "PyTorch").
Required: True
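To make the proposed shape concrete, here is a minimal Python sketch. The `libraries` key is the hypothetical field proposed in this issue, not part of the published CycloneDX schema. The `validate_libraries` helper and the library names are illustrative assumptions only.

```python
import json

# Hypothetical: "libraries" is the field proposed in this issue; it does not
# exist in the published CycloneDX schema. Library names are illustrative.
model_component = {
    "type": "machine-learning-model",
    "name": "falcon-7b-instruct",
    "libraries": [
        {"name": "PyTorch"},
        {"name": "Transformers"},
    ],
}


def validate_libraries(component):
    """Check the proposed rule: the field is present, is an array,
    and every entry carries a non-empty string "name"."""
    libraries = component.get("libraries")
    if not isinstance(libraries, list):
        return False
    return all(
        isinstance(lib, dict)
        and isinstance(lib.get("name"), str)
        and bool(lib["name"])
        for lib in libraries
    )


if __name__ == "__main__":
    print(json.dumps(model_component, indent=2))
    print(validate_libraries(model_component))
```

Only the name is checked, matching the proposal's stated minimum. License and source URL are deliberately left out.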

Justification: Consider the following examples on HuggingFace (see the tags at the top of each page). Knowing whether a model depends on PyTorch, ONNX, Transformers, or Diffusers is very important for transparency. I don't expect an AI BOM generator to fetch additional information about that library (such as license, source URL, etc.), but the name alone is important enough to record, according to the ML experts and developers I consulted in my research.

https://huggingface.co/nateraw/vit-base-beans
https://huggingface.co/tiiuae/falcon-7b-instruct

bardenstein, Aug 25 '23 17:08

Since a model is just another type of component, you can already specify that a model includes other components or that a model is dependent on other components.

I'm not seeing any gaps in the examples provided with PyTorch, Transformers, etc.
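As a rough illustration of that point, the sketch below builds a small CycloneDX-style BOM as a Python dict. It lists the model and its libraries as separate components and links them through the top-level `dependencies` array (`ref` / `dependsOn`). The `bom-ref` values, the choice of `library` as the component type, and the helper names `build_bom` and `libraries_of` are assumptions made for this example.

```python
import json


def build_bom():
    """Build a minimal BOM where a model depends on two library components."""
    # bom-ref values are arbitrary identifiers chosen for this example.
    model = {
        "type": "machine-learning-model",
        "bom-ref": "model-vit-base-beans",
        "name": "vit-base-beans",
    }
    libs = [
        {"type": "library", "bom-ref": "lib-pytorch", "name": "PyTorch"},
        {"type": "library", "bom-ref": "lib-transformers", "name": "Transformers"},
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [model, *libs],
        # The dependency graph links the model to the libraries it relies on.
        "dependencies": [
            {"ref": model["bom-ref"], "dependsOn": [lib["bom-ref"] for lib in libs]}
        ],
    }


def libraries_of(bom, model_ref):
    """Return the names of the components that model_ref directly depends on."""
    by_ref = {c["bom-ref"]: c for c in bom["components"]}
    names = []
    for dep in bom["dependencies"]:
        if dep["ref"] == model_ref:
            names.extend(by_ref[r]["name"] for r in dep.get("dependsOn", []))
    return names


if __name__ == "__main__":
    bom = build_bom()
    print(json.dumps(bom, indent=2))
    print(libraries_of(bom, "model-vit-base-beans"))
```

A consumer that only needs the library names can read them from the dependency graph, as `libraries_of` does, without a dedicated field on the model component.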

stevespringett, Aug 25 '23 21:08

Can you show a concrete example of something that CycloneDX cannot represent today?

stevespringett, Aug 25 '23 21:08