[FEATURE]: Proposed changes for MLBOM schema for CycloneDX 2.0

Open mrutkows opened this issue 3 months ago • 2 comments

Describe the feature

This issue will capture proposed changes and action items raised as part of the MLBOM work group towards improving the ML schema for CycloneDX 2.0

  • fields in modelParameters should be moved to top-level modelCard schema
  • explore making releaseNotes plural to account for the same (identical) component being released simultaneously to different platforms/repositories (e.g., models released to HF, Ollama, etc.)
    • We agree "best practices" should be written on when/how to use multiple release notes for this use case and how to assure the component identity is the same...
  • architectureFamily as a simple string may not be enough, as hybrid architectures are now more common and are diverging at an increasing rate (we cannot just assume transformers, and must also take into account models that include multiple models, such as a smolDocling or tableFormer model)...
    • Consider whether this field still has value IF modelArchitecture is redesigned to allow a more precise description of the architectural "layers"
    • Additionally, consider things like dense, sparse, moe, etc.
  • Redesign modelArchitecture to allow a description of the layers that compose the model. -> TODO @mrutkows
  • modelParameters should reflect # of "learned" parameters
    • Note: EU CRA says each parameter needs to be described??? Need more info on this: for a BOM, describing every parameter would be unsupportable, as well as impossible to derive from any scanning tool
  • task needs to be plural i.e., tasks
    • TODO: does this need to be a string? an enum? more complex?
  • NEW: Add trainingConsiderations to describe the training processes
    • TODO: more discussion/design needed
  • TODO: Discuss reworking inputs and outputs (both strings), whose meaning is not clear... assume from the description that these are really (chat) template parameters (which may vary by template, as models can have multiple templates)
  • TODO: Need to add a description of required (or fixed) hyperparameters (e.g., params.json)
    • Many models require certain params and will NOT work properly (invalid results) if not set properly (e.g., image model clip rects, guardrails models need temperature set to zero, etc.)
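
To make the last TODO concrete, here is a minimal Python sketch of how a tool might check a runtime configuration against hyperparameters that a model card declares as required. Everything in it is hypothetical: the `RequiredHyperparameter` record, its field names, and `find_violations` are illustrative only and are not part of any CycloneDX schema; the temperature-zero guardrail case is taken from the bullet above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RequiredHyperparameter:
    """Hypothetical record for one required (fixed) hyperparameter.

    Not a CycloneDX type; the field names are assumptions for illustration.
    """
    name: str
    value: Any
    reason: str = ""


def find_violations(required, runtime_config):
    """Return (name, expected, actual) for each required hyperparameter
    that is missing from runtime_config or set to a different value."""
    violations = []
    for req in required:
        actual = runtime_config.get(req.name)
        if req.name not in runtime_config or actual != req.value:
            violations.append((req.name, req.value, actual))
    return violations


# Example from the discussion: guardrail models need temperature set to zero.
guardrail_requirements = [
    RequiredHyperparameter(
        name="temperature",
        value=0.0,
        reason="guardrail models need temperature set to zero",
    ),
]

if __name__ == "__main__":
    print(find_violations(guardrail_requirements, {"temperature": 0.7}))
    print(find_violations(guardrail_requirements, {"temperature": 0.0}))
```

Whether such requirements belong in the model card itself or in a referenced file (e.g., params.json) is still an open design question in this issue.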

Additional considerations

  • Extend modelCards to allow for similar, new concepts for "system cards" (system-level usage of a model) and "agent cards" (models used in agentic instances), which are being adopted by model providers.

mrutkows, Oct 01 '25 14:10

Check out BBQ and an example of "ethical" bias and accuracy measures: https://build.nvidia.com/nvidia/nvidia-nemotron-nano-9b-v2/modelcard

and specifically BBQ project here: https://github.com/nyu-mll/BBQ

mrutkows, Oct 01 '25 15:10

Recap of Meeting on Oct 1, 2025

Overview:

  • Discussed parameters: We do not list all parameters for a model. Instead, we ask Model PICs to list the total number of parameters.
  • Units are:
    • Million - m
    • Billion - B
    • Trillion - T
  • Input/Output: Describes tensor shapes
  • Hyperparameters - Temperature

Bias:

  • Ethical Considerations - specific to bias
  • Bias Metric - Do we have consensus around specific evals? For LLMs, we use BBQ (developed at NYU; see the nyu-mll/BBQ repo) and we point to BBQ here. Example: see the bias subcard of the nvidia-nemotron-nano-9b-v2 model card: https://build.nvidia.com/nvidia/nvidia-nemotron-nano-9b-v2/modelcard
  • Greatest difference in performance: Here, we are referring to accuracy
  • Matt mentioned performance tradeoffs, but noted that this category seems unclear
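
One possible reading of "greatest difference in performance", sketched in Python under stated assumptions: take per-group accuracy and report the gap between the best and worst group. This is only an interpretation of the note above, not BBQ's own scoring method and not an agreed definition; the function name and the group labels and values in the usage example are placeholders, not measured results.

```python
def greatest_accuracy_gap(accuracy_by_group: dict) -> tuple:
    """Return (best_group, worst_group, gap), where gap is the difference
    between the highest and lowest per-group accuracy."""
    if len(accuracy_by_group) < 2:
        raise ValueError("need accuracy for at least two groups")
    best = max(accuracy_by_group, key=accuracy_by_group.get)
    worst = min(accuracy_by_group, key=accuracy_by_group.get)
    return best, worst, accuracy_by_group[best] - accuracy_by_group[worst]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {"group_a": 0.90, "group_b": 0.75, "group_c": 0.85}
    print(greatest_accuracy_gap(example))
```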

Explainability:

  • Most of the responses here are arrays of strings
  • Known Risks, Intended Users, and Technical Limitations may not yet be covered.

Next Steps:

  • Discuss Privacy and Safety subcards
  • Action Items (to discuss next meeting):
    • Matt to take a look at Train/Test/Eval section
    • Michael to help develop taxonomy recommendations

Links:

  • CycloneDX components: https://cyclonedx.org/docs/1.6/json/#components_items_type
  • BBQ repo (Repository for the Bias Benchmark for QA dataset): https://github.com/nyu-mll/BBQ

DSYared, Oct 01 '25 17:10