cyclonedx-cli icon indicating copy to clipboard operation
cyclonedx-cli copied to clipboard

Bug in merging json SBOMs with empty component lists

Open Taha-cmd opened this issue 1 year ago • 3 comments

When merging multiple SBOMs and specifying the --name and --version arguments, then the top level components of the SBOMs must be added to the components list of the new merged SBOM. However, if the input SBOMs are missing the components property, then the top level components of the input SBOMs will not be added to the list of components of the merged SBOM.

Reproduction: Consider the following 3 minimal SBOMs

bom1.json

{
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": "urn:uuid:957c05a9-5f45-4fhm-aaa1-49df4c08c61a",
    "version": 1,
    "metadata": {
      "timestamp": "2024-04-18T11:24:03Z",
      "component": {
        "type": "container",
        "bom-ref": "container1",
        "name": "container1",
        "version": "1",
        "purl": "container1@1"
      }
    },
    "dependencies": [
      {
        "dependsOn": [],
        "ref": "container1@1"
      }
    ]
  }

bom2.json

{
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": "urn:uuid:957c05a9-5f39-4fca-aaa1-49df4c08c61a",
    "version": 1,
    "metadata": {
      "timestamp": "2024-04-18T11:24:03Z",
      "component": {
        "type": "container",
        "bom-ref": "container2",
        "name": "container2",
        "version": "2",
        "purl": "container2@2"
      }
    },
    "dependencies": [
      {
        "dependsOn": [],
        "ref": "container2@2"
      }
    ]
  }

bom3.json

{
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": "urn:uuid:957c05a9-5f39-4fca-69a1-69df4c08c61a",
    "version": 1,
    "metadata": {
      "timestamp": "2024-04-18T11:24:03Z",
      "component": {
        "type": "container",
        "bom-ref": "container3",
        "name": "container3",
        "version": "3",
        "purl": "container3@3"
      }
    },
    "dependencies": [
      {
        "dependsOn": [],
        "ref": "container3@3"
      }
    ]
  }

merged.json

Now let's merge the 3 input SBOMs: cyclonedx-win-x64.exe merge --input-files "bom1.json" "bom2.json" "bom3.json" --output-file "merged.json" --name "merged" --version "merged"

Result:

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "serialNumber": "urn:uuid:adb22d1b-748e-4731-93ce-9d360816a3c9",
  "version": 1,
  "metadata": {
    "component": {
      "type": "application",
      "name": "merged",
      "version": "merged"
    }
  },
  "dependencies": [
    {
      "ref": "container1@1",
      "dependsOn": []
    },
    {
      "ref": "container2@2",
      "dependsOn": []
    },
    {
      "ref": "container3@3",
      "dependsOn": []
    }
  ]
}

As you can see in the result, the components property is missing and the top level components of the input SBOMs are lost. Interestingly, components would be added only after first input SBOM that contains a components property is merged. If bom2.json contains the components property, then the result would be:

bom2.json

{
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": "urn:uuid:957c05a9-5f39-4fca-aaa1-49df4c08c61a",
    "version": 1,
    "metadata": {
      "timestamp": "2024-04-18T11:24:03Z",
      "component": {
        "type": "container",
        "bom-ref": "container2",
        "name": "container2",
        "version": "2",
        "purl": "container2@2"
      }
    },
    "dependencies": [
      {
        "dependsOn": [],
        "ref": "container2@2"
      }
    ],
    "components": [] # The components property is now present
  }

merged.json

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "serialNumber": "urn:uuid:fa30d92e-4d22-4c38-9337-7f2ec7ea0e0a",
  "version": 1,
  "metadata": {
    "component": {
      "type": "application",
      "name": "merged",
      "version": "merged"
    }
  },
  "components": [
    {
      "type": "container",
      "bom-ref": "container2",
      "name": "container2",
      "version": "2",
      "purl": "container2@2"
    },
    {
      "type": "container",
      "bom-ref": "container3",
      "name": "container3",
      "version": "3",
      "purl": "container3@3"
    }
  ],
  "dependencies": [
    {
      "ref": "container1@1",
      "dependsOn": []
    },
    {
      "ref": "container2@2",
      "dependsOn": []
    },
    {
      "ref": "container3@3",
      "dependsOn": []
    }
  ]
}

Taha-cmd avatar Apr 18 '24 12:04 Taha-cmd

Also, it worth mentioning that that the components property is being lost during the conversion from XML to JSON, which is probably another bug.

Reproduction:

bom1.json

{
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": "urn:uuid:957c05a9-5f45-4fhm-aaa1-49df4c08c61a",
    "version": 1,
    "metadata": {
      "timestamp": "2024-04-18T11:24:03Z",
      "component": {
        "type": "container",
        "bom-ref": "container1",
        "name": "container1",
        "version": "1",
        "purl": "container1@1"
      }
    },
    "dependencies": [
      {
        "dependsOn": [],
        "ref": "container1@1"
      }
    ]
  },
  "components": [] # 'components' property is present in the original SBOM

bom1.xml

convert json to xml with cyclonedx-win-x64 convert --input-file "bom1.json" --output-file "bom1.xml"

<?xml version="1.0" encoding="utf-8"?>
<bom xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" serialNumber="urn:uuid:957c05a9-5f45-4fhm-aaa1-49df4c08c61a" version="1" xmlns="http://cyclonedx.org/schema/bom/1.5">
  <metadata>
    <timestamp>2024-04-18T11:24:03Z</timestamp>
    <component type="container" bom-ref="container1">
      <name>container1</name>
      <version>1</version>
      <purl>container1@1</purl>
    </component>
  </metadata>
  <components />
  <dependencies>
    <dependency ref="container1@1" />
  </dependencies>
</bom>

bom1.json

Now convert the xml SBOM back to json convert json to xml with cyclonedx-win-x64 convert --input-file "bom1.xml" --output-file "bom1.json"

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "serialNumber": "urn:uuid:957c05a9-5f45-4fhm-aaa1-49df4c08c61a",
  "version": 1,
  "metadata": {
    "timestamp": "2024-04-18T11:24:03Z",
    "component": {
      "type": "container",
      "bom-ref": "container1",
      "name": "container1",
      "version": "1",
      "licenses": [],
      "purl": "container1@1"
    },
    "licenses": [],
    "lifecycles": []
  },
  "dependencies": [
    {
      "ref": "container1@1"
    }
  ],
  "vulnerabilities": [],
  "annotations": [],
  "properties": [],
  "formulation": []
# The components property is missing here
}

Taha-cmd avatar Apr 18 '24 12:04 Taha-cmd

As you can see in the result, the components property is missing and the top level components of the input SBOMs are lost. Interestingly, components would be added only after first input SBOM that contains a components property is merged. If bom2.json contains the components property, then the result would be:

This behavior is caused by the check result.Components != null here: https://github.com/CycloneDX/cyclonedx-dotnet-library/blob/57972c202d267366954599a948445196cedd0dda/src/CycloneDX.Utils/Merge.cs#L84-L88

In general, the question would be what exactly the semantics of the flat merge need to be, compare also https://github.com/CycloneDX/specification/discussions/320 In particular, what needs to happen with the components in the metadata of the BOMs that are merged and what should be the component in the metadata of the merged BOM?

andreas-hilti avatar Jun 06 '24 20:06 andreas-hilti

In particular, what needs to happen with the components in the metadata of the BOMs that are merged and what should be the component in the metadata of the merged BOM?

I believe that the current design of the flat merge command when specifying the --name and --version options is fine. However, when skipping these options, the command behaves completely differently and has different semantics. Making the top level component of the first input argument as the top level component of the merged SBOM seems like an arbitrary decision and has the semantics of an Add Sub-SBOM functionality. In its current implementation, the command is just trying to do too much.

Taha-cmd avatar Jul 15 '24 13:07 Taha-cmd