Generated purl for OCI images includes namespace, which is not allowed by purl spec

Open logicflakes opened this issue 6 months ago • 8 comments

When generating an SBOM using cdxgen with -t docker, the resulting BOM includes a purl of type oci that incorrectly uses the repository URL as the namespace, which is explicitly disallowed by the current purl specification.

Steps to Reproduce:

  1. Run:
cdxgen -t docker -o bom.json registry.relizahub.com/library/rearm-cli@sha256:696a2e4d457df5be966a4570d9695905b3d0afcf69d7728f0746d836504c4fce
  2. Observe in bom.json:
"purl":"pkg:oci/registry.relizahub.com/library/rearm-cli@sha256:696a2e4d457df5be966a4570d9695905b3d0afcf69d7728f0746d836504c4fce"

Problem:

According to the purl spec for OCI:

OCI purls do not contain a namespace, although, repository_url may contain a namespace as part of the physical location of the package.

This means:

  • The namespace should be omitted from the purl.

  • Information like registry.relizahub.com/library or ghcr.io/org/ should instead go into a repository_url field.

Including the namespace violates the spec and may cause issues with tooling that parses purls strictly. There is also an open issue about this on purl-spec: "OCI PURL type should allow namespace declaration" (#425).
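
To illustrate what a strict check might look like, here is a minimal Python sketch that flags an oci purl carrying a namespace. The function name oci_purl_has_namespace is hypothetical and not part of cdxgen or any purl library; the splitting is simplified and assumes qualifiers and subpath are percent-encoded where needed.

```python
def oci_purl_has_namespace(purl: str) -> bool:
    """Return True if an oci purl has a namespace segment before the name."""
    prefix = "pkg:oci/"
    if not purl.startswith(prefix):
        raise ValueError("not an oci purl")
    remainder = purl[len(prefix):]
    # Drop subpath, then qualifiers, then version, leaving [namespace/]name.
    remainder = remainder.rsplit("#", 1)[0] if "#" in remainder else remainder
    remainder = remainder.rsplit("?", 1)[0] if "?" in remainder else remainder
    path = remainder.rsplit("@", 1)[0]
    # Any "/" left in the path means a namespace precedes the name.
    return "/" in path.strip("/")


digest = "sha256:696a2e4d457df5be966a4570d9695905b3d0afcf69d7728f0746d836504c4fce"
generated = f"pkg:oci/registry.relizahub.com/library/rearm-cli@{digest}"
suggested = f"pkg:oci/rearm-cli@{digest}?repository_url=registry.relizahub.com/library"
```

With these inputs, the purl currently generated by cdxgen is flagged and the suggested form is not.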

Suggested Fix:

  • Remove the namespace segment from OCI purls.

  • Move repository or registry details to a repository_url qualifier (e.g., pkg:oci/rearm-cli@sha256:696a2e4d457df5be966a4570d9695905b3d0afcf69d7728f0746d836504c4fce?repository_url=registry.relizahub.com/library).
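
The suggested fix can be sketched in Python as follows. This is an assumption about how the conversion might work, not cdxgen's actual code (cdxgen is written in JavaScript); oci_purl_from_reference is a hypothetical helper. It only handles digest-pinned references and leaves the digest and the repository_url value exactly as written in the example above, without any percent-encoding.

```python
def oci_purl_from_reference(reference: str) -> str:
    """Build a namespace-free oci purl from 'registry/path/name@digest'."""
    location, _, digest = reference.partition("@")
    if not digest:
        raise ValueError("expected a digest-pinned image reference")
    # Everything before the last "/" is treated as the registry location;
    # the final segment is the image name.
    repository, _, name = location.rpartition("/")
    purl = f"pkg:oci/{name}@{digest}"
    if repository:
        # Assumption: the location goes into repository_url, as the issue suggests.
        purl += f"?repository_url={repository}"
    return purl


reference = (
    "registry.relizahub.com/library/rearm-cli"
    "@sha256:696a2e4d457df5be966a4570d9695905b3d0afcf69d7728f0746d836504c4fce"
)
purl = oci_purl_from_reference(reference)
```

For the reference used in the reproduction steps, this yields the same purl as the example in the bullet above.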

Version Info:

cdxgen version: 11.2.6

logicflakes avatar May 14 '25 10:05 logicflakes

Thank you for this report. We need some time to think this through since it's a breaking change.

prabhu avatar May 14 '25 11:05 prabhu

@setchy any thoughts on this issue?

prabhu avatar May 29 '25 13:05 prabhu

Thanks for reporting @logicflakes - looks like we need to update our OCI parsing logic to be compliant with the purl spec. I'm curious how downstream platforms like Dependency Track would display the repository details.

setchy avatar Jun 05 '25 12:06 setchy

This is another purl weirdness. I am now facing a situation where the repository_url is not the same as the namespace. OCI images can be published to multiple registries from the same repository.

An ideal spec should have no opinion about the namespace or the name attribute.

prabhu avatar Jul 22 '25 03:07 prabhu

This is another purl weirdness. I am now facing a situation where the repository_url is not the same as the namespace. OCI images can be published to multiple registries from the same repository.

Could you give an example of this? My understanding was that repository_url essentially means registry URL, so images pushed to multiple registries should have distinct Purls.

taleodor avatar Jul 22 '25 12:07 taleodor

Repository could be on github.com or codeberg.org, and registry could be on quay.io or even ghcr.io.

prabhu avatar Jul 22 '25 12:07 prabhu

I believe repository_url here does not refer to a source code repository.

From the current purl spec definition (https://github.com/package-url/purl-spec/blob/main/PURL-TYPES.rst):

repository_url: A repository URL where the artifact may be found, but not intended as the only location. This value is encouraged to identify a location the content may be fetched.

So this is a registry URL, and currently the suggestion is to arbitrarily pick one if the image can be found in different locations.

Personally, I don't find this ideal, but the definition is clear enough for me to implement against.

So in your example, we should pick either quay.io or ghcr.io.
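
As a rough Python illustration of that point: the same image published to two registries would get two purl strings that differ only in the repository_url qualifier. The image name, organization, and digest below are hypothetical, and purl_identity is an illustrative helper, not part of any purl library.

```python
def purl_identity(purl: str) -> str:
    """Strip subpath and qualifiers, leaving type, name and version."""
    return purl.split("#", 1)[0].split("?", 1)[0]


# Hypothetical image name, organization and digest, for illustration only.
digest = "sha256:" + "0" * 64
on_quay = f"pkg:oci/example-app@{digest}?repository_url=quay.io/example-org"
on_ghcr = f"pkg:oci/example-app@{digest}?repository_url=ghcr.io/example-org"

distinct_purls = on_quay != on_ghcr
same_image = purl_identity(on_quay) == purl_identity(on_ghcr)
```

Both flags come out true here: the purl strings are distinct, while the name and digest still identify the same image.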

taleodor avatar Jul 22 '25 12:07 taleodor

I'm not convinced that the cost of breaking all the downstream tools (including my own) is worth it in this case. OCI images are always referred to using the full name, including the registry, unless there is a default registry setting in the client tools. Even then, it's a security nightmare, and most people recommend referring to images using the full name, including the hash.

I think I'm going to stop at 80% or 90% compliant with purl (and even CycloneDX for that matter).

prabhu avatar Jul 22 '25 12:07 prabhu