temurin-build

SBoM nightly artifact data names do not align with the published ones

Open sxa opened this issue 1 year ago • 8 comments

What are you trying to do? Download the SBoM for a build and look at its contents

Expected behaviour: Filenames in the SBoM match the filenames in GitHub

Observed behaviour: To use an example, for the latest nightly at the time of writing the Linux/x64 SBoM references this within the components section:

  • OpenJDK17U-jdk_x64_linux_hotspot_2024-01-07-12-05.tar.gz

when the artifact is actually named:

This is because we rename the files at publish time to make the names across all platforms consistent. This means we have a discrepancy between the name in the SBoM and the name a user will download.

Any other comments: The SHA256 checksums within the SBoM are still correct.

FYI @andrew-m-leonard @netomi Ref: https://github.com/adoptium/temurin-build/pull/3529 where the information was added into the SBoM. I'm not immediately sure what the fix is here unless we fix the filename on input to the build (for example from the trigger job). This should affect only nightlies, not release builds, since I don't believe release artifacts get renamed. For this reason it will not be critical to get this fixed prior to next week's release.

sxa avatar Jan 08 '24 10:01 sxa
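
Since the SHA256 checksums in the SBoM remain correct, a downloaded nightly can still be matched to its SBoM entry by digest rather than by name. A minimal sketch, assuming the SBoM has already been reduced to a mapping of component name to SHA256 hex digest. The helper names and that pre-extracted mapping are hypothetical and not part of temurin-build:

```python
import hashlib
from typing import Dict, Optional


def sha256_of(data: bytes) -> str:
    """Return the SHA256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def find_component_by_digest(sbom_digests: Dict[str, str], artifact: bytes) -> Optional[str]:
    """Return the SBoM component name whose digest matches the artifact, if any.

    sbom_digests maps component name -> SHA256 hex digest. This is a
    hypothetical pre-extracted form; the real SBoM JSON layout is not assumed.
    """
    digest = sha256_of(artifact)
    for name, recorded in sbom_digests.items():
        # Compare case-insensitively, since hex digests may be upper or lower case.
        if recorded.lower() == digest:
            return name
    return None
```

Matching by digest sidesteps the naming discrepancy, but it does not remove it: a user reading the SBoM still sees a filename that differs from the one they downloaded.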

I would have no clue how to fix that right away. I wonder which timestamp is used for the final nightly artifacts? Is this the timestamp when the release is made in the temurin-binaries repo by running the respective script? If the renaming is used to have a common suffix for all artifacts of each nightly build, would it not also work if the common suffix were the starting timestamp of the nightly build? Or can such a timestamp not be determined for all architectures, for example, because they start at different times and are not triggered by some coordinated job?

netomi avatar Jan 08 '24 21:01 netomi

I have done a test to verify, and this does NOT affect release builds, as release build artifacts are created with the release tag in the Filename which is what gets published. So the urgency of this issue is not as high.

The Publish job renames any artifact whose filename contains a "Timestamp" (which is why release artifacts are not renamed), replacing that timestamp with the value of the nightly build publish job's TIMESTAMP parameter. @netomi The fix for this is to patch the timestamp in the SBOM JSON in the publish job, probably about here: https://github.com/adoptium/github-release-scripts/blob/4a57826d3821cd0839e95572668c4d750bbe5fc6/sbin/Release.sh#L136 If the file is the SBOM, patch its content using a sed command similar to the one used for the rename, but more specific so that it targets the full archive name. That could probably use a version of this regex with the specific timestampRegex within it: https://github.com/adoptium/github-release-scripts/blob/4a57826d3821cd0839e95572668c4d750bbe5fc6/sbin/Release.sh#L44

andrew-m-leonard avatar Jan 09 '24 09:01 andrew-m-leonard
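
The renaming described above can be sketched as follows. This is an illustration only: the timestamp pattern is an assumption modelled on the example filename in this issue, the real regex lives in Release.sh and may differ, and the function name, the publish timestamp, and the release filename in the usage lines are hypothetical.

```python
import re

# Assumed shape of a nightly timestamp in a filename: YYYY-MM-DD-hh-mm.
# The actual pattern used by the publish scripts may differ.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}-\d{2}-\d{2}")


def publish_name(filename: str, publish_timestamp: str) -> str:
    """Replace a build-time timestamp in filename with the publish TIMESTAMP.

    Names without a timestamp (for example release artifacts that carry a
    release tag instead) are returned unchanged.
    """
    # A callable replacement avoids any backslash interpretation in the value.
    return TIMESTAMP_RE.sub(lambda _match: publish_timestamp, filename, count=1)


nightly = "OpenJDK17U-jdk_x64_linux_hotspot_2024-01-07-12-05.tar.gz"
release = "OpenJDK17U-jdk_x64_linux_hotspot_17.0.9_9.tar.gz"  # hypothetical name
print(publish_name(nightly, "2024-01-07-14-30"))
print(publish_name(release, "2024-01-07-14-30"))
```

The nightly name changes while the release-style name passes through untouched, which is consistent with the observation that only nightlies are affected.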

The SBOM file cannot be patched, as it is signed during the build, so the signature would no longer be valid afterwards. That's why I said it's not clear to me how to fix it: the final name must be known when the SBOM is created.

netomi avatar Jan 09 '24 09:01 netomi
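
To illustrate why patching breaks the signature: a signature covers the exact bytes of the file, so any edit to those bytes changes what the signature would have to match. The sketch below uses a plain SHA256 digest as a stand-in for the signed content. The actual signing mechanism is not described in this thread, and the JSON fragment, function names, and replacement timestamp are hypothetical.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA256 hex digest, used here as a stand-in for what a signature covers."""
    return hashlib.sha256(content).hexdigest()


def patch_timestamp(sbom_bytes: bytes, old: str, new: str) -> bytes:
    """Rewrite the timestamp inside the SBOM content, as a publish-time patch would."""
    return sbom_bytes.replace(old.encode("utf-8"), new.encode("utf-8"))


# Hypothetical, heavily simplified SBOM content.
original = b'{"name": "OpenJDK17U-jdk_x64_linux_hotspot_2024-01-07-12-05.tar.gz"}'
signed_digest = digest(original)  # computed at build time, when the SBOM is signed

patched = patch_timestamp(original, "2024-01-07-12-05", "2024-01-07-14-30")
print(digest(patched) == signed_digest)  # False: the signed content no longer matches
```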

> The SBOM file cannot be patched, as it is signed during the build, so the signature would no longer be valid afterwards. That's why I said it's not clear to me how to fix it: the final name must be known when the SBOM is created.

Ah, good point! I think we then need to rethink why we do the renaming... I think that logic needs removing.

andrew-m-leonard avatar Jan 09 '24 09:01 andrew-m-leonard

The "build time" filename (which ends up in the SBOM) is determined here for nightly builds: https://github.com/adoptium/ci-jenkins-pipelines/blob/4453b2cfee872526542a5b0a34c4d073537c3df9/pipelines/build/common/openjdk_build_pipeline.groovy#L1340 This will be the current time on the build machine when each platform build is run. The upstream build pipeline job then calls the Publish job, generating a new current TIMESTAMP parameter value so that all the artifacts for the published pipeline have the same timestamp in the filename, here: https://github.com/adoptium/ci-jenkins-pipelines/blob/4453b2cfee872526542a5b0a34c4d073537c3df9/pipelines/build/common/build_base_file.groovy#L778

@sxa @smlambert For nightly builds, do we really need to rename the artifact timestamps so they all match? We're invalidating the signed SBOM as a consequence...

andrew-m-leonard avatar Jan 09 '24 09:01 andrew-m-leonard

> For nightly builds, do we really need to rename the artifact timestamps so they all match? We're invalidating the signed SBOM as a consequence...

IMHO it's confusing if they don't match the release name, or are inconsistent if I download, say, an x64 and an aarch64 build, so I do think we should have them in sync.

> I'm not immediately sure what the fix is here unless we fix the filename on input to the build (for example from the trigger job)

This may be the easiest solution and something we can implement for the triggered builds as a first step, then decide what to do for the ones we're still running on a schedule. (Maybe the answer is to have a different trigger job that can pass in a fixed timestamp for the process.)

sxa avatar Jan 09 '24 11:01 sxa

> we fix the filename on input to the build (for example from the trigger job)

Yes, agreed. Let's fix the timestamp in the pipeline and remove the Publish renaming.

andrew-m-leonard avatar Jan 09 '24 13:01 andrew-m-leonard
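
The agreed direction can be sketched as: compute the timestamp once in the trigger or pipeline job and pass the same value to every platform build, so the name recorded in the SBOM is already the published name and no renaming is needed. The function names and platform list below are hypothetical, and the filename layout is only modelled on the example in this issue:

```python
from datetime import datetime, timezone
from typing import List, Tuple


def pipeline_timestamp(now: datetime) -> str:
    """Format one timestamp for the whole pipeline run (YYYY-MM-DD-hh-mm)."""
    return now.strftime("%Y-%m-%d-%H-%M")


def nightly_filenames(prefix: str, platforms: List[Tuple[str, str]], timestamp: str) -> List[str]:
    """Build one filename per (arch, os) pair, all sharing the same timestamp."""
    return [
        f"{prefix}_{arch}_{os_name}_hotspot_{timestamp}.tar.gz"
        for arch, os_name in platforms
    ]


# Computed once by the (hypothetical) trigger job, then handed to every build.
ts = pipeline_timestamp(datetime(2024, 1, 7, 12, 5, tzinfo=timezone.utc))
for name in nightly_filenames("OpenJDK17U-jdk", [("x64", "linux"), ("aarch64", "linux")], ts):
    print(name)
```

Because each platform build receives the timestamp as an input instead of reading its own clock, the filenames are consistent across platforms at the moment the SBOM is created and signed.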

The work in https://github.com/adoptium/ci-jenkins-pipelines/issues/902 is expected to fix this.

sxa avatar Jan 30 '24 14:01 sxa