Rebase Buildpack Contributed Layers

Open jabrown85 opened this issue 4 months ago • 7 comments

Rendered

jabrown85 avatar Sep 12 '25 14:09 jabrown85

Maintainers,

As you review this RFC please queue up issues to be created using the following commands:

/queue-issue <repo> "<title>" [labels]...
/unqueue-issue <uid>

Issues

(none)

buildpack-bot avatar Sep 12 '25 14:09 buildpack-bot

This is great!

One thing I was hoping for is the ability to rebase the launcher layer as well as individual buildpack layers. What do you think of the revised config file below?

{
  "patches": {
    "buildpacks": [
      {
        "buildpack": "heroku/nodejs",
        "layer": "dist",
        "data": {
          "artifact.version": "24.7.*"
        },
        "patch-image": "registry.example.com/patches/nodejs:24.7.2",
        "patch-image.mirrors": [
          "backup-registry.example.com/patches/nodejs:24.7.2"
        ]
      }
    ],
    "launcher": {
      "must-match-sha": [
        "sha256:exampledigest1234567890abcdef1234567890abcdef1234567890abcdef1234567890",
        "sha256:exampledigest2234567890abcdef1234567890abcdef1234567890abcdef1234567890"
      ],
      "patch-image": "registry.example.com/patches/nodejs:24.7.2",
      "patch-image.mirrors": [
        "backup-registry.example.com/patches/nodejs:24.7.2"
      ]
    }
  }
}

The same Patch Image Creation process would apply for the launcher layer as well.
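To make the matching in the `buildpacks` entries concrete, here is a minimal sketch of how one entry might select a layer. It assumes that each key under `data` is looked up as a flat key in the layer's metadata and that `*` is a shell-style glob; neither is confirmed by the RFC, and `patch_applies` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def patch_applies(patch: dict, buildpack_key: str, layer_name: str, layer_data: dict) -> bool:
    """Return True if a `patches.buildpacks` entry selects the given layer.

    Assumption: `data` keys are flat keys in the layer metadata and values
    are glob patterns. The RFC may define different semantics.
    """
    if patch["buildpack"] != buildpack_key or patch["layer"] != layer_name:
        return False
    # Every key listed under `data` must exist in the layer metadata
    # and match its pattern (for example "24.7.*").
    for key, pattern in patch.get("data", {}).items():
        value = layer_data.get(key)
        if value is None or not fnmatchcase(str(value), pattern):
            return False
    return True


patch = {
    "buildpack": "heroku/nodejs",
    "layer": "dist",
    "data": {"artifact.version": "24.7.*"},
    "patch-image": "registry.example.com/patches/nodejs:24.7.2",
}

print(patch_applies(patch, "heroku/nodejs", "dist", {"artifact.version": "24.7.1"}))
print(patch_applies(patch, "heroku/nodejs", "dist", {"artifact.version": "24.8.0"}))
```

With glob matching, a layer at 24.7.1 would be selected and a layer at 24.8.0 would be left alone, which is the kind of narrow targeting the `data` filter is meant to give.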

I know that users can specify different lifecycle images at build time, or use different builders with different lifecycle versions. The must-match-sha allows users to restrict lifecycle layer rebasing/patching to only known digests that are presumably determined by some image scanning process that identifies layers with vulnerabilities. If must-match-sha is omitted then the layer would be updated (assuming it's not already pointing to the same digest as the patched image layer).
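The launcher rule described above can be sketched as a small decision function. This is only an illustration: `should_patch_launcher` is a hypothetical name, and treating an empty `must-match-sha` list as "no digest is allowed" is my assumption, since the proposal only describes the omitted case.

```python
from typing import Optional, Sequence


def should_patch_launcher(
    current_sha: str,
    patch_sha: str,
    must_match_sha: Optional[Sequence[str]] = None,
) -> bool:
    """Decide whether the launcher layer should be replaced."""
    # Already pointing at the patched layer: nothing to do.
    if current_sha == patch_sha:
        return False
    # `must-match-sha` omitted: always update.
    if must_match_sha is None:
        return True
    # Otherwise only update known (presumably vulnerable) digests.
    # Assumption: an empty list matches nothing.
    return current_sha in must_match_sha


known_bad = ["sha256:aaa", "sha256:bbb"]
print(should_patch_launcher("sha256:aaa", "sha256:new", known_bad))
print(should_patch_launcher("sha256:ccc", "sha256:new", known_bad))
print(should_patch_launcher("sha256:ccc", "sha256:new"))
```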

At present it looks like the only metadata stored for launcher (in io.buildpacks.lifecycle.metadata) is the sha, but because it is a dictionary we could start adding additional metadata such as version or even image uri. This metadata could then be used to filter when to patch the launcher layer rather than relying on sha. This would be more user-friendly and allow users to know they are patching from version x to version y.

jericop avatar Sep 12 '25 20:09 jericop

I think for many languages swapping out any runtime layer might be too high of a risk, even if it "should" work 90% of the time. For example:

  • When upstream developers think about (or refer to) compatibility, they often only consider backwards compatibility and not forwards compatibility. Depending on what's swapped out we might end up in the latter scenario, when only the former is expected/supported by the language upstream.
  • Patch versions of languages have been known to have regressions (even "correct" bug fixes can be breaking), and some customers might find having these silently introduced surprising (or unacceptable).

From our discussion on Salesforce Slack, it sounds like this feature is perhaps aimed at simpler use-cases than language runtimes - however, the RFC uses language runtimes for many of the examples, which makes it harder to understand the true use-cases/value this might offer. Would it be possible to swap the examples to something closer to what we think might be safe enough to use this feature with?

Also, reading the RFC I can't tell exactly how the patches are advertised/distributed? (Due to the verbosity of the LLM generated text I had to skim rather than read fully, so I could have missed something) Specifically, I think it would be essential that users can run pack rebase locally against an image produced using a specific builder image and get the same rebased image as they would if their platform of choice rebased it for them. However, this depends on the patches being publicly available and discoverable via Pack too and not just some internal platform mechanism.

I'm also slightly concerned about how we explain this to users - I think for base image updates the "we replay the entire top of the image on a new base image" is slightly easier to explain and users are more likely to be ok with it. But a "we selectively patch some layers of your image depending on if the SHA256 matches some internal manifest you can't see or might not even know exists" will be potentially more confusing to them / leave them more red herrings when it comes to "my app stopped working, did your platform break my app" type support tickets?

Lastly, it seems that best practices for production apps are moving towards methods that provide more determinism and predictable deploys, not less. For example best practices like:

  • using a lockfile for app dependencies / other versions and having Dependabot update them in a controlled way
  • using immutable images that ensure changes are only rolled out via the approved CI/CD pipeline
  • using a CI/CD pipeline where there are multiple stages (staging/canary/production), where a deploy is promoted from one stage to another with health checks/delays between each stage

...whereas this proposal seems aimed at making images more mutable / easier for changes to bypass the CI/CD process?

edmorley avatar Sep 16 '25 09:09 edmorley

After some comments and discussion with other stakeholders I believe there are a few things that this RFC should include.

  1. The data key in patches identifies layer metadata that is added by buildpack authors at build time. Layers that are rebase-able should have some additional metadata that buildpack authors can add in order to let pack know which layers are safe to rebase. This will make it clear to platforms/users which layers can be rebased and prevent rebasing other layers by default.

Below is an example of what this might look like. The first option is a single boolean key/value called io.buildpacks.rebasable which must be set to true. The second option is an additional map/dictionary key/value called rebase where metadata specific to rebasing can be added, such as a boolean key rebasable that would need to be true. The first option is simple and requires no additional input from buildpack authors regarding rebasing. The second option allows buildpack authors to add additional details which could be useful down the road.

The (simplified) output below was taken from a test app built with the paketo azul-zulu buildpack and a custom buildpack.

docker inspect test-app | jq -r '.[0].Config.Labels."io.buildpacks.lifecycle.metadata"' | jq

{
  "app": [
    {
      "sha": "sha256:somesha"
    }
  ],
  "sbom": {
    "sha": "sha256:somesha"
  },
  "buildpacks": [
    {
      "key": "paketo-buildpacks/azul-zulu",
      "version": "11.2.4",
      "layers": {
        "helper": {
          "sha": "sha256:somesha",
          "data": {
            "buildpackInfo": {},
            "helperNames": [],
            "io.buildpacks.rebasable": true,
            "rebase": {
                "rebasable": true
            }
          },
          "build": false,
          "launch": true,
          "cache": false
        },
        "java-security-properties": {},
        "jre": {}
      }
    },
    {
      "key": "my-buildpack",
      "version": "0.0.0",
      "layers": {
        "stemcell": {
          "sha": "sha256:somesha",
          "data": {},
          "build": false,
          "launch": true,
          "cache": false
        }
      }
    }
  ],
  "config": {},
  "launcher": {
    "sha": "sha256:somesha"
  },
  "process-types": {},
  "runImage": {},
  "stack": {}
}

If we try to rebase the image above, which has the required rebase-able metadata, then pack would allow rebasing without complaint.
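As a rough sketch of that check, the snippet below walks a trimmed copy of the metadata above and lists the layers that carry either form of the marker. Accepting both forms at once is an assumption made for illustration (the comment proposes them as alternatives), and `is_rebasable` and `rebasable_layers` are hypothetical helper names.

```python
import json


def is_rebasable(layer: dict) -> bool:
    """Return True if the layer metadata carries a rebase-able marker."""
    data = layer.get("data") or {}
    # Option 1: flat boolean key.
    if data.get("io.buildpacks.rebasable") is True:
        return True
    # Option 2: nested `rebase` map with a boolean `rebasable` key.
    rebase = data.get("rebase")
    return isinstance(rebase, dict) and rebase.get("rebasable") is True


def rebasable_layers(metadata: dict) -> list:
    """List (buildpack key, layer name) pairs that are marked rebase-able."""
    found = []
    for buildpack in metadata.get("buildpacks", []):
        for name, layer in buildpack.get("layers", {}).items():
            if is_rebasable(layer):
                found.append((buildpack["key"], name))
    return found


# Trimmed version of the io.buildpacks.lifecycle.metadata label shown above.
LABEL = """
{
  "buildpacks": [
    {
      "key": "paketo-buildpacks/azul-zulu",
      "version": "11.2.4",
      "layers": {
        "helper": {"sha": "sha256:somesha", "data": {"io.buildpacks.rebasable": true}},
        "jre": {}
      }
    },
    {
      "key": "my-buildpack",
      "version": "0.0.0",
      "layers": {"stemcell": {"sha": "sha256:somesha", "data": {}}}
    }
  ]
}
"""

metadata = json.loads(LABEL)
print(rebasable_layers(metadata))
```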

What if the user tries to rebase layers in an image that are missing rebase-able metadata? Since pack rebase already supports the --force flag, it should still allow rebasing the layers and should output a warning informing the user that the layers being rebased are not explicitly rebase-able and that pack is proceeding because of the --force flag. This is in line with the current rebaser behavior outlined in the spec here.
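The interaction with --force could look roughly like the function below. It is a sketch, not pack's actual behavior: the function name, the return shape, and the message wording are all made up for illustration.

```python
from typing import Tuple


def plan_layer_rebase(layer_id: str, rebasable: bool, force: bool) -> Tuple[bool, str]:
    """Return (proceed, message) for a single layer.

    `rebasable` is whether the layer carries the rebase-able metadata,
    `force` is whether the user passed --force.
    """
    if rebasable:
        return True, ""
    if force:
        # Not marked rebase-able, but the user explicitly opted in.
        return True, (
            f"warning: layer {layer_id} is not explicitly rebase-able; "
            "proceeding because --force was set"
        )
    # Default: refuse to touch layers that were not marked by the buildpack.
    return False, f"error: layer {layer_id} is not rebase-able (use --force to override)"


for rebasable, force in [(True, False), (False, True), (False, False)]:
    print(plan_layer_rebase("my-buildpack:stemcell", rebasable, force))
```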

  2. The RFC makes no mention of sbom data in its current form, but it should be updated to make it clear that sbom data for rebased layers will be updated to reflect the changes.

I used dive to inspect the same container mentioned above and see there is a layer called Software Bill-of-Materials with the following directory structure. Assuming a user wants to rebase the paketo-buildpacks azul-zulu helper layer, then the contents of the sbom from the source "patch" image must be written to the target image that is being rebased. This would require rebuilding the sbom layer locally and then pushing it to the registry as part of the overall rebase operation.

bash-5.2# tree /layers/sbom/
/layers/sbom/
`-- launch
    |-- buildpacksio_lifecycle
    |   `-- launcher
    |       |-- sbom.cdx.json
    |       |-- sbom.spdx.json
    |       `-- sbom.syft.json
    |-- paketo-buildpacks_azul-zulu
    |   |-- helper
    |   |   `-- sbom.syft.json
    |   `-- jre
    |       `-- sbom.syft.json
    `-- sbom.legacy.json

6 directories, 6 files
bash-5.2# 
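A minimal sketch of that SBOM step, modelling each SBOM layer as a mapping from file path to contents. The directory naming (buildpack ID with `/` replaced by `_`) is inferred from the tree above, the function names are hypothetical, and the aggregate `sbom.legacy.json` is deliberately left untouched here because how it should be regenerated is not covered in this thread.

```python
def sbom_dir(buildpack_id: str, layer: str) -> str:
    """Directory holding the SBOM files for one buildpack layer.

    Assumption: "/" in the buildpack ID becomes "_", as seen in the tree above.
    """
    return f"/layers/sbom/launch/{buildpack_id.replace('/', '_')}/{layer}/"


def merge_layer_sbom(target_files: dict, patch_files: dict, buildpack_id: str, layer: str) -> dict:
    """Return the target SBOM files with one layer's entries taken from the patch image."""
    prefix = sbom_dir(buildpack_id, layer)
    # Drop the old SBOM files for the layer being rebased.
    merged = {path: body for path, body in target_files.items() if not path.startswith(prefix)}
    # Copy in the SBOM files for that same layer from the patch image.
    merged.update({path: body for path, body in patch_files.items() if path.startswith(prefix)})
    return merged


helper = "/layers/sbom/launch/paketo-buildpacks_azul-zulu/helper/sbom.syft.json"
jre = "/layers/sbom/launch/paketo-buildpacks_azul-zulu/jre/sbom.syft.json"

target = {helper: "helper-old", jre: "jre-old"}
patch = {helper: "helper-new", jre: "jre-new"}

merged = merge_layer_sbom(target, patch, "paketo-buildpacks/azul-zulu", "helper")
print(merged[helper], merged[jre])
```

Only the helper entry changes; the jre entry stays as it was in the target image, which matches the idea that untouched layers keep their original SBOM.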

I mentioned in a previous comment that rebasing the launcher layer is important for a platform/user that is patching CVEs. Another reason I believe the launcher layer should be rebase-able through the same mechanism is that you could modify the sbom for all layers in a single operation. This would simplify implementation and the workflow for a platform/user that would use this feature for security patching.

jericop avatar Sep 19 '25 03:09 jericop

> From our discussion on Salesforce Slack, it sounds like this feature is perhaps aimed at simpler use-cases than language runtimes - however, the RFC uses language runtimes for many of the examples, which makes it harder to understand the true use-cases/value this might offer. Would it be possible to swap the examples to something closer to what we think might be safe enough to use this feature with?

I agree that the current use case example may suggest swapping run times is a safe operation when it could actually be problematic depending on the language and other layers with packages. For me the initial use case for this RFC is security vulnerability fixes, which it does mention as a use case but is not the primary example. Perhaps using security patching as the primary example would be better.

> Also, reading the RFC I can't tell exactly how the patches are advertised/distributed? (Due to the verbosity of the LLM generated text I had to skim rather than read fully, so I could have missed something) Specifically, I think it would be essential that users can run pack rebase locally against an image produced using a specific builder image and get the same rebased image as they would if their platform of choice rebased it for them. However, this depends on the patches being publicly available and discoverable via Pack too and not just some internal platform mechanism.

The thing that is appealing about this RFC in its current form is that it is entirely up to the platform/user to generate their own patched images. As stated in the RFC, the patch image generation process would involve doing a pack build and then extracting the "patched" layers from the resulting image. Buildpack authors don't have to worry about publishing patches because their regular releases, which eventually include security fixes, will serve as patches that the platform/user will use.
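For the extraction step, one plausible starting point is to look up the layer digest in the patch image's io.buildpacks.lifecycle.metadata label. The sketch below assumes the label has the same shape as the example earlier in this thread; `find_patch_layer` is a hypothetical helper, and actually pulling the layer blob from a registry is out of scope here.

```python
from typing import Optional


def find_patch_layer(metadata: dict, buildpack_key: str, layer_name: str) -> Optional[dict]:
    """Find the digest of one buildpack layer in a freshly built patch image.

    `metadata` is the parsed io.buildpacks.lifecycle.metadata label.
    Returns None if the buildpack or layer is not present.
    """
    for buildpack in metadata.get("buildpacks", []):
        if buildpack.get("key") != buildpack_key:
            continue
        layer = buildpack.get("layers", {}).get(layer_name)
        if layer and "sha" in layer:
            return {"sha": layer["sha"], "buildpack_version": buildpack.get("version")}
    return None


patch_image_metadata = {
    "buildpacks": [
        {
            "key": "paketo-buildpacks/azul-zulu",
            "version": "11.2.4",
            "layers": {"helper": {"sha": "sha256:somesha", "data": {}}},
        }
    ]
}

print(find_patch_layer(patch_image_metadata, "paketo-buildpacks/azul-zulu", "helper"))
print(find_patch_layer(patch_image_metadata, "paketo-buildpacks/azul-zulu", "jre"))
```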

If buildpack authors are responsible for creating and publishing "patch" images that are strictly intended for rebasing their own layers, it will change the scope of this RFC and probably require changes to the spec to set the contracts for how these patch images should be built and consumed. While I am not against this discussion, I think it is outside the scope of what this RFC is addressing, which is an immediate need of allowing platforms/users to apply security fixes to rebase-able buildpack-provided layers, as well as the launcher layer. In my opinion this feature would give platforms/users all they need for dealing with security fixes in layers without the need for buildpack authors to worry about publishing patch images for specific layers. If that approach is important to folks, then I think that should be done in a subsequent RFC.

> Lastly, it seems that best practices for production apps are moving towards methods that provide more determinism and predictable deploys, not less... ...whereas this proposal seems aimed at making images more mutable / easier for changes to bypass the CI/CD process?

My previous comment, where I suggest updating the RFC to make it clear that the sbom will be updated for any layers that are rebased, should preserve provenance and integrity. Any platform/user using rebasing today is already mutating their images, but they are doing so in a controlled manner. If they have automated rebasing based on a schedule or some trigger then that becomes part of their CI/CD process. I see this as a tool in the CI/CD process that will remove the burden from development teams so they don't have to build new images/releases just to deal with CVEs caused by "buildpacks."

jericop avatar Sep 19 '25 04:09 jericop

@edmorley @jericop I took a first quick pass at some of the feedback.

@edmorley what do you think about buildpacks having to set some metadata in the data: {} to enable rebase (without --force)?

@edmorley or @jericop Do you have any suggestions on patch distribution? pack rebase or lifecycle rebase is often executed on minimal images and not the builder. Maybe we could allow providing a --builder <TARGET> to fetch metadata from the builder, which could include a URL to a patch file (or the patch file could be built in as a layer?). 🤔

jabrown85 avatar Sep 25 '25 15:09 jabrown85

@edmorley can you take a peek at this again?

jabrown85 avatar Nov 06 '25 18:11 jabrown85