Support image history in Docker image scans
Description
Docker image scanning currently covers only the file contents of image layers. Container images can leak secrets through another channel, however: the image history.
For example, if an image is built with a build argument (ARG) whose value is a secret, that value becomes part of a plain-text history record stored with the image. We have observed this in the wild, and since we use this tool for detection, I'd like to improve it to cover this case.
Running crane config against an affected image shows something like this:
$ crane config image:label | jq
{
  "architecture": "arm64",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      // redacted
    ],
    "Entrypoint": [
      "pnpm",
      "run"
    ],
    "WorkingDir": "/app",
    "Labels": {
      // redacted
    },
    "OnBuild": null
  },
  "created": "2024-05-24T05:27:08.832604159Z",
  "history": [
    // partially redacted
    {
      "created": "2024-05-24T05:26:37.56186972Z",
      "created_by": "RUN |8 PNPM_VERSION=v8.15.5 GITHUB_REGISTRY_TOKEN=ghp_ohmygodnessaleakedvalue /bin/sh -c unwise-command '${GITHUB_REGISTRY_TOKEN}' ",
      "comment": "buildkit.dockerfile.v0"
    },
  ],
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      // redacted
    ]
  },
  "variant": "v8"
}
The created_by field of each history entry records the command that was run, including the plain-text values of the build arguments that were set when the image was built.
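To make the idea concrete, here is a rough sketch of what scanning history could look like, written in Python only for brevity. It is not the prototype and not how TruffleHog's detectors work: the function names (history_chunks, find_candidates), the sample config, and the deliberately loose ghp_ pattern are all made up for illustration. It assumes the input is the unredacted, valid JSON that crane config prints.

```python
import json
import re
from typing import Iterator, List, Tuple

# Illustrative pattern only: deliberately loose so it matches the fake token
# from this issue. It is NOT TruffleHog's actual GitHub detector.
TOKEN_PATTERN = re.compile(r"ghp_[A-Za-z0-9]+")


def history_chunks(config_json: str) -> Iterator[Tuple[int, str]]:
    """Yield (index, created_by) for every history entry that has a command."""
    config = json.loads(config_json)
    # "history" may be missing or null, so fall back to an empty list.
    for index, entry in enumerate(config.get("history") or []):
        created_by = entry.get("created_by", "")
        if created_by:
            yield index, created_by


def find_candidates(config_json: str) -> List[Tuple[int, str]]:
    """Return (history index, matched text) for each candidate secret."""
    findings = []
    for index, text in history_chunks(config_json):
        for match in TOKEN_PATTERN.finditer(text):
            findings.append((index, match.group(0)))
    return findings


# Hypothetical, trimmed-down image config modelled on the output above.
SAMPLE_CONFIG = json.dumps({
    "architecture": "arm64",
    "history": [
        {"created_by": "RUN /bin/sh -c apk add --no-cache nodejs"},
        {
            "created_by": "RUN |8 PNPM_VERSION=v8.15.5 "
                          "GITHUB_REGISTRY_TOKEN=ghp_ohmygodnessaleakedvalue "
                          "/bin/sh -c unwise-command '${GITHUB_REGISTRY_TOKEN}' ",
            "comment": "buildkit.dockerfile.v0",
        },
        # Entries without created_by are skipped.
        {"created": "2024-05-24T05:27:08.832604159Z", "empty_layer": True},
    ],
    "os": "linux",
})

if __name__ == "__main__":
    for index, token in find_candidates(SAMPLE_CONFIG):
        print(f"history[{index}]: {token}")
```

Keeping the history index next to each match means a finding can point at the specific history entry instead of a layer file path.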
Preferred Solution
Modify the trufflehog docker command so that history entries are scanned by default, in addition to layer contents.
Additional Context
I have a basic prototype implementation in #2882. Hopefully it helps the discussion, if nothing else.