stereoscope
stereoscope copied to clipboard
NewTarIndex race condition when run in a goroutine
What happened:
In our build system we pull+cache images using crane in an OCI layout format. We then attempt to SBOM these images which may share layer tarballs.
Concurrently running Image.Read
results in the following error being returned:
l.indexedContent, err = file.NewTarIndex(
tarFilePath,
layerTarIndexer(tree, l.fileCatalog, &l.Metadata.Size, l, monitor),
)
if err != nil {
return fmt.Errorf("failed to read layer=%q tar : %w", l.Metadata.Digest, err)
}
failed to read
layer="sha256:994393dc58e7931862558d06e46aa2bb17487044f670f310dffe1d24e4d1eec7" tar : unexpected EOF
I believe this is caused by stereoscope re-using file handles in os.Open
resulting in a race condition of reads.
What you expected to happen:
Image SBOMing is go-routine safe w/ images that share layers.
How to reproduce it (as minimally and precisely as possible):
In a system w/ low IO speed (Github's default runner works):
- Pull multiple images that share layers across them and store as OCI layout (images w/
alpine
base work) - Attempt to go-routine image SBOMing
Anything else we need to know?:
Full stereoscope + syft logs in our CI run: https://github.com/defenseunicorns/zarf/actions/runs/5485435230/jobs/9994240328?pr=1887
Environment:
- Github default runner
ubuntu-latest
Hi @Noxsios, thanks for filing this issue. Could you tell us a bit more about how you are making use of Syft and Stereoscope in this environment? Are you calling them as libraries? We are trying to sort through the logs but we could use more guidance from you.
I'll save the relevant logs here for posterity.
Thanks for reaching out @tgerla!
Our tool zarf
is a CLI designed to package up cloud native (docker/k8s) artifacts and be able to deploy them into air gapped (no connection) environments. During this process we SBOM all the files+images on a per-component level. An example component:
- name: flux
description: Installs the flux CRDs / controllers to use flux-based deployments in the cluster
required: true
manifests:
- name: flux-crds
namespace: flux
files:
- flux-install.yaml
images:
- ghcr.io/fluxcd/kustomize-controller:v0.27.1
- ghcr.io/fluxcd/source-controller:v0.28.0
This component has 2 images, and we use Crane to pull the images and Syft/Stereoscope to perform the SBOMing, then include that resulting artifact in the package.
We are calling both Syft and Stereoscope as libraries main function we are calling. This functionality works 100% of the time when run in series, but fails with the previous error message when run in goroutines. The only difference between these two is this concurrency of operations. The layer sha256:994393dc...
is a common base layer in many images, and I believe Stereoscope is attempting to read it multiple times, and getting to EOF sooner than expected on one or more threads. This behavior does not happen on our dev machines as their disk IO is able to read the layers concurrently fast enough that the threads do not clobber, but on low IO speed systems (such as Github Actions) the error occurs.
Other relevant info:
The layer cache directory given to image.New
looks like the following:
❯ tree ~/.zarf-cache/images/
~/.zarf-cache/images/
├── sha256:094713c74b5761ab52fa3a95cba028e262ead0e2f4ed41857d8ae2bcd3d95f08
├── sha256:0a24540c1e5a47e5a8116c43151cb45c985f7a65afe5d4d65127d9c6fea2ffcf.tar
├── sha256:1a52c5e5391055e32568bc09f13c5a95989b93370c204d7c6d62bd0bd6e37375.tar
├── sha256:1e69fba811c05a87a92708bba2ae4c2d149fd236d853fbe83478066f77cb07a2
├── sha256:21149bf69d91e8a4c3ee46fdbd7c5ba54ee67da12abc51d4c1a29026f1ed0b7a.tar
├── sha256:37b289feee660235184d9711a14607ba91cecb5dd6b0c5402f0edd6cc87264f4.tar
├── sha256:3bdcce4584f90d548aa0d42b8358675c4ebcd6b548251799735c1a2fa454884f
...
Hope this helps!
Hi @Noxsios -- we'd love to get this fixed; would you be able to provide a very simple sample Go application that uses stereoscope in a way that exhibits this behavior (even with something like hardcoded image tags/etc.)? It would definitely help us out to get it addressed more quickly... Thanks!
@kzantow Working on it! Ran into a few roadblocks getting the minimum POC working as our project is on syft v0.84.1
but replicating this code resulting in compilation errors in github.com/anchore/syft/syft/pkg/cataloger/java
. I upgraded to v0.88.0
, but this has breaking changes in the way that images are treated. source.NewFromImage
no longer exists and I am currently working on:
src, location, err := image.DetectSource(imagesDir)
if err != nil {
log.Fatal(err)
}
log.Printf("Detected source %q for %q", src, location)
cfg := source.StereoscopeImageConfig{
Reference: imagesDir,
From: src,
}
syftSource, err := source.NewFromStereoscopeImage(cfg)
if err != nil {
log.Fatal(err)
}
catalog, relationships, distro, err := syft.CatalogPackages(syftSource, cataloger.DefaultConfig())
if err != nil {
log.Fatal(err)
}
However, the OciDirectory
does not currently support an OCI layout with more than 1 image stored. Also as far as I can tell, Reference
is getting used as the directory source, not the image.Source
returned from image.DetectSource
, which confuses me.
# pkg/image/oci/directory_provider.go:44
// for now, lets only support one image indexManifest (it is not clear how to handle multiple manifests)
if len(indexManifest.Manifests) != 1 {
return nil, fmt.Errorf("unexpected number of OCI directory manifests (found %d)", len(indexManifest.Manifests))
}
The WIP POC lives here: https://github.com/Noxsios/goroutine-syft/blob/main/main.go . I will update if I can successfully replicate.
Current output:
Layer "sha256:9398808236ffac29e60c04ec906d8d409af7fa19dc57d8c65ad167e9c4967006" is repeated 3 times
Detected source "OciDirectory" for "/home/thicc/dev/goroutine-syft/images"
unable to load image: unable to use OciDirectory source: unexpected number of OCI directory manifests (found 4)
This gets me much closer, but image.Image
's v1.Image field is private.
layoutPath := layout.Path(imagesDir)
v1Img, err := layoutPath.Image(digest)
if err != nil {
log.Fatal(err)
}
img := image.Image{}
imgSrc, err := source.NewFromStereoscopeImageObject(&img, digest.String(), nil)
if err != nil {
log.Fatal(err)
}
catalog, relationships, distro, err := syft.CatalogPackages(imgSrc, cataloger.DefaultConfig())
if err != nil {
log.Fatal(err)
}
Thanks @Noxsios, we're going to need to spend a little bit more time investigating this so we'll put it in our backlog.