
Support scanning files in other mount namespaces

Open ariel-miculas opened this issue 1 year ago • 9 comments

What would you like to be added: I want syft to be able to scan files in other mount namespaces.

Why is this needed:

  • to scan files in other docker containers
  • to scan host files while syft itself runs in a docker container
  • to scan files in other docker containers while syft itself runs in a docker container

Additional context: As an example, I'm trying to scan a file on my host filesystem by running the syft scanner inside a docker container. Here, pid 117851 is a process running in my host mount namespace, and the docker permissions allow me to access /proc/117851/root/.

❯ docker run --rm --cap-add CAP_SYS_ADMIN --cap-add CAP_SYS_PTRACE --pid=host --security-opt apparmor=unconfined anchore/syft scan /proc/117851/root/home/amiculas/ran


unable to get file resolver: unable to get absolute path for analysis path="/proc/117851/root/home/amiculas/ran": lstat /home: no such file or directory

syft cannot find /home because it's looking for it in its own mount namespace, instead of looking for it in the host's mount namespace. To work as expected, something like procfsroot should be used.

Another way to make this work would be to use the file path as-is, without resolving it. However, the FileResolver in syft's file_source.go calls fileresolver.NewFromDirectory, which ends up calling filepath.EvalSymlinks(...). This doesn't work for /proc/PID/root paths because /proc/PID/root is a symlink to /, and / refers to the root of the mount namespace that syft is running in, not the target mount namespace that needs to be scanned.

As proof that this feature is feasible, here is trivy running in a docker container set up identically:

❯ docker run --rm --cap-add CAP_SYS_ADMIN --cap-add CAP_SYS_PTRACE --pid=host --security-opt apparmor=unconfined aquasec/trivy rootfs --format=spdx-json /proc/117851/root/home/amiculas/ran
2024-10-29T16:30:26Z    INFO    "--format spdx-json" disables security scanning. Specify "--scanners vuln" explicitly if you want to include vulnerabilities in the "spdx-json" report.
2024-10-29T16:30:26Z    INFO    Number of language-specific files       num=1
{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "/proc/117851/root/home/amiculas/ran",
  "documentNamespace": "http://aquasecurity.github.io/trivy/filesystem//proc/117851/root/home/amiculas/ran-bf738818-b10b-4ec1-83db-1fe6a9c7cd70",
  "creationInfo": {
    "creators": [
      "Organization: aquasecurity",
      "Tool: trivy-0.56.2"
    ],
    "created": "2024-10-29T16:30:26Z"
  },
  "packages": [
    {
      "name": "ran",
      "SPDXID": "SPDXRef-Application-652ea8ed26a8b19",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "attributionTexts": [
        "Class: lang-pkgs",
        "Type: gobinary"
      ],
      "primaryPackagePurpose": "APPLICATION"
    },
    {
      "name": "github.com/abbot/go-http-auth",
      "SPDXID": "SPDXRef-Package-6bced50b61582bca",
      "versionInfo": "v0.4.0",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/github.com/abbot/go-http-auth@v0.4.0"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "github.com/m3ng9i/go-utils",
      "SPDXID": "SPDXRef-Package-3142a384896121f8",
      "versionInfo": "v0.0.0-20160811013010-f9b7dc669fde",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/github.com/m3ng9i/go-utils@v0.0.0-20160811013010-f9b7dc669fde"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "github.com/m3ng9i/ran",
      "SPDXID": "SPDXRef-Package-d371b45954284092",
      "versionInfo": "v0.1.6",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/github.com/m3ng9i/ran@v0.1.6"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "github.com/oxtoacart/bpool",
      "SPDXID": "SPDXRef-Package-61d659ef30bff13d",
      "versionInfo": "v0.0.0-20190530202638-03653db5a59c",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/github.com/oxtoacart/bpool@v0.0.0-20190530202638-03653db5a59c"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "golang.org/x/crypto",
      "SPDXID": "SPDXRef-Package-1882a12b82ac17bc",
      "versionInfo": "v0.0.0-20190308221718-c2843e01d9a2",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/golang.org/x/crypto@v0.0.0-20190308221718-c2843e01d9a2"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "golang.org/x/net",
      "SPDXID": "SPDXRef-Package-3d35ed607a647f31",
      "versionInfo": "v0.0.0-20190724013045-ca1201d0de80",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/golang.org/x/net@v0.0.0-20190724013045-ca1201d0de80"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "stdlib",
      "SPDXID": "SPDXRef-Package-a77035723e5d3079",
      "versionInfo": "1.22.2",
      "supplier": "NOASSERTION",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "sourceInfo": "package found in: ran",
      "licenseConcluded": "NOASSERTION",
      "licenseDeclared": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:golang/stdlib@1.22.2"
        }
      ],
      "attributionTexts": [
        "PkgType: gobinary"
      ],
      "primaryPackagePurpose": "LIBRARY"
    },
    {
      "name": "/proc/117851/root/home/amiculas/ran",
      "SPDXID": "SPDXRef-Filesystem-84989ddda734b091",
      "downloadLocation": "NONE",
      "filesAnalyzed": false,
      "attributionTexts": [
        "SchemaVersion: 2"
      ],
      "primaryPackagePurpose": "SOURCE"
    }
  ],
  "relationships": [
    {
      "spdxElementId": "SPDXRef-Application-652ea8ed26a8b19",
      "relatedSpdxElement": "SPDXRef-Package-d371b45954284092",
      "relationshipType": "CONTAINS"
    },
    {
      "spdxElementId": "SPDXRef-DOCUMENT",
      "relatedSpdxElement": "SPDXRef-Filesystem-84989ddda734b091",
      "relationshipType": "DESCRIBES"
    },
    {
      "spdxElementId": "SPDXRef-Filesystem-84989ddda734b091",
      "relatedSpdxElement": "SPDXRef-Application-652ea8ed26a8b19",
      "relationshipType": "CONTAINS"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-1882a12b82ac17bc",
      "relationshipType": "DEPENDS_ON"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-3142a384896121f8",
      "relationshipType": "DEPENDS_ON"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-3d35ed607a647f31",
      "relationshipType": "DEPENDS_ON"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-61d659ef30bff13d",
      "relationshipType": "DEPENDS_ON"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-6bced50b61582bca",
      "relationshipType": "DEPENDS_ON"
    },
    {
      "spdxElementId": "SPDXRef-Package-d371b45954284092",
      "relatedSpdxElement": "SPDXRef-Package-a77035723e5d3079",
      "relationshipType": "DEPENDS_ON"
    }
  ]
}

ariel-miculas avatar Oct 29 '24 18:10 ariel-miculas

Hey @ariel-miculas -- are you able to use the --base-path flag? If you perform a directory scan with Syft, you should be able to set the directory for syft to resolve symlinks relative to. So, if you used --base-path /proc/117851/root, when Syft encounters a symlink pointing to /, for example, Syft should resolve this to /proc/117851/root/ and /home should resolve to /proc/117851/root/home. Would using this option resolve the issue for you?

kzantow avatar Oct 29 '24 18:10 kzantow

Thanks, @kzantow, --base-path looks promising, but unfortunately it doesn't work:

❯ docker run --rm --cap-add CAP_SYS_ADMIN --cap-add CAP_SYS_PTRACE --pid=host --security-opt apparmor=unconfined anchore/syft scan --base-path /proc/117851/root /proc/117851/root/home/amiculas/ran


unable to get file resolver: unable to get absolute path for analysis path="/proc/117851/root/home/amiculas/ran": lstat /home: no such file or directory

If I had to guess, syft tries to convert the base path to an absolute path (which results in /), which doesn't work for the same reasons I've outlined previously.

ariel-miculas avatar Oct 29 '24 19:10 ariel-miculas

That's unfortunate -- from my understanding, --base-path should do what you want here. I'm trying to boil the problem down to an example that doesn't involve mounts or running within a container, but I feel as though I'm missing something.

Given this directory structure:

/somewhere/root/subdir1
/somewhere/root/subdir1/link -> /subdir2
/somewhere/root/subdir2
/somewhere/root/subdir2/something-to-scan

e.g.:

$ ls -alFR                          
total 0
drwxr-xr-x  4 kzantow  staff   128B Oct 29 15:35 ./
drwxr-xr-x  4 kzantow  staff   128B Oct 29 15:26 ../
drwxr-xr-x  3 kzantow  staff    96B Oct 29 15:36 subdir1/
drwxr-xr-x  3 kzantow  staff    96B Oct 29 15:26 subdir2/

./subdir1:
total 0
drwxr-xr-x  3 kzantow  staff    96B Oct 29 15:36 ./
drwxr-xr-x  4 kzantow  staff   128B Oct 29 15:35 ../
lrwxr-xr-x  1 kzantow  staff     8B Oct 29 15:36 link@ -> /subdir2

./subdir2:
total 88688
drwxr-xr-x  3 kzantow  staff    96B Oct 29 15:26 ./
drwxr-xr-x  4 kzantow  staff   128B Oct 29 15:35 ../
-rwxr-xr-x  1 kzantow  staff    43M Oct 29 15:18 syft*

If I run Syft against subdir1 without a --base-path specified, it fails to catalog anything because it attempts to resolve link to /subdir2 which doesn't exist on the root of my filesystem:

$ syft subdir1              
No packages discovered

If I run with --base-path ., which contains subdir2, it works as expected and correctly resolves the symlink to an alternate root location:

$ syft subdir1 --base-path .
NAME                                                           VERSION                               TYPE        
dario.cat/mergo                                                v1.0.0                                go-module    
...
github.com/anchore/syft                                        v1.11.1                               go-module    

If I run directly against the link, with or without --base-path, I get a file-not-found error. My guess is that this is because the link is resolved only against my root filesystem, not against the base-path.

$ syft subdir1/link --base-path .
...
  - additionally, the following providers failed with file does not exist: docker-archive, oci-archive, oci-dir, singularity, oci-dir, local-file, local-directory

This last case is effectively the same issue you have, correct? If so, I think we could just fix the initial directory path lookup to honor the --base-path. Or have I missed more important details here?

kzantow avatar Oct 29 '24 19:10 kzantow

The problem is you cannot call NormalizeBaseDirectory on /proc/117851/root/home/amiculas/ran.

func NormalizeBaseDirectory(base string) (string, error) {
	if base == "" {
		return "", nil
	}

	cleanBase, err := filepath.EvalSymlinks(base)
	if err != nil {
		return "", fmt.Errorf("could not evaluate base=%q symlinks: %w", base, err)
	}

	return filepath.Abs(cleanBase)
}

The reason is that reading the symlink /proc/117851/root yields /:

$ readlink /proc/117851/root
/

So the base path is translated to /, instead of being left untouched as /proc/117851/root.

And this / symlink is also what the docker container sees:

root@8120e5ee4722:/app# readlink /proc/117851/root
/

This is a lie, because ls / gives you different results than ls /proc/117851/root when your process is running in a different mount namespace than the process with PID 117851. It's just how procfs works.
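The mismatch described above can be observed from Go as well. The following is a minimal sketch, not code from syft: procRoot and sharesRoot are hypothetical helper names, and comparing device and inode numbers via os.SameFile is only a heuristic for "same root directory". os.Readlink reports the link text (typically /), while os.Stat follows the procfs link into the target process's root.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// procRoot returns the procfs link that leads to the root directory of pid.
func procRoot(pid int) string {
	return "/proc/" + strconv.Itoa(pid) + "/root"
}

// sharesRoot reports whether pid's root directory is the same directory as
// ours. os.Stat follows the procfs link, so the comparison looks at the
// directory the kernel hands back rather than at the link text.
func sharesRoot(pid int) (bool, error) {
	ours, err := os.Stat("/")
	if err != nil {
		return false, err
	}
	theirs, err := os.Stat(procRoot(pid))
	if err != nil {
		return false, err
	}
	return os.SameFile(ours, theirs), nil
}

func main() {
	pid := os.Getpid() // default: inspect ourselves
	if len(os.Args) > 1 {
		n, err := strconv.Atoi(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, "usage: prog [pid]")
			os.Exit(2)
		}
		pid = n
	}

	// The link text alone cannot tell two namespaces apart.
	target, err := os.Readlink(procRoot(pid))
	fmt.Println("readlink:", target, err)

	same, err := sharesRoot(pid)
	fmt.Println("same root as ours:", same, err)
}
```

Run with the PID of a process in another mount namespace (and sufficient privileges), the first line would be expected to still show / while the second may report false.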

ariel-miculas avatar Oct 29 '24 20:10 ariel-miculas

> This last case is effectively the same issue you have, correct? If so, I think we could just fix the initial directory path lookup to honor the --base-path. Or have I missed more important details here?

This is one of the issues, yes. --base-path only works with directories, not with files. The FileResolver from file_source.go doesn't account for the base, it passes absParentDir for both the root and the base:

		res, err = fileresolver.NewFromDirectory(absParentDir, absParentDir, exclusionFunctions...)
		if err != nil {
			return nil, fmt.Errorf("unable to create directory resolver: %w", err)
		}

The second issue is that we shouldn't call filepath.EvalSymlinks(base) in NormalizeBaseDirectory (or at least not for a /proc/PID/root path). In fact, every call to filepath.EvalSymlinks() needs to be replaced with procfsroot.EvalSymlinks() from procfsroot.

ariel-miculas avatar Oct 30 '24 14:10 ariel-miculas

This is an early POC for my feature request: https://github.com/anchore/syft/compare/main...ariel-miculas:syft:allow-scanning-files-in-mount-namespaces

It allows me to do:

❯ sudo env PATH=$PATH go run cmd/syft/main.go scan --base-path=/proc/240346/root file:/proc/240346/root/usr/local/bin/ran
 ✔ Indexed file system                                                                                                                                                                                     /proc/240346/root/usr/local/bin
 ✔ Cataloged contents                                                                                                                                                     6758f234c5a392ddfdc0bb70898131ac6cb414df35ee5d214e905b3469c09ce4
   ├── ✔ Packages                        [7 packages]
   ├── ✔ File digests                    [0 files]
   ├── ✔ File metadata                   [0 locations]
   └── ✔ Executables                     [1 executables]
NAME                           VERSION                             TYPE
github.com/abbot/go-http-auth  v0.4.0                              go-module
github.com/m3ng9i/go-utils     v0.0.0-20160811013010-f9b7dc669fde  go-module
github.com/m3ng9i/ran          v0.1.6                              go-module
github.com/oxtoacart/bpool     v0.0.0-20190530202638-03653db5a59c  go-module
golang.org/x/crypto            v0.0.0-20190308221718-c2843e01d9a2  go-module
golang.org/x/net               v0.0.0-20190724013045-ca1201d0de80  go-module
stdlib                         go1.17                              go-module

where 240346 is the PID of a shell running inside a docker container.

I would like some early feedback on this if possible.

ariel-miculas avatar Oct 30 '24 17:10 ariel-miculas

Let me step back and try to clarify the issues with syft:

1. Honour the base-path flag for file sources

First of all, the --base-path flag doesn't work with file sources, i.e. file:path/to/file.

2. Do symbolic link resolutions depending on whether the directories are in the base-path or not

Secondly, considering the directory structure you mentioned:

/somewhere/root/subdir1
/somewhere/root/subdir1/link -> /subdir2
/somewhere/root/subdir2
/somewhere/root/subdir2/something-to-scan

let's add a symlink to /somewhere/root/subdir1:

/elsewhere/link-to-subdir-1 -> /somewhere/root/subdir1

When we run syft /elsewhere/link-to-subdir-1/link --base-path /somewhere/root, syft needs to do the following:

  • do regular symlink resolution for every component outside the base-path
  • do symlink resolution relative to the base-path for every component inside the base-path
  • make sure symlinks inside base-path don't escape the base-path by means of relative symbolic links and ../.. patterns

For the example above, the first symlink encountered is /elsewhere/link-to-subdir-1. It is not inside the base-path /somewhere/root, so it is resolved normally to /somewhere/root/subdir1. The next component is link, which is appended to the previously resolved path, giving /somewhere/root/subdir1/link. This is a symbolic link inside the base-path /somewhere/root, so its target /subdir2 is resolved relative to the base-path, giving the full path /somewhere/root/subdir2.
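The three rules can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not syft's or procfsroot's implementation: resolveWithBase, readlinkFunc, osReadlink and mapReadlink are hypothetical names, the "inside the base-path" check is purely lexical, and the base-path is treated like a chroot (a .. at the base stays at the base). The symlink lookup is injected so the example runs against an in-memory table.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readlinkFunc returns the target of path when path is a symlink.
type readlinkFunc func(path string) (target string, isLink bool)

// osReadlink consults the real filesystem.
func osReadlink(path string) (string, bool) {
	fi, err := os.Lstat(path)
	if err != nil || fi.Mode()&os.ModeSymlink == 0 {
		return "", false
	}
	target, err := os.Readlink(path)
	return target, err == nil
}

// mapReadlink serves symlinks from an in-memory table (for demos and tests).
func mapReadlink(links map[string]string) readlinkFunc {
	return func(path string) (string, bool) {
		target, ok := links[path]
		return target, ok
	}
}

// inside reports whether p is base itself or lies beneath it (lexically).
func inside(base, p string) bool {
	return p == base || strings.HasPrefix(p, strings.TrimSuffix(base, "/")+"/")
}

// resolveWithBase walks path one component at a time. Absolute symlink
// targets found inside base restart from base; those found outside restart
// from "/". A ".." never climbs above base.
func resolveWithBase(base, path string, readlink readlinkFunc) (string, error) {
	if !filepath.IsAbs(base) || !filepath.IsAbs(path) {
		return "", errors.New("base and path must be absolute")
	}
	base = filepath.Clean(base)
	pending := strings.Split(path, "/")
	resolved := "/"
	for hops := 0; len(pending) > 0; {
		part := pending[0]
		pending = pending[1:]
		switch part {
		case "", ".":
			continue
		case "..":
			if resolved != base { // rule 3: do not escape the base-path
				resolved = filepath.Dir(resolved)
			}
			continue
		}
		candidate := filepath.Join(resolved, part)
		target, isLink := readlink(candidate)
		if !isLink {
			resolved = candidate
			continue
		}
		if hops++; hops > 255 {
			return "", errors.New("too many levels of symbolic links")
		}
		if filepath.IsAbs(target) {
			if inside(base, candidate) {
				resolved = base // rule 2: relative to the base-path
			} else {
				resolved = "/" // rule 1: regular resolution
			}
		}
		// Relative targets continue from the directory holding the link.
		pending = append(strings.Split(target, "/"), pending...)
	}
	return resolved, nil
}

func main() {
	links := mapReadlink(map[string]string{
		"/elsewhere/link-to-subdir-1":  "/somewhere/root/subdir1",
		"/somewhere/root/subdir1/link": "/subdir2",
		"/somewhere/root/subdir1/up":   "../../../etc",
	})
	fmt.Println(resolveWithBase("/somewhere/root", "/elsewhere/link-to-subdir-1/link", links))
	// prints: /somewhere/root/subdir2 <nil>
	fmt.Println(resolveWithBase("/somewhere/root", "/somewhere/root/subdir1/up", links))
	// prints: /somewhere/root/etc <nil>
}
```

Passing osReadlink instead of the in-memory table would run the same walk against the real filesystem. Because a symlink located at the base-path itself counts as inside it, a base such as /proc/PID/root is never dereferenced to / by this walk.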

3. Don't do symbolic link resolution for the base-path in case of /proc/PID/root base paths

Since typically /proc/PID/root will be a symlink to /, doing the symbolic link resolution means that we'll be scanning paths in our own mount namespace, which is not what we want.
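One possible shape for this, sketched as a variant of the NormalizeBaseDirectory function quoted earlier: isProcRootPath and normalizeBase are hypothetical names, and the regular expression is my assumption about which paths should count as procfs roots. Paths beneath /proc/PID/root are returned cleaned but otherwise untouched; symlinks inside them would still need namespace-aware resolution.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
)

// procRootPattern matches /proc/<pid>/root (also self and thread-self)
// and anything beneath it. Assumption: this is the set of paths to exempt.
var procRootPattern = regexp.MustCompile(`^/proc/(\d+|self|thread-self)/root(/.*)?$`)

// isProcRootPath reports whether p points at or below a procfs root link.
func isProcRootPath(p string) bool {
	return procRootPattern.MatchString(filepath.Clean(p))
}

// normalizeBase behaves like the quoted function, except that procfs root
// paths skip filepath.EvalSymlinks so they are not collapsed to "/".
func normalizeBase(base string) (string, error) {
	if base == "" {
		return "", nil
	}
	if isProcRootPath(base) {
		return filepath.Clean(base), nil
	}
	cleanBase, err := filepath.EvalSymlinks(base)
	if err != nil {
		return "", fmt.Errorf("could not evaluate base=%q symlinks: %w", base, err)
	}
	return filepath.Abs(cleanBase)
}

func main() {
	fmt.Println(normalizeBase("/proc/117851/root/"))
	// prints: /proc/117851/root <nil>
	fmt.Println(isProcRootPath("/proc/117851/rootfs"))
	// prints: false
}
```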

4. Don't ignore procfs files if the user explicitly passed in a path like /proc/PID/root/....

Right now, newDirectoryIndexer applies the following path filters prior to any other path filter:

  • requireFileInfo
  • disallowByFileType
  • skipPathsByMountTypeAndName

So if the user passes a file in another mount namespace, it will be ignored since skipPathsByMountTypeAndName ignores procfs entries:

func newPathSkipperFromMounts(root string, infos []*mountinfo.Info) pathSkipper {
	// we're only interested in ignoring the logical filesystems typically found at these mount points:
	// - /proc
	//     - procfs
	//     - proc
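A possible way to express the exemption, as a sketch only: skipFunc and skipMountsUnlessRequested are hypothetical stand-ins and do not reflect the real pathSkipper signature, and the mount point list passed in main is illustrative. The idea is to drop a mount point from the skip list when the user's scan root already lies beneath it.

```go
package main

import (
	"fmt"
	"strings"
)

// skipFunc is a hypothetical stand-in for the indexer's path filter:
// it returns true when a path should be ignored.
type skipFunc func(path string) bool

// under reports whether p is parent itself or lies beneath it.
func under(parent, p string) bool {
	parent = strings.TrimSuffix(parent, "/")
	return p == parent || strings.HasPrefix(p, parent+"/")
}

// skipMountsUnlessRequested ignores everything beneath the given mount
// points, except mount points that contain the scan root: if the user
// explicitly points the scan inside /proc, /proc is no longer skipped.
func skipMountsUnlessRequested(scanRoot string, mountPoints []string) skipFunc {
	var active []string
	for _, mp := range mountPoints {
		if under(mp, scanRoot) {
			continue // the user asked to scan inside this mount
		}
		active = append(active, mp)
	}
	return func(path string) bool {
		for _, mp := range active {
			if under(mp, path) {
				return true
			}
		}
		return false
	}
}

func main() {
	mounts := []string{"/proc", "/sys"} // illustrative list

	skip := skipMountsUnlessRequested("/proc/117851/root/home/amiculas", mounts)
	fmt.Println(skip("/proc/117851/root/home/amiculas/ran")) // prints: false
	fmt.Println(skip("/sys/kernel"))                         // prints: true

	skip = skipMountsUnlessRequested("/", mounts)
	fmt.Println(skip("/proc/1/status")) // prints: true
}
```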

@kzantow does my assessment look right to you?

ariel-miculas avatar Nov 01 '24 10:11 ariel-miculas

Seeing this same issue and would love to be able to scan host files that are mounted while Syft runs in a docker container.

Chris-BMA avatar May 30 '25 15:05 Chris-BMA

I've wrapped up some WIP and am going through old PRs and their respective issues. We'll definitely be picking this one back up on Livestream this week to try to get it over the line and see what's missing.

spiffcs avatar Aug 12 '25 00:08 spiffcs