advisory-db icon indicating copy to clipboard operation
advisory-db copied to clipboard

Would you consider a new informational advisory class, "distributes-executable"?

Open mulkieran opened this issue 1 year ago • 46 comments

The category would be something like "distributes-executable". This came up for me recently because of the issue here: https://github.com/serde-rs/serde/issues/2538, which caused me difficulty in packaging. The particular difficulty here was that our project distributes a vendor tarfile upstream. Then, downstream, the tarfile is used by the packaging system. It would have been helpful to be forewarned about what was going to turn out to be a packaging problem downstream. As it was, the remediation had to occur in the downstream packaging, (or there would have to be a new upstream release).

mulkieran avatar Aug 03 '23 15:08 mulkieran

This seems to go beyond the scope of what's appropriate for a security advisory DB IMO.

On Thu, Aug 3, 2023 at 11:41 AM mulkieran @.***> wrote:

The category would be something like "distributes-executable". This came up for me recently because of the issue here: serde-rs/serde#2538 https://github.com/serde-rs/serde/issues/2538, which caused me difficulty in packaging. The particular difficulty here was that our project distributes a vendor tarfile upstream. Then, downstream, the tarfile is used by the packaging system. It would have been helpful to be forewarned about what was going to turn out to be a packaging problem downstream. As it was, the remediation had to occur in the downstream packaging, (or there would have to be a new upstream release).

— Reply to this email directly, view it on GitHub https://github.com/rustsec/advisory-db/issues/1737, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAAGBG6JXFEABFJKUQJXMDXTPBDLANCNFSM6AAAAAA3C7BIP4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- All that is necessary for evil to succeed is for good people to do nothing.

alex avatar Aug 03 '23 15:08 alex

For examples of additional categories we might consider adding, you can look at the CWE database: https://cwe.mitre.org/

tarcieri avatar Aug 03 '23 15:08 tarcieri

This seems to go beyond the scope of what's appropriate for a security advisory DB IMO.

advisory-db has a category "unmaintained", which I think goes even further beyond security than my proposed category. I think its mission has already crept, witness the "unmaintained" category, and it might be reasonable for it to creep in this direction, also.

mulkieran avatar Aug 03 '23 15:08 mulkieran

unmaintained isn't a "category", it's a class of informational advisory, which is a non-security advisory

tarcieri avatar Aug 03 '23 16:08 tarcieri

unmaintained isn't a "category", it's a class of informational advisory, which is a non-security advisory

Thanks! I've updated the title appropriately.

mulkieran avatar Aug 03 '23 16:08 mulkieran

The existing informational = "notice" class may be more appropriate for this use case

tarcieri avatar Aug 03 '23 17:08 tarcieri

Ok! I've given that a try.

mulkieran avatar Aug 04 '23 01:08 mulkieran

This seems to go beyond the scope of what's appropriate for a security advisory DB IMO.

I disagree; downloading some binary code from the internet and running it during build without the user’s knowledge and consent is a serious security issue that opens the door to various attacks.

jirutka avatar Aug 12 '23 23:08 jirutka

I've strongly disagree with the decision to close this. Distributing an executable that is run without the user knowing is inherently a security vulnerability. Doubly so when the executable cannot be reproduced, as is the case with serde.

jhpratt avatar Aug 18 '23 22:08 jhpratt

This is being actively discussed on the PR adding the actual advisory: https://github.com/rustsec/advisory-db/pull/1738

Reopening since there is actually work being done on an instance of this, which may set a precedent.

Shnatsel avatar Aug 18 '23 22:08 Shnatsel

Thanks for reopening. I think this is a worthwhile proposal.

Does the executable in "distributes-executable" refer specifically to "a file that can immediately be executed", or generally to "artifact containing executable machine code"?

For example which of the following would it be useful for such a notice to cover?—

  1. A shell script containing ASCII shell code, with executable bit set, run at build time
  2. A precompiled dynamically linked .so / .dll that gets linked into the target artifact, or into a build script or macro (https://docs.rs/crate/windows_x86_64_msvc/0.48.5/source/lib/windows.0.48.5.lib)
  3. Any other non-human-readable artifact with security ramification potential

From @mulkieran's description at the very top of this issue, it's not clear which one might have been their immediate motivation. The discussion of vendor tarfiles makes it sound like number 1 is the closest, and would have been equally problematic and worth notifying about?

Others particularly on the PR seem more concerned with number 3.

Obviously number 2 has been standard practice for a long time.

If "distributes-executable" is intended to encompass more than one of these, it would be a good idea to affirm the choice to use a single advisory class for them all, or clarify the names to distinguish what different people are interested in knowing.

dtolnay avatar Aug 19 '23 03:08 dtolnay

I'd appreciate an informational advisory for any non-transparent piece of code placed on the system. This would not require it being labelled executable, yet would include everything from executables to libraries to base64-encoded bash in a non-executable txt file.

kayabaNerve avatar Aug 19 '23 05:08 kayabaNerve

  1. A precompiled dynamically linked .so / .dll that gets linked into the target artifact, or into a build script or macro (https://docs.rs/crate/windows_x86_64_msvc/0.48.5/source/lib/windows.0.48.5.lib)

The .lib files used in the windows crates aren't executable code (that is an annoyingly different type of .lib), they're just import libraries that help the linker. Basically just the equivalent of Apple's .tbd definition files as another reference point. So vendoring something like it doesn't pose any theoretical risks to runtime execution, unlike a staticlib .lib or a .dll. Throw it into IDA for example, there's no executable code inside. Due to all that, windows is not really the best example of including precompiled libraries in the ecosystem.

BlackHoleFox avatar Aug 19 '23 05:08 BlackHoleFox

I'm told that is not a useful distinction. One must do work to ascertain "doesn't pose any theoretical risks to runtime execution", because that is not possible to tell from the crate's human-readable source code in build.rs or lib.rs.


Screenshot from 2023-08-18 23-18-01

dtolnay avatar Aug 19 '23 06:08 dtolnay

That said, it may be possible to automate scanning lib files to make sure they're not doing anything more than defining imports.

ChrisDenton avatar Aug 19 '23 10:08 ChrisDenton

The purpose of the informational advisory would be to call attention that setting up such automation is necessary for this crate. No different from precompiled executable code where you may want automation to recompile from source and compare.

dtolnay avatar Aug 19 '23 14:08 dtolnay

I wonder if a database is even a good solution for this.

We have a security advisory database because it's currently infeasible to reliably detect security issues in an automated manner. By contrast, distributing an executable should be reasonably easy to detect automatically and alert the users without waiting on humans to notice it and file an advisory. Human-filed advisories also invariably miss more obscure crates that humans didn't bother reviewing.

I believe an automated tool that detect binary blobs of executable types in the downloaded code is a better solution for this problem than a database of known instances of it. https://github.com/EmbarkStudios/cargo-deny springs to mind.

Shnatsel avatar Aug 19 '23 14:08 Shnatsel

I believe an automated tool that detect binary blobs of executable types in the downloaded code is a better solution for this problem than a database of known instances of it. https://github.com/EmbarkStudios/cargo-deny springs to mind.

See https://github.com/EmbarkStudios/cargo-deny/issues/43

Nemo157 avatar Aug 19 '23 14:08 Nemo157

For examples of additional categories we might consider adding, you can look at the CWE database: https://cwe.mitre.org/

Some CWEs that are possibly relevant:

joshka avatar Aug 20 '23 02:08 joshka

CWE-348 sounds exactly like the serde discussion.

kayabaNerve avatar Aug 20 '23 03:08 kayabaNerve

Here's the relevant Fedora policy:

https://docs.fedoraproject.org/en-US/packaging-guidelines/what-can-be-packaged/#prebuilt-binaries-or-libraries

(Other distributions have roughly similar policies and cultural tendencies.)

This matches when I would expect to see an advisory.

traviscross avatar Aug 20 '23 05:08 traviscross

I wonder if a database is even a good solution for this.... I believe an automated tool that detect binary blobs of executable types in the downloaded code is a better solution for this problem than a database of known instances of it. https://github.com/EmbarkStudios/cargo-deny springs to mind.

Automation is always good, but there are many ways for pre-built binary code to be included (even non-maliciously) that are difficult to check for automatically. Also, if the local policy is to reject pre-built binaries and to abort if the dependency tree would contain them, it's preferable to do this before downloading the problematic binaries.

The hope, I think, is that this practice is or will become uncommon enough that maintaining a database is not too onerous.

traviscross avatar Aug 20 '23 08:08 traviscross

I disagree; downloading some binary code from the internet and running it during build without the user’s knowledge and consent is a serious security issue that opens the door to various attacks.

Looking closer into the issue, I don't think that's what is happening here? If I understand the code in lib_precompiled.rs correctly, the binary is already part of the crate package. That's not great, but is is a lot better than if the binary was downloaded from the internet -- this way, the binary is part of the integrity checking (with checksums in lockfiles) and we can be sure all of us are getting the same binary. That's worth a lot.

So the issue really is that we cannot inspect what that binary does. "distributes-executable" doesn't quite capture that since arguably if it was a Python script, having the build script run that isn't all that different from just executing the build script itself. "distributes-executable-without-source" or "distributes-unauditable-executable" or so captures the issue better.

RalfJung avatar Aug 20 '23 14:08 RalfJung

@RalfJung please bare with me as i havent had the time to totally analyse the issue:

IIUC, the binary is embedded directly into the .crate file that is uploaded to crates.io. unlike traditional Rust dependencies, where code can be validated from either its repository or from its local distribution (unpacked .crate in ~/.cargo), in the serde_derive case, the source can only be verified online. there's no way to verify that serde is actually using the code that appears in the github repository and not some secret hack that publishes some other code to crates.io. we just have to trust @dtolnay and their password protection to not fuck us completely.

soqb avatar Aug 20 '23 15:08 soqb

@soqb that is why people are discussing the feasibility of reproducible builds. You can verify by running the same steps and getting the same output.

ChrisDenton avatar Aug 20 '23 16:08 ChrisDenton

@soqb that is why people are discussing the feasibility of reproducible builds. You can verify by running the same steps and getting the same output.

i think checksums are too volatile for this. on a local machine the environment might just be too different to reasonably reproduce the exact same binary. (is this true?- i don't really have any knowledge in this area)

users could theoretically use manual methods to test for every vulnerability/attack surface in the database. this project is simply a best-effort trusted global cache of security concerns.

soqb avatar Aug 20 '23 16:08 soqb

IIUC, the binary is embedded directly into the .crate file that is uploaded to crates.io.

Yes. Exactly like the source code for the derive macro.

there's no way to verify that serde is actually using the code that appears in the github repository

Correct. And I said exactly that in my comment.

But above someone claimed that serde would be "downloading some binary code from the internet and running it during build". That is definitely not happening. The only download that is happening is cargo downloading the crate file from crates.io, same as it does for all other crates. All code that is being run was downloaded (and its checksum verified) by cargo, same as for all other crates. I was merely correcting that false statement (which stood uncontested as far as I can see). I do still agree that there is a problem here, but the problem is less big than I originally thought.

Even if we don't like what serde is doing, we surely shouldn't spread or accept false claims about what it is doing.

RalfJung avatar Aug 20 '23 16:08 RalfJung

Reviewing these CWEs I don't think any apply:

CWE-829: Inclusion of Functionality from Untrusted Control Sphere

This seems to be about embedding 3rd party code that winds up as trusted as the first-party program, and that was unexpected (i.e. the third party code wasn't properly reviewed or intended for that safety domain). But in this case, there is no third party code as both the source and binary originate with the author.

CWE-494: Download of Code Without Integrity Check The product downloads source code or an executable from a remote location and executes the code without sufficiently verifying the origin and integrity of the code.

The binary is distributed with the crate and its integrity is checked in the same manner as the source code (SHA-256 digest in Cargo.lock)

CWE-348: Use of Less Trusted Source

The example is using an untrusted HTTP header instead of a trusted one. I don't think this applies.

tarcieri avatar Aug 20 '23 17:08 tarcieri

@RalfJung am i misunderstanding it, or is the checksum (at least sometimes) generated by crates.io based on purely the .crate that is uploaded? IIUC, the scope of the checksum is limited to verifying the continuous integrity of downloads from crates.io and makes no guarantee about how accurate crates.io is to the crate's code in its public repository.

soqb avatar Aug 20 '23 18:08 soqb

@soqb https://docs.rs allows you to view/link to canonical crate source code.

GitHub is a 3rd party service so it’s not possible to control what code is hosted there.

tarcieri avatar Aug 20 '23 18:08 tarcieri