coreutils Start fuzz testing

We should start fuzzing to find security vulnerabilities at some point. It may be easier to do this after #1124 is done.

Mar 06 '18 01:03 Arcterus

Personal recommendations:

- Consider using https://github.com/AltSysrq/proptest
- Consider creating a POSIX-specific fuzzing crate with proptest utility types, so that some generalization is possible across all the utilities. For example:
  - "table-like" text output with restricted values for rows, header, etc.
  - a system-file-specific fuzzer, i.e. one that generalizes the format of /proc/*/* items.
  - etc.
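To make the "table-like" generator idea a bit more concrete, here is a minimal sketch. It deliberately uses only the standard library with a hand-rolled xorshift PRNG, so it is not proptest's API. The names `XorShift64`, `gen_table` and `is_rectangular`, and the restricted cell alphabet, are all assumptions made up for illustration. In a real setup the generator would more likely be expressed as a proptest strategy, so that failing inputs get shrunk automatically.

```rust
/// Tiny deterministic PRNG (xorshift64) so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in 0..n for n > 0 (slightly biased, which is fine for a sketch).
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Hypothetical generator: one header row plus `rows` data rows,
/// space separated, every cell drawn from a restricted set of values.
fn gen_table(rng: &mut XorShift64, columns: usize, rows: usize) -> String {
    const CELLS: [&str; 5] = ["0", "1", "42", "a", "root"];
    let mut out = String::new();
    let header: Vec<String> = (0..columns).map(|i| format!("COL{}", i)).collect();
    out.push_str(&header.join(" "));
    out.push('\n');
    for _ in 0..rows {
        let row: Vec<&str> = (0..columns)
            .map(|_| CELLS[rng.below(CELLS.len())])
            .collect();
        out.push_str(&row.join(" "));
        out.push('\n');
    }
    out
}

/// Example property: every line has exactly `columns` fields.
fn is_rectangular(table: &str, columns: usize) -> bool {
    table.lines().all(|l| l.split_whitespace().count() == columns)
}
```

A generated table could then be piped into a utility that consumes column-oriented text, with a property like `is_rectangular` checked on the output.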

Aug 05 '19 23:08 vitiral

I'm working on getting the GNU Coreutils scripts to test the Rust binaries. Before fuzzing I would like to formalize the parsers to remove shotgun parse bugs that might be lurking. Several fuzz rigs for coreutils binaries already exist: https://scholar.google.com/scholar?q=coreutils+fuzz&hl=en&as_sdt=0,16

I would also like to diff the strace output of the GNU coreutils binaries against that of this project. Those differences need to be sifted through to eliminate security issues.
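One rough way to start sifting through such a diff is to compare how often each syscall appears in the two traces. The sketch below assumes the traces were captured as plain text (for example with `strace -o`), one call per line in the usual `name(args) = result` form. It does not handle PID prefixes from `strace -f` or unfinished/resumed calls. The function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Extract the syscall name from one strace line, e.g. "close(3) = 0" -> "close".
/// Lines that do not start with an identifier followed by '(' (signals, exit
/// markers, and so on) yield None.
fn syscall_name(line: &str) -> Option<&str> {
    let (name, _) = line.split_once('(')?;
    let ok = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if ok {
        Some(name)
    } else {
        None
    }
}

/// Count how often each syscall occurs in a trace.
fn syscall_counts(trace: &str) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for name in trace.lines().filter_map(syscall_name) {
        *counts.entry(name).or_insert(0) += 1;
    }
    counts
}

/// Return (syscall, count in GNU trace, count in Rust trace) for every
/// syscall whose counts differ, sorted by name.
fn diff_counts(gnu: &str, rust: &str) -> Vec<(String, usize, usize)> {
    let a = syscall_counts(gnu);
    let b = syscall_counts(rust);
    let mut names: Vec<&str> = a.keys().chain(b.keys()).copied().collect();
    names.sort();
    names.dedup();
    names
        .into_iter()
        .filter_map(|n| {
            let x = a.get(n).copied().unwrap_or(0);
            let y = b.get(n).copied().unwrap_or(0);
            if x != y {
                Some((n.to_string(), x, y))
            } else {
                None
            }
        })
        .collect()
}
```

Counting ignores ordering and arguments, so it only flags coarse differences (a missing `close`, an extra `read`). Anything it reports would still need to be inspected by hand.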

Aug 02 '20 22:08 chadbrewbaker

This is choice. In addition to routine fuzzing, there are also scores of environment variables to fuzz if we support them all: https://github.com/uutils/coreutils/issues/1582

Aug 05 '20 01:08 chadbrewbaker

Something that might be of interest is https://github.com/dyninst/fuzz, which was written for an academic paper (ftp://ftp.cs.wisc.edu/paradyn/technical_papers/fuzz2020.pdf). It found problems in fold (#1680), comm (I think just an inability to handle invalid UTF-8 input) and ptx (I haven’t looked into that one yet). It’s limited to checking for hangs/crashes, though, and wouldn’t be a good fit for automated testing due to needing a lot of time to generate test cases up front.
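For readers unfamiliar with this style of fuzzing, the crash-checking half can be sketched in a few lines: throw random bytes at a routine and record every input that makes it panic. This is only an in-process illustration, not how the dyninst tool works. `fold_bytes` is a hypothetical stand-in, not uutils code, and hangs are not detected here (that would need a child process with a timeout).

```rust
use std::panic;

/// Hypothetical stand-in for a utility's core routine (NOT uutils code).
/// Wraps lines at `width` bytes and treats input as raw bytes, so invalid
/// UTF-8 is not a problem.
fn fold_bytes(input: &[u8], width: usize) -> Vec<u8> {
    assert!(width > 0, "width must be positive");
    let mut out = Vec::with_capacity(input.len() + input.len() / width);
    let mut col = 0;
    for &b in input {
        if b == b'\n' {
            col = 0;
        } else if col == width {
            // Current line is full: break it before emitting the next byte.
            out.push(b'\n');
            col = 0;
        }
        out.push(b);
        if b != b'\n' {
            col += 1;
        }
    }
    out
}

/// Produce `len` pseudo-random bytes from a xorshift64 state.
fn random_bytes(state: &mut u64, len: usize) -> Vec<u8> {
    (0..len)
        .map(|_| {
            let mut x = *state;
            x ^= x << 13;
            x ^= x >> 7;
            x ^= x << 17;
            *state = x;
            (x >> 24) as u8
        })
        .collect()
}

/// Run `target` on `iterations` random inputs (lengths cycle through 0..64)
/// and return the inputs that made it panic.
fn find_crashes<F>(target: F, iterations: usize, seed: u64) -> Vec<Vec<u8>>
where
    F: Fn(&[u8]),
{
    let mut state = seed.max(1); // xorshift state must be non-zero
    let mut crashes = Vec::new();
    for i in 0..iterations {
        let input = random_bytes(&mut state, i % 64);
        let result = panic::catch_unwind(panic::AssertUnwindSafe(|| target(&input)));
        if result.is_err() {
            crashes.push(input);
        }
    }
    crashes
}
```

A target that insists on valid UTF-8, for example by calling `std::str::from_utf8(data).unwrap()`, would be expected to show up in the crash list quickly, which is roughly the class of problem described for comm. Each caught panic still prints its message to stderr unless a panic hook is installed.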

Jan 04 '21 20:01 jaggededgedjustice

> wouldn’t be a good fit for automated testing due to needing a lot of time to generate test cases up front.

Sure, but we could use https://github.com/google/oss-fuzz for continuous testing.

Jan 04 '21 20:01 sylvestre

I haven't dealt with testing much in my life, but I'm increasingly fascinated by fuzz testing and would love to have some "real" project (nothing academic) to pick up on it. I started reading the Fuzzing Book a while ago in case that helps.

Should one try to integrate the rust coreutils in some of the existing fuzzing setups that @chadbrewbaker mentioned, or do something from scratch?

Aug 25 '21 13:08 Funky185540

This book is indeed a great start.

See: https://google.github.io/oss-fuzz/getting-started/new-project-guide/rust-lang/

Some examples of Rust crates: https://github.com/google/oss-fuzz/blob/c25ca964c4363e6bcd5eab86c7d42d1a305d6a97/projects/flate2-rs/project.yaml https://github.com/google/oss-fuzz/tree/30f3a8f1c0f5b072e77d5bea82709db04c53453d/projects/mp4parse-rust

And https://github.com/google/oss-fuzz/pull/118

Aug 25 '21 13:08 sylvestre

And the effort of @alastairreid at Google Research:

https://project-oak.github.io/rust-verification-tools/2021/07/14/coreutils.html

Aug 25 '21 13:08 sylvestre

Impressive, a finished step-by-step tutorial... I'll look into that, thanks a lot!

Aug 25 '21 13:08 Funky185540

I think the status of my work is as follows... It's a useful start to show that symbolic execution tools can be used and can catch bugs in code of the scale/complexity of coreutils. But at the moment you will probably get more out of fuzzing. Using PropTest (mentioned above) is a good way to do this, because our tools reimplement the PropTest API (although that is not shown in my tutorial).

I think this can/will change in the future, though, so it is worth keeping an eye on:

- Hybrid fuzzers that combine symbolic execution with fuzzing: fuzzing 90% of the time but using symbolic execution to hit code that fuzzing is struggling to cover.
- Faster symbolic execution tools like SymCC. (I have not had a chance to try SymCC yet.)
- Tools like KLEE (the one I used) and RMC. (RMC builds on the mature, solid CBMC tool. RMC is probably not yet usable on CoreUtils but it is improving very rapidly.)

The good news is that the basic steps of generating LLVM bitcode files are common to several of these (except RMC), so many of the steps in the tutorial will more or less just work whichever tool you use.

Aug 26 '21 09:08 alastairreid

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Aug 31 '22 06:08 stale[bot]

The blocking issue was a horribly broken ARM build chain that has been fixed in Ubuntu 22. I’ll try to build Alastair’s toolchain again this week.

Aug 31 '22 14:08 chadbrewbaker

On the same subject, this issue is very interesting too: https://github.com/uutils/coreutils/issues/3785

Aug 31 '22 14:08 sylvestre

The current issue title is "Start fuzz testing". From #5343 I understand that fuzz testing has been started.

My actual question: "Should this issue be closed?"

Oct 09 '23 17:10 stappersg

Thanks for the hint, closing this ticket.

Oct 10 '23 05:10 cakebaker

@stappersg you are everywhere ;)

Oct 10 '23 11:10 sylvestre

On Tue, Oct 10, 2023 at 04:21:53AM -0700, Sylvestre Ledru wrote:

> @stappersg you are everywhere ;)

(-: indeed, I should focus more

Groeten Geert Stappers

Silence is hard to parse

Oct 10 '23 11:10 stappersg
