klee.github.io
klee.github.io copied to clipboard
Website doesn't easily answer: "What can KLEE do for me?"
Problem: It is difficult for readers who don't know what a "dynamic symbolic execution engine" is to figure out what KLEE can do, what it is useful for, and whether or not they would benefit from using it.
Explanation The current header on http://klee.github.io/ reads:
KLEE is a dynamic symbolic execution engine built on top of the LLVM compiler infrastructure, and available under the UIUC open source license. For more information on what KLEE is and what it can do, see the OSDI 2008 paper.
At this point a reader (probably) knows KLEE is a programmer's tool. If they doen't know what a "Dynamic Symbolic Execution Engine" they have been encouraged to read a block of unstyled text which has a few helpful snippets buried in it:
We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs.
We also used KLEE as a bug finding tool, applying it to 452 applications (over 430K total lines of code), where it found 56 serious bugs
This tells the reader that KLEE is a programmer's tool that generates tests to find bugs in code. From here, they can click through to a 16-page academic paper, which explains how KLEE works and describes a few case studies where it was used to find bugs in C code. This document is obviously not written for non-academic programmers, and does not readily answer "What can KLEE do for me?"
Going back to http://klee.github.io/, there are several other prevalent and possibly-helpful-sounding links:
- Getting Started and Documentation describe how to install, run, and use KLEE, but assume the reader already know they want to do these things.
- Tutorials, the first of which says that KLEE will "exercise all paths through the code" in a simple C program, and that the program runs "as expected". Most of the tutorials are hosted on other websites and written by other people.
- Projects, a list of ongoing improvements to KLEE. This page does helpfully list some things KLEE doesn't currently work with.
- Getting Involved is for people who want to become KLEE developers, not KLEE users.
-
Run small examples in your browser. All three examples are in C. The first example finds no bugs. The second example prints (at the bottom of 6578 warnings!) two
memory error on line XX
s with no further explanation about what kind of errors these are.
Proposed Solution Compose a description of what KLEE does and is useful for, and either include it on http://klee.github.io/, or on a page which is prominently linked there.
The second tutorial has, in the middle, this very helpful list of things KLEE can and can't do:
Some types of errors KLEE detects include:
- ptr: Stores or loads of invalid memory locations.
- free: Double or invalid free().
- abort: The program called abort().
- assert: An assertion failed.
- div: A division or modulus by zero was detected.
- user: There is a problem with the input (invalid klee intrinsic calls) or the way KLEE is being used.
- exec: There was a problem which prevented KLEE from executing the program; for example an unknown instruction, a call to an invalid function pointer, or inline assembly.
- model: KLEE was unable to keep full precision and is only exploring parts of the program state. For example, symbolic sizes to malloc are not currently supported, in such cases KLEE will concretize the argument.
Some of the external tutorials have pretty decent brief descriptions of what KLEE is and does. They're good examples of what I think should be on http://klee.github.io/
-
KLEE is a symbolic execution engine that can be used to automate test-case generation as well as be used to find bugs.
-
KLEE is a symbolic interpreter of LLVM bitcode. It runs code compiled/assembled into LLVM symbolically. That’s running a program considering its input(or some other variables) to be symbols instead of concrete values like 100 or “cacho”. In very few words, a symbolic execution runs through the code propagating symbols and conditions; forking execution at symbol dependant branches and asking the companion SMT solver for path feasibility or counter-examples. For more info on this check out this, this, or even this.
-
Keygenning with KLEE and Hex-Rays isn't exactly brief, but it has a very helpful list of strengths and weaknesses of symbolic execution as a means of finding bugs:
strengths:
- when a test case fails, the program is proven to be incorrect;
- automatic test cases catch errors that often are overlooked in manual written test cases (this is from KLEE paper);
- when it works it's cool :) (and this is from Jérémy);
weaknesses:
- when no tests fail we are not sure everything is correct, because no proof of correctness is given; static analysis can do that when it works (and often it does not!);
- covering all the paths is not enough, because a variable can hold different values in one path and only some of them cause a bug;
- complete coverage for non trivial programs is often impossible, due to path explosion or constraint solver timeout;
- scaling is difficult, and execution time of the engine can suffer;
- undefined behavior of CPU could lead to unexpected results;
- ... and maybe there are a lot more remarks to add.
Nearly all of the tutorials and examples are about using KLEE to study C code compiled by Clang. I was starting to wonder if this was the only LLVM frontend that KLEE worked with, when I found two other examples:
- Rust's propverify tool uses KLEE, and you can also run KLEE directly on Rust
- KLEE can test interpreters (in this case a regex engine) written in C (page 155, "6.1 KLEE")
Is KLEE intended primarily to be used against C code, or is it intended to work with any language that can be compiled by LLVM? This should also be addressed by the proposed addition.
@AVHon we are of course happy to accept improvements to the KLEE website. I don't agree with all the suggestions, but I am happy to discuss them in the context of specific PRs.