clang-mutate: Manipulate C-family ASTs with Clang
This tool provides an interface for performing operations on C and C++ source files. Using the clang LibTooling interface, it obtains an abstract syntax tree (AST) representation of the source and enables inspection or alteration of the source ASTs.
Dependencies
clang-mutate depends on
clang/llvm version 6.0.X.
Precompiled binaries may be downloaded for the platform of your choice;
once downloaded, ensure clang is on your $PATH.
Beyond clang, we use zlib and tinfo. Additionally, we use
pandoc to generate documentation. These
tools may be installed via the package manager of your choice.
On Debian systems, sudo apt-get install libtinfo-dev zlib1g-dev pandoc
will install these dependencies.
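Before building, it can help to confirm that the clang found on $PATH is a 6.0.X release. The following Python sketch is not part of clang-mutate; the helper names `clang_version` and `check_clang` are made up for illustration, and it assumes that `clang --version` prints a banner containing "clang version MAJOR.MINOR".

```python
import re
import shutil
import subprocess


def clang_version(banner):
    """Extract (major, minor) from a `clang --version` banner, or None."""
    match = re.search(r"clang version (\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def check_clang():
    """Return a short human-readable status for the clang on $PATH."""
    path = shutil.which("clang")
    if path is None:
        return "clang not found on $PATH"
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    version = clang_version(result.stdout)
    if version != (6, 0):
        return f"found clang {version} at {path}, expected 6.0.X"
    return f"ok: {path}"
```

Calling `print(check_clang())` reports which of the three cases applies on the current machine.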
Inspection operations
clang-mutate can output the ASTs for input source file(s) formatted
either as JSON or S-expressions. Output ASTs provide source
information to enable subsequent source rewriting by external tools.
Alternatively, it can be launched in interactive mode, and ASTs can be
explored in a plain text format more similar to source code.
Some of the inspection operations are:
-annotate
: Display the source, annotating statements with their IDs
and AST classes
-ids
: Display the number of statements in the source
-list
: List every statement's ID, AST class, and range
-number
: Display the source, annotating statements with their IDs
-number-full
: Display the source, annotating full statements with their IDs
-cfg
: Include control-flow information in ASTs
Refer to the manual for complete documentation of available commands.
Mutation operands
In clang-mutate, the mutation operands and operators are specified
as separate flags. The stmt1 and stmt2 operands each specify the
numeric ID of an AST node to which an operator is applied. The
value1 and value2 operands are strings; the file1 and file2
operands instead name a file whose contents are used as the string
for value1 and value2, respectively.
Mutation operations
Some of the mutation operations that are available include:
-cut
: Cut stmt1 from the source
-insert
: Insert value1 before stmt1
-set
: Set the text of stmt1 to value1
-swap
: Swap stmt1 and stmt2 in the source
Refer to the manual for complete documentation of available mutation commands.
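To show how an operator flag combines with its operand flags, here is a minimal Python sketch that assembles a clang-mutate command line. The helper name `build_command` is hypothetical. The sketch assumes every operand is written as `-name=value`, which is the form the examples in this README use for stmt1 and value1; whether the other operands follow the same spelling should be checked against the manual.

```python
import subprocess


def build_command(operator, source, **operands):
    """Build the argv list for a single clang-mutate mutation.

    operator: operator name without the dash, e.g. "cut" or "swap".
    operands: operand flags such as stmt1=11 or value1='"good bye"'
              (assumed to all take the -name=value form).
    """
    argv = ["clang-mutate", f"-{operator}"]
    for name, value in operands.items():
        argv.append(f"-{name}={value}")
    # The trailing "--" ends the tool's own arguments, as in the README examples.
    argv += [source, "--"]
    return argv


def run(argv):
    """Run the command and return the mutated source printed on stdout."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


cut = build_command("cut", "etc/hello.c", stmt1=11)
swap = build_command("swap", "etc/hello.c", stmt1=5, stmt2=11)
```

Passing `cut` to `run` requires clang-mutate to be installed. Building the list separately keeps the operand handling easy to inspect without invoking the tool.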
Installation
As mentioned above, clang-mutate uses the clang LibTooling
interface, which means that the clang-mutate executable works best
when run from the same directory as clang. Run "make install" to build
and install.
A PKGBUILD file is provided for installation on Arch Linux systems.
clang-mutate has only been tested on Linux, although we know of no
reason it should not work on other operating systems. If you encounter
issues running on a different OS, please let us know by filing an
issue report or a merge request. We're happy to incorporate changes
that will enable more general execution.
Examples
The following running example uses the file etc/hello.c. Here is the source:
$ cat etc/hello.c
#include <stdio.h>
int main(int argc, char *argv[])
{
puts("hello");
return 0;
}
Count the number of statements:
$ clang-mutate -ids etc/hello.c --
12
Print the source, annotated with statement IDs:
$ clang-mutate -number etc/hello.c --
#include <stdio.h>
/* 0.1[ */int main(/* 0.2[ */int argc,/* ]0.2 */ /* 0.3[ */char *argv[]/* ]0.3 */)
/* 0.4[ */{
/* 0.5[ *//* 0.6[ *//* 0.7[ */puts/* ]0.7 *//* ]0.6 */(/* 0.8[ *//* 0.9[ *//* 0.10[ */"hello"/* ]0.10 *//* ]0.9 *//* ]0.8 */);/* ]0.5 */
/* 0.11[ */return /* 0.12[ */0/* ]0.12 */;/* ]0.11 */
}/* ]0.4 */
/* ]0.1 */
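The annotation comments in this output can be read back mechanically, which helps when picking an ID to pass as stmt1. The Python sketch below is not part of clang-mutate: it assumes the markers always have the shapes seen above (`/* 0.N[ */` to open statement N and `/* ]0.N */` to close it) and treats the number after the dot as the statement ID. The function name `parse_numbered` is made up for illustration.

```python
import re

# Matches an opening marker "/* 0.N[ */" or a closing marker "/* ]0.N */".
MARKER = re.compile(r"/\* (?:(\d+)\.(\d+)\[|\](\d+)\.(\d+)) \*/")

ANNOTATED = """#include <stdio.h>
/* 0.1[ */int main(/* 0.2[ */int argc,/* ]0.2 */ /* 0.3[ */char *argv[]/* ]0.3 */)
/* 0.4[ */{
/* 0.5[ *//* 0.6[ *//* 0.7[ */puts/* ]0.7 *//* ]0.6 */(/* 0.8[ *//* 0.9[ *//* 0.10[ */"hello"/* ]0.10 *//* ]0.9 *//* ]0.8 */);/* ]0.5 */
/* 0.11[ */return /* 0.12[ */0/* ]0.12 */;/* ]0.11 */
}/* ]0.4 */
/* ]0.1 */
"""


def parse_numbered(annotated):
    """Return (plain_source, {statement_id: statement_text})."""
    pieces = []   # source text between markers, in order
    length = 0    # length of the plain text accumulated so far
    starts = {}   # statement ID -> offset where it opens in the plain text
    spans = {}    # statement ID -> (start, end) offsets in the plain text
    pos = 0
    for match in MARKER.finditer(annotated):
        chunk = annotated[pos:match.start()]
        pieces.append(chunk)
        length += len(chunk)
        pos = match.end()
        if match.group(2) is not None:      # opening marker
            starts[int(match.group(2))] = length
        else:                               # closing marker
            stmt_id = int(match.group(4))
            spans[stmt_id] = (starts.pop(stmt_id), length)
    pieces.append(annotated[pos:])
    plain = "".join(pieces)
    return plain, {sid: plain[a:b] for sid, (a, b) in spans.items()}


plain, statements = parse_numbered(ANNOTATED)
for stmt_id in sorted(statements):
    print(stmt_id, repr(statements[stmt_id]))
```

With the text recovered per ID, it is easy to see that statement 11 is the whole return statement while statement 12 is only the literal 0.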
Cut the return statement:
$ clang-mutate -cut -stmt1=11 etc/hello.c --
#include <stdio.h>
int main(int argc, char *argv[])
{
puts("hello");
}
Replace the string "hello" with the string "good bye":
$ clang-mutate -set -stmt1=9 -value1='"good bye"' etc/hello.c --
#include <stdio.h>
int main(int argc, char *argv[])
{
puts("good bye");
return 0;
}
Use the tools/perforate-loops script to halve the number of loop iterations by
replacing ++ or -- with +=2 or -=2, respectively:
$ make etc/loop
cc etc/loop.c -o etc/loop
$ ./etc/loop
hello 0 10
hello 1 9
hello 2 8
hello 3 7
hello 4 6
hello 5 5
hello 6 4
hello 7 3
hello 8 2
hello 9 1
$ ./tools/perforate-loops etc/loop.c > /tmp/faster.c
$ make /tmp/faster
cc /tmp/faster.c -o /tmp/faster
$ /tmp/faster
hello 0 10
hello 2 8
hello 4 6
hello 6 4
hello 8 2
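As a rough illustration of the rewrite described above, the following Python sketch applies the same substitution purely at the text level with a regular expression. This is a swapped-in technique, not how tools/perforate-loops works internally, and the C loop in `LOOP` is a hypothetical stand-in since the contents of etc/loop.c are not shown here. It handles only postfix `++` and `--`, and it would also rewrite them outside loop headers.

```python
import re

# An identifier followed by a postfix ++ or --.
STEP = re.compile(r"\b([A-Za-z_]\w*)\s*(\+\+|--)")

# Hypothetical loop, chosen only for illustration.
LOOP = r'''for (int i = 0, j = 10; i < 10; i++, j--)
    printf("hello %d %d\n", i, j);
'''


def perforate(source):
    """Replace every postfix ++ with +=2 and every postfix -- with -=2."""
    def widen(match):
        op = "+=" if match.group(2) == "++" else "-="
        return f"{match.group(1)}{op}2"
    return STEP.sub(widen, source)


perforated = perforate(LOOP)
print(perforated)
```

Working on the AST, as the real script does through clang-mutate, avoids the false matches a text-level pattern like this one can produce inside strings or comments.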
Use the tools/strings-to script to replace string literals:
$ ./tools/strings-to etc/hello.c foo
#include <stdio.h>
int main(int argc, char *argv[])
{
puts("foo");
return 0;
}