Datashield
License
For LLVM see: compiler/llvm/LICENSE.txt
For our code: this code is dual-licensed under the LLVM license (University of Illinois/NCSA Open Source License) and the GPLv3. Choose whatever fits your requirements.
Building the Compiler
0. Setup and environment
We built DataShield in 2016, so baseline your software around that date. To reproduce our experiments, use:
- Ubuntu 16.04 LTS
- clang-3.9, ninja-build, cmake, binutils-dev (of Ubuntu 16.04)
- If you have more than one clang installed, either adapt the makefiles or use update-alternatives clang to set clang to clang-3.9 and clang++ to clang++-3.9
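One way to register the 3.9 binaries with update-alternatives is sketched below. It only prints the commands so you can review them before running anything. The /usr/bin paths and the priority of 100 are assumptions about a stock Ubuntu 16.04 install, not something this repo prescribes, and clang_alternatives_cmds is a hypothetical helper name.

```shell
# Print (do not run) the update-alternatives commands for a given clang version.
# Assumption: the versioned binaries live in /usr/bin (e.g. /usr/bin/clang-3.9).
clang_alternatives_cmds() {
  ver="${1:-3.9}"
  for tool in clang clang++; do
    echo "sudo update-alternatives --install /usr/bin/$tool $tool /usr/bin/$tool-$ver 100"
  done
}

clang_alternatives_cmds 3.9
```

After running the printed commands, check the result with clang --version.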
1. Clone this repo
Clone this repo into $HOME/research
There are a lot of build scripts to glue everything together and make the build reproducible. The scripts all assume you cloned in the above directory.
cd ~
mkdir research
cd research
git clone git@github.com:HexHive/datashield.git
2. Build the compiler
The compiler is built using the normal LLVM build process. Consult the LLVM documentation if you have trouble.
You may use my build scripts or change the options if you know what you're doing.
DataShield has 3 different configurations:
- debug - debug info, unoptimized, with instrumentation
- baseline - optimized, no instrumentation
- release - optimized, with instrumentation
If you just want to experiment with DataShield, debug is probably the best choice:
cd ~/research/datashield/compiler
mkdir build-debug
cd build-debug
../lto_cmake_debug.sh
ninja install
You can build release if you care about compile times:
cd ~/research/datashield/compiler
mkdir build-release
cd build-release
../lto_cmake_release.sh
ninja install
Baseline is the same as release for the compiler since the compiler itself is not instrumented, but you need to build it if you want a baseline comparison for benchmarking:
cd ~/research/datashield/compiler
mkdir build-baseline
cd build-baseline
../lto_cmake_baseline.sh
ninja install
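The three builds above differ only in the configuration name, so they can be wrapped in one function. This is a minimal sketch, not part of the repo: build_compiler and the DS_ROOT override are hypothetical names, and the default path is the one the build scripts assume.

```shell
# Build one compiler configuration: debug, baseline, or release.
# DS_ROOT is a hypothetical override; by default the repo is expected
# in $HOME/research/datashield, as the build scripts assume.
build_compiler() {
  config="${1:-}"
  case "$config" in
    debug|baseline|release) ;;
    *) echo "usage: build_compiler debug|baseline|release" >&2; return 1 ;;
  esac
  (
    # Run in a subshell so the caller's working directory is unchanged.
    cd "${DS_ROOT:-$HOME/research/datashield}/compiler" &&
    mkdir -p "build-$config" &&
    cd "build-$config" &&
    "../lto_cmake_$config.sh" &&
    ninja install
  )
}

# Example: build_compiler debug
```

The function stops at the first failing step, so a broken cmake run never reaches ninja install.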
3. Build libc
To build a configuration, run build-$configuration.py:
cd $HOME/research/datashield/libc
./build-debug.py
./build-baseline.py
./build-release.py
The scripts create a different install directory for each configuration, so you don't have to rebuild every time you want to test a different configuration. They are:
$HOME/research/datashield/ds_sysroot_debug
$HOME/research/datashield/ds_sysroot_baseline
$HOME/research/datashield/ds_sysroot_release
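To see which configurations have already been built, you can check for the sysroot directories listed above. A small sketch follows; ds_sysroot is a hypothetical helper name, and the paths are the ones listed above.

```shell
# Map a configuration name to its sysroot directory.
ds_sysroot() {
  echo "$HOME/research/datashield/ds_sysroot_$1"
}

# Report which of the three sysroots exist on this machine.
for config in debug baseline release; do
  dir="$(ds_sysroot "$config")"
  if [ -d "$dir" ]; then
    echo "$config: found $dir"
  else
    echo "$config: missing $dir"
  fi
done
```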
4. Build libcxx
Building libcxx is basically the same as building libc. It has the same three configurations. Running build.py <config> builds everything. Otherwise run build.py with no arguments for a help message.
cd $HOME/research/datashield/libcxx
./build.py <config>
Compiling Instrumented Programs
You need a lot of options to be able to build with our custom libc, libcxx, and
various protections. There are scripts in $HOME/research/datashield/bin that make this much easier.
The scripts directory ($HOME/research/datashield/bin) needs to be in your PATH for the scripts to work.
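For the current shell session, that can be done as shown below. To make it permanent, add the same line to your shell startup file (for example ~/.bashrc, assuming you use bash).

```shell
# Prepend the DataShield wrapper scripts to PATH for this session.
export PATH="$HOME/research/datashield/bin:$PATH"
```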
Build Hello World
First, you should build a "Hello World" program to make sure your build is sane.
cd $HOME/research/datashield/test/hand-written/hello_world
make
./test
It should print out a whole bunch of log information and "hello world" and "good bye." If not, something is seriously wrong and you should create a GitHub issue.
If you look at
$HOME/research/datashield/test/hand-written/hello_world/Makefile you will see
that there are multiple options for the variable CC. To choose which set of
options you want, change CC to one of the scripts in
$HOME/research/datashield/bin. They all start with musl-clang-*.
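To see which wrapper scripts are available without opening the Makefile, you can list them. The sketch below is not part of the repo; list_cc_wrappers is a hypothetical helper name.

```shell
# List the musl-clang-* wrapper scripts in a directory
# (defaults to the DataShield bin directory).
list_cc_wrappers() {
  dir="${1:-$HOME/research/datashield/bin}"
  for f in "$dir"/musl-clang-*; do
    # Skip the unexpanded pattern when nothing matches.
    if [ -e "$f" ]; then
      basename "$f"
    fi
  done
  return 0
}

list_cc_wrappers
```

With GNU make you can also override CC on the command line, as in make CC=<one of the listed scripts>, instead of editing the Makefile.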
Build Hello World in C++
Building C++ (versus C) is basically the same, but you need to use the C++ scripts. It's a good idea to build "Hello World" in both C and C++ as a sanity check.
cd $HOME/research/datashield/test/hand-written/hello_worldxx
make
./test
Compiler options
You should use the scripts, but if you need to change something and know what you're doing, you can give the options manually. These are the options that the scripts set up:
- -datashield-lto: enables the datashield pass (required)
- -T../../linker/linker_script.lds: this passes our script to the linker (required)
- -datashield-debug-mode: prints debug logs at runtime
- -datashield-save-module-after: saves the compiled module to a file after datashield's transformation
- -datashield-save-module-before: saves the compiled module to a file before datashield's transformation
- -debug-only=datashield: prints debug logs at compile time
The following are mutually exclusive:
- -datashield-use-mask: use the software mask coarse bounds check options
- -datashield-use-prefix-check: give this option if you want prefix or MPX
The following are mutually exclusive:
- -datashield-use-prefix: give this option if you want prefix or MPX
- -datashield-use-late-mpx: give this option if you want prefix or MPX
Must be used with -datashield-use-mask:
- -datashield-intergity-only-mode: only protect stores
- -datashield-confidentiality-only-mode: only protect loads
- -datashield-separation-mode: basic arithmetic does not propagate sensitivity
Two options for compiling system libraries:
- -datashield-library-mode: for compiling libraries with sandboxing only
- -datashield-modular: run the pass without LTO
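If you do assemble options by hand, a small check can catch conflicting combinations before you invoke the compiler. The sketch below only encodes the constraints listed above; it does not know how the flags reach the pass. has_flag and check_ds_flags are hypothetical helpers, and it assumes that "Must be used with -datashield-use-mask" applies to the three mode options listed after it.

```shell
# has_flag FLAG ARGS... : succeeds if FLAG appears among ARGS.
has_flag() {
  needle="$1"; shift
  for arg in "$@"; do
    if [ "$arg" = "$needle" ]; then
      return 0
    fi
  done
  return 1
}

# check_ds_flags ARGS... : fails if the flags violate the documented constraints.
check_ds_flags() {
  if has_flag -datashield-use-mask "$@" && has_flag -datashield-use-prefix-check "$@"; then
    echo "error: -datashield-use-mask and -datashield-use-prefix-check are mutually exclusive" >&2
    return 1
  fi
  if has_flag -datashield-use-prefix "$@" && has_flag -datashield-use-late-mpx "$@"; then
    echo "error: -datashield-use-prefix and -datashield-use-late-mpx are mutually exclusive" >&2
    return 1
  fi
  # Assumption: the mode options are the ones that require -datashield-use-mask.
  for mode in -datashield-intergity-only-mode \
              -datashield-confidentiality-only-mode \
              -datashield-separation-mode; do
    if has_flag "$mode" "$@" && ! has_flag -datashield-use-mask "$@"; then
      echo "error: $mode requires -datashield-use-mask" >&2
      return 1
    fi
  done
  return 0
}

# Example: check_ds_flags -datashield-lto -datashield-use-mask -datashield-separation-mode
```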