Dependency
1. what the artifact does
This artifact accompanies the paper "Demystifying the Dependency Challenge in Kernel Fuzzing". Fuzz testing operating system kernels remains a daunting task. One known challenge is that much of the kernel code is locked behind specific kernel states, and current kernel fuzzers are not effective at exploring such an enormous state space. We refer to this problem as the dependency challenge. Although some efforts have tried to address it, the prevalence and categorization of dependencies have never been studied; most prior work simply recovered dependencies opportunistically whenever they were relatively easy to recognize. We undertake a substantial measurement study to systematically understand the real challenge behind dependencies. In short, the artifact helps researchers understand the dependency challenge in kernel fuzzing.
2. where it can be obtained
Virtual Machine and other files ready for Artifact Evaluation
- username & password: icse22ae
- zenodo archive: https://doi.org/10.5281/zenodo.6029158
- also available in Google Drive: https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
Source Code
- zenodo archive: https://doi.org/10.5281/zenodo.6029520
- github and update: https://github.com/ZHYfeng/Dependency
Evaluation Data
- zenodo archive: https://doi.org/10.5281/zenodo.5441138
- also available in Google Drive: data.tar.gz in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
3. how to repeat/replicate/reproduce the results presented in the paper
build our tools (skip this step if using virtual machine)
sudo apt install -y git
git clone https://github.com/ZHYfeng/Dependency.git
cd Dependency
bash build_script/build.bash
prepare kernel and image (skip this step if using virtual machine)
- configure the kernel and image based on the requirement of syzkaller, mv image to path-of-Dependency/workdir/image
  doc of syzkaller: https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md
  the image we build: image.tar.gz in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- add -fsanitize-coverage=no-prune to CFLAGS_KCOV in kernel config
- build kernel using clang and mv it to path-of-Dependency/workdir/13-linux-clang-np
  the kernel we build: linux-clang-np.tar.gz in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- copy the kernel and generate bitcode of kernel using -fembed-bitcode -save-temps=obj
  https://github.com/ZHYfeng/Generate_Linux_Kernel_Bitcode/tree/master/Achieve/01-change-makefile
  the bitcode we build: linux-clang-np-bc-f.tar.gz in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- preprocess kernel in order to save time
  cd path-of-Dependency/workdir/13-linux-clang-np
  objdump -d vmlinux > vmlinux.objdump
  a2l -objdump=vmlinux.objdump
prepare workdir (skip this step if using virtual machine)
the workdir we prepare: workdir.tar.gz in https://drive.google.com/drive/folders/1Ts4P4iC2PHihtBviSXMUkn3My0PLkowN?usp=sharing
- make a directory called dev_xxx in path-of-Dependency/workdir
- copy the bitcode (.bc) and assembly code (.s) to the directory and rename them to built-in.bc and built-in.s
- copy the configuration files path-of-Dependency/04-experiment_script/json/dra.json and path-of-Dependency/04-experiment_script/json/syzkaller.json
  change the value of file_bc in dra.json to the relative path for the bitcode of device driver you test
  change the value of path_s in dra.json to the relative path of device driver you test
- copy the run script path-of-Dependency/04-experiment_script/python/run.py
- generate static analysis results based on the static-taint-analysis-component
  https://zenodo.org/record/5348989/files/static-taint-analysis-component.zip
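The two dra.json changes above can also be scripted when preparing several dev_xxx directories. The following is a minimal sketch, assuming that file_bc and path_s are top-level keys of dra.json (the layout of the file is not shown here); the helper names with_driver_paths and update_dra_json are hypothetical and not part of the artifact.

```python
import json
from pathlib import Path


def with_driver_paths(config, file_bc, path_s):
    """Return a copy of the config dict with file_bc and path_s replaced.

    Assumption: both keys live at the top level of dra.json.
    """
    updated = dict(config)
    updated["file_bc"] = file_bc
    updated["path_s"] = path_s
    return updated


def update_dra_json(config_path, file_bc, path_s):
    """Rewrite a dra.json file in place, leaving all other keys untouched."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    updated = with_driver_paths(config, file_bc, path_s)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

A call could look like update_dra_json("dra.json", "./built-in.bc", "drivers/cdrom"), where both path values are placeholders to be replaced with the real relative paths for the driver under test.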
running the fuzzing
(the paths are based on the virtual machine)
- activate the environment
  source /home/icse22ae/Dependency/environment.sh
- pick one device driver in /home/icse22ae/Dependency/workdir/workdir, for example cdrom:
  cd /home/icse22ae/Dependency/workdir/workdir/dev_cdrom
- configure the run script
  time_run: the fuzzing time in seconds.
  number_execute: the number of fuzzing runs.
  number_vm_count: the number of VMs in each fuzzing run.
  In our paper, time_run is at least 48 hours, number_execute is 3 and number_vm_count is 32.
  For artifact evaluation, number_execute and number_vm_count could be 1.
  time_run should be at least 5 minutes (20 minutes for device driver kvm).
- run our tool using the script. It will automatically stop after time_run.
  python3 run.py
- read the results
  stay in the same environment as in step 1 and the same path as in step 2.
  go run /home/icse22ae/Dependency/03-syzkaller/tools/read_result/ -a2i
  Depending on the fuzzing configuration and device driver, the time needed differs.
  For cdrom, it should take several minutes. For kvm, it needs several hours.
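Because time_run is given in seconds while the durations above are stated in hours and minutes, the conversion is easy to get wrong. The sketch below only spells out the arithmetic; how run.py actually stores these three settings is not shown here, so the dictionaries and helper names are illustrative assumptions, not the script's real layout.

```python
def hours(n):
    """Convert hours to seconds."""
    return int(n * 60 * 60)


def minutes(n):
    """Convert minutes to seconds."""
    return int(n * 60)


# Settings reported for the paper (time_run is a lower bound).
PAPER = {"time_run": hours(48), "number_execute": 3, "number_vm_count": 32}

# Smallest settings suggested for artifact evaluation.
ARTIFACT_EVAL = {"time_run": minutes(5), "number_execute": 1, "number_vm_count": 1}

# kvm needs a longer minimum run than other drivers.
ARTIFACT_EVAL_KVM = {**ARTIFACT_EVAL, "time_run": minutes(20)}
```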
understand the results
You can find the results used in our paper in /home/icse22ae/Dependency/workdir/data.
Results after step 4 (run our tool using script)
- The dataDependency.bin, dataResult.bin, dataRunTime.bin, statistics.bin in ./0 or ./1 or ./2 are the results in protobuf format.
  The protobuf files are in /home/icse22ae/Dependency/05-proto
Results after step 5 (read the results)
- 0_coverage.txt is the coverage of the fuzzing in ./0. coverage.txt is the average coverage of all runs. Each line is time@number-of-edge.
- conditionD.txt lists all unresolved conditions related to dependency.
- conditionND.txt lists all unresolved conditions not related to dependency.
- conditionDN.txt lists all unresolved conditions related to dependency for which our static analysis cannot find the write statements.
- intersection.txt is the intersection coverage of all runs and union_coverage.txt is the union coverage of all runs. Each line is the address of the edge.
- OutsideFunctions.txt corresponds to the Unreachable Functions Elimination mentioned in our paper.
- statistic.txt contains the statistics used in our paper.
- uncovered.txt lists all uncovered edges and their unresolved conditions, and uncovered_more.txt lists more details about them.
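The time@number-of-edge lines of the coverage files are straightforward to load for plotting. This is a minimal sketch, assuming both fields are plain numbers (the unit of the time field is not stated here, and averaged coverage may be fractional, so both are read as floats); parse_coverage is a hypothetical helper, not part of the artifact.

```python
def parse_coverage(text):
    """Parse 'time@number-of-edge' lines into (time, edges) tuples.

    Assumption: both fields are numeric; blank lines are skipped.
    """
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        time_str, edges_str = line.split("@", 1)
        points.append((float(time_str), float(edges_str)))
    return points


def final_coverage(points):
    """Return the edge count of the last sample, or None if there is none."""
    return points[-1][1] if points else None
```

For example, parse_coverage(open("coverage.txt").read()) gives a list that can be passed directly to a plotting library.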
Example for one unresolved dependency
We still use dev_cdrom as the example; the results can be found in data.tar.gz, as mentioned in the Evaluation Data section.
All unresolved conditions related to dependency are listed in conditionD.txt, for example:
0xffffffff8579b9b7@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279@0xffffffff8579b960@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279@[email protected]@
@ @0xffffffff857a3eaa@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2124@1@
@ @0xffffffff8579b421@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2228@0@
@ @0xffffffff8579b05a@https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2187@1@
0xffffffff8579b9b7 is the assembly address of the unresolved branch in the binary, and https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279 is the source code location of the unresolved dependency. 0xffffffff8579b960 is the assembly address of the condition of the unresolved branch, and https://elixir.bootlin.com/linux/v4.16/source/drivers/cdrom/cdrom.c#L2279 is again its source code location. if.end11.i is the name of the basic block in LLVM bitcode.
The following lines are the write addresses for the unresolved dependency.
Then we can find a file 0xffffffff8579b9b7.txt, which is named after the assembly address of the unresolved branch.
Inside this file, we can find the number of dominator instructions of this unresolved dependency,
the inputs (test cases) from syzkaller that can reach the unresolved dependency, and the inputs that can reach the write address.
We can also find the call chain of the write address starting from the entry function.
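The @-separated records above can be split mechanically. The following is a minimal sketch based only on the example lines shown: it reads the first four fields of a branch line and the fields of a write-address line. The meaning of the trailing number on write-address lines is not explained here, so it is kept as an opaque string named tag; all function names are hypothetical.

```python
def is_write_line(line):
    """Write-address lines in the example start with '@'."""
    return line.startswith("@")


def parse_branch_line(line):
    """Split a branch line; only the first four fields are interpreted."""
    fields = line.strip().split("@")
    return {
        "branch_address": int(fields[0], 16),
        "branch_source": fields[1],
        "condition_address": int(fields[2], 16),
        "condition_source": fields[3],
        "rest": fields[4:],  # remaining fields, left uninterpreted
    }


def parse_write_line(line):
    """Split a write-address line of the form '@ @address@source@tag@'."""
    fields = line.strip().split("@")
    return {
        "address": int(fields[2], 16),
        "source": fields[3],
        "tag": fields[4],  # meaning not documented here; kept as-is
    }
```

The file holding the details of a branch can then be located with a name such as hex(record["branch_address"]) + ".txt".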
4. the structure and function of the source code
- 02-dependency
  - 02-dependency/lib/DMM/: mapping between assembly address in the binary and basic block in LLVM bitcode
  - 02-dependency/lib/RPC/: work with fuzzing component (syzkaller) using Protobuf and gRPC
  - 02-dependency/lib/STA/: work with static analysis component using JSON
  - 02-dependency/lib/DCC/: output human-readable information and statistics for unresolved conditions
- 03-syzkaller
  - 03-syzkaller/syz-fuzzer/: modification for collecting more complete coverage and other related useful information from fuzzing
  - 03-syzkaller/pkg/dra/: work with mapping component and output results using Protobuf and gRPC
- 05-proto: all Protobuf files