C-To-Assembly-Tests
C-To-Assembly-Tests copied to clipboard
A repository that stores results from converting C code to Assembly. I use this repository to analyze performance with my C code.
C To Assembly Testing
Description
A small repository to store my findings with converting C code to Assembly code along with measuring performance between the different clang optimization levels. I'm starting to learn more about Assembly because I want to understand how programs work on a very low level so I can optimize it the best I can.
I've made the following source files to test with.
- Two source files for copying eight bytes of data from one 8-bit array (8 bytes in size) to another. One source file uses a for loop to achieve this while the other uses the native
memcpy()function. - Two source files for comparing a variable to five values. One source file uses
ifandelse ifwhile the other uses aswitchstatement. - A source file that copies a string and outputs it to
stdout. - Two source files testing a for loop along with seeing if there's a difference when specifying
pragma #unroll xwhich should unroll the for loop and result in better performance in our case.
I'll likely be adding more files to this repository as time goes on.
Dumping Assembly Code
I used clang to emit LLVM and create the .bc file with no optimizations by the compiler (the -O0 flag). An example may be found below.
clang -c -emit-llvm -O0 -o asm/testO2.bc src/test.c
Since we emit LLVM, we may use the llc command to dump the Assembly code under specific optimization levels. I dump both the native architecture's Assembly code and also Intel's Assembly code (these Assembly files are appended with _intel).
Here's an example using optimization level 2 (notice the -O=2 flag in the llc command).
llc -filetype=asm -O=2 -o asm/testO2.s asm/testO2.bc # Native architecture's Assembly code.
llc -filetype=asm -O=2 -o asm/testO2_intel.s --x86-asm-syntax=intel asm/testO2.bc # Intel Assembly code.
NOTE - I'd recommend using the scripts/genassembly.sh Bash script I made to generate Assembly code under optimization levels 0 (None) - 3 and both non-Intel and Intel architectures. The script only requires one argument which is the name of the source file in src/ without the file extension (.c). Also make sure to modify the ROOTDIR variable if you place the script outside of this repository's scripts/ directory. An example may be found below.
./genassembly.sh pointer
Optimization Levels
Clang's optimization levels may be found in its manual page (man clang). For reference, here are the levels:
Code Generation Options
-O0, -O1, -O2, -O3, -Ofast, -Os, -Oz, -Og, -O, -O4
Specify which optimization level to use:
-O0 Means “no optimization”: this level compiles the fastest
and generates the most debuggable code.
-O1 Somewhere between -O0 and -O2.
-O2 Moderate level of optimization which enables most opti‐
mizations.
-O3 Like -O2, except that it enables optimizations that take
longer to perform or that may generate larger code (in an
attempt to make the program run faster).
-Ofast Enables all the optimizations from -O3 along with
other aggressive optimizations that may violate strict com‐
pliance with language standards.
-Os Like -O2 with extra optimizations to reduce code size.
-Oz Like -Os (and thus -O2), but reduces code size further.
-Og Like -O1. In future versions, this option might disable
different optimizations in order to improve debuggability.
-O Equivalent to -O2.
-O4 and higher
Currently equivalent to -O3
You'll notice a lot of optimizations within the Assembly code from -O1 to -O3.
System
This was all tested on my Linux VM running virtio_net drivers and Ubuntu 20.04 Server. The Linux kernel the tests in asm/ were built with was 5.15.2-051502-generic.