asmdot
[Unstable] Fast, zero-copy and lightweight (Arm | Mips | x86) assembler in (C | C++ | C# | Go | Haskell | Javascript | Nim | OCaml | Python | Rust).
ASM.
Providing an extensible Python framework for building a fast, zero-copy assembler.
History and goals
This project originally aimed to create a fast, minimalist and unopinionated assembler in C that could live in a single file, and support multiple architectures.
Thus, a Python library was built to transform instructions from different architectures
into a simple, common AST that supports bitwise and logical expressions, basic flow control
and variables, and then to turn that AST into C code.
Since code would be generated automatically, other options such as naming conventions and parameter
types could be easily modified when generating it.
However, I soon realized that, since a complete AST was built, it would be easy to extend this
process to support not only C, but also other programming languages.
At first, the goal was thus to produce bindings to the C API, which is very efficient; but since a
complete AST was built anyway, and a mechanism already existed to distinguish source files from
include files, I decided to make the whole process available in different languages.
As such, ASM. was born. Parsers transform data files that define instructions in various architectures to an AST, which is then transformed by emitters into source code in various programming languages.
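The parser-to-emitter flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-line data format (`inc 40 reg`), the `Instruction` class and the `parse`, `emit_c` and `emit_python` functions are all hypothetical and are not the data files, AST nodes or emitters that ASM. actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Instruction:
    """Hypothetical, heavily simplified AST node for one instruction."""
    mnemonic: str
    opcode: int
    params: list = field(default_factory=list)


def parse(line: str) -> Instruction:
    """Parse a made-up data-file line: '<mnemonic> <hex opcode> [param ...]'."""
    mnemonic, opcode, *params = line.split()
    return Instruction(mnemonic, int(opcode, 16), params)


def emit_c(ins: Instruction) -> str:
    """Emit a C function that writes one byte and advances the buffer pointer."""
    args = "".join(f", uint8_t {p}" for p in ins.params)
    operand = "".join(f" + {p}" for p in ins.params)
    return (
        f"void {ins.mnemonic}(void** buf{args}) {{\n"
        f"    *(uint8_t*)(*buf) = 0x{ins.opcode:02x}{operand};\n"
        f"    *buf = (uint8_t*)(*buf) + 1;\n"
        f"}}"
    )


def emit_python(ins: Instruction) -> str:
    """Emit a Python function from the same AST node."""
    args = "".join(f", {p}" for p in ins.params)
    operand = "".join(f" + {p}" for p in ins.params)
    return (
        f"def {ins.mnemonic}(buf{args}):\n"
        f"    buf.append(0x{ins.opcode:02x}{operand})"
    )


# One parsed definition feeds every emitter.
node = parse("inc 40 reg")
print(emit_c(node))
print(emit_python(node))
```

The point of the design is that supporting a new target language only means writing another `emit_*` function; the instruction definitions and the parser stay untouched.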
Goals and non-goals
- ASM. is a lightweight assembler library. It is designed to be as simple as possible.
- ASM. has no support for labels or macros: developers are expected to build their own interface on top of the provided functions.
- ASM. is not a binary, it's a library. You cannot use it directly.
- ASM. has no built-in parser: if you want an assembler that works with arbitrary strings, use Keystone.
- ASM. has different instructions for different architectures: if you want a common interface for all architectures, use GNU Lightning or libjit.
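Since labels are left to the user, here is a minimal sketch of what such a layer could look like on top of a plain byte buffer. The `LabelLayer` class and its methods are hypothetical and not part of ASM.; the only outside fact used is the x86 short jump encoding (`EB` followed by an 8-bit displacement relative to the next instruction).

```python
class LabelLayer:
    """Hypothetical label support built on top of a raw byte buffer."""

    def __init__(self):
        self.buf = bytearray()
        self.labels = {}   # label name -> offset in buf
        self.fixups = []   # (offset of displacement byte, label name)

    def raw(self, data: bytes) -> None:
        """Append already-encoded instruction bytes."""
        self.buf += data

    def label(self, name: str) -> None:
        """Bind a label to the current position."""
        self.labels[name] = len(self.buf)

    def jmp(self, name: str) -> None:
        """Emit an x86 short jump with a placeholder displacement."""
        self.buf += b"\xeb\x00"
        self.fixups.append((len(self.buf) - 1, name))

    def finish(self) -> bytes:
        """Patch every recorded jump once all labels are known."""
        for pos, name in self.fixups:
            rel = self.labels[name] - (pos + 1)  # relative to next instruction
            if not -128 <= rel <= 127:
                raise ValueError(f"label {name!r} is out of short-jump range")
            self.buf[pos] = rel & 0xFF
        return bytes(self.buf)


asm = LabelLayer()
asm.jmp("end")      # forward reference, patched in finish()
asm.raw(b"\x90")    # nop
asm.label("end")
asm.raw(b"\xc3")    # ret
print(asm.finish().hex())
```

In a real project, `raw` would be replaced by calls into the generated instruction functions; the fixup list is the only extra state needed.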
Usage
Using make
A Makefile is provided to automate most tasks, including generating sources, as well as building and testing every generated library.
The emit, build and test recipes are made available, and invoke all language-specific
recipes that are defined. To execute tasks in a language-specific manner, the recipes
emit-lang, build-lang, and test-lang are also available, where lang is one
of these values:
- c (uses any C compiler).
- cpp (uses any C++ compiler).
- csharp (uses dotnet).
- go (uses go).
- haskell (uses cabal, doesn't compile; help needed).
- javascript (uses npm).
- nim (uses nimble).
- ocaml (uses dune, doesn't compile; help needed).
- python (uses pytest).
- rust (uses cargo).
Generating the sources
Each language directory contains a generate.py file, which can be directly invoked
from the command line.
Here is an example output of the C generation script:
usage: generate.py [-h] [-ns] [-nt] [-o output-dir/] [-v] [-np] [-ah]
                   [-cc CALLING-CONVENTION]

Generate ASM. sources.

optional arguments:
  -h, --help            Show the help message.
  -ns, --no-sources     Do not generate sources.
  -nt, --no-tests       Do not generate tests.
  -be, --big-endian     Use big-endian instead of little-endian.
  -o output-dir/, --output output-dir/
                        Change the output directory (default: directory of
                        calling emitter).
  -v, --verbose         Increase verbosity (can be given multiple times to
                        increase it further).

C:
  -np, --no-prefix      Do not prefix function names by their architecture.
  -ah, --as-header      Generate headers instead of regular files.
  -cc CALLING-CONVENTION, --calling-convention CALLING-CONVENTION
                        Specify the calling convention of generated functions.
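To make the effect of byte order concrete, the sketch below encodes one 32-bit instruction word both ways with Python's standard `struct` module. It assumes that the `--big-endian` option controls how multi-byte values are written to the buffer; the `encode_word` helper is hypothetical and is not taken from the generated code.

```python
import struct


def encode_word(word: int, big_endian: bool = False) -> bytes:
    """Serialize a 32-bit instruction word in the requested byte order."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, word)


BX_LR = 0xE12FFF1E  # ARM encoding of 'bx lr'

print(encode_word(BX_LR).hex())                   # little-endian byte order
print(encode_word(BX_LR, big_endian=True).hex())  # big-endian byte order
```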
Using the C API
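#include <stdlib.h>
#include "./x86.h"
int main(void) {
    void* buffer = malloc(0xff);
    void* origin = buffer; // keep the start: buffer is passed by address to each call
    inc_r32(&buffer, eax);
    ret(&buffer);
    free(origin);
    return 0;
}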
Using the Nim API
# The Nim language goes very well with ASM., thanks to its UFCS support.
import asmdot/x86

var
  bytes = newSeqOfCap[byte](10)
  buf = addr bytes[0]

buf.inc(eax)
buf.ret()
Using the Python API
from asm.x86 import Reg32, X86Assembler
asm = X86Assembler(10)
asm.inc_r32(Reg32.eax)
asm.ret()
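For readers curious about what such calls could do underneath, here is a self-contained stand-in. It is not the generated asm.x86 module: the `TinyX86Assembler` class, the `EAX` constant and the `code` method are hypothetical, and the encodings assume 32-bit mode, where `inc r32` has the one-byte form `0x40 + register` and `ret` is `0xC3`.

```python
EAX = 0  # register number of eax in x86 encodings


class TinyX86Assembler:
    """Stand-in assembler that writes into a preallocated buffer."""

    def __init__(self, size: int):
        self.buf = bytearray(size)  # allocated once, never resized
        self.pos = 0

    def _emit(self, byte: int) -> None:
        # Raises IndexError if the buffer is full.
        self.buf[self.pos] = byte
        self.pos += 1

    def inc_r32(self, reg: int) -> None:
        # Short form, valid in 32-bit mode only (0x40-0x4f are REX prefixes in 64-bit mode).
        self._emit(0x40 + reg)

    def ret(self) -> None:
        self._emit(0xC3)

    def code(self) -> memoryview:
        """Return a view of the bytes written so far, without copying them."""
        return memoryview(self.buf)[: self.pos]


asm = TinyX86Assembler(10)
asm.inc_r32(EAX)
asm.ret()
print(bytes(asm.code()).hex())
```

Returning a `memoryview` rather than a new `bytes` object is one way to keep the zero-copy property in Python: callers read straight from the assembler's own buffer.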
Using the Rust API
use asm::x86::{Register32, X86Assembler};
let mut buf = vec!();
buf.inc_r32(Register32::EAX)?;
buf.ret()?;
Installing
We're not there yet, but if you want to experiment with the project or contribute, you're welcome to clone it and play around.
# Clone project
git clone https://github.com/71/asmdot.git
# Get dependencies
python -m pip install -r asmdot/requirements.txt
# Play around
PYTHONPATH=. python languages/c/generate.py --help
Status
Architectures
- ARM: WIP.
- MIPS: WIP.
- X86: WIP.
Sources
- C
- C++
- C#
- Go
- Haskell
- JavaScript
- Nim
- OCaml
- Python
- Rust
Docs
The directory of each language (list available above) contains the documentation for said language. Furthermore, a hacking guide is available for those who want to extend or improve ASM.
License
All the content of the repository is MIT-licensed, except the data directory, which is released under the Unlicense.