motoko experiment: custom RTS functions

This PR adds utility macros, traits, and functions which can be used to implement Rust FFI bindings in the RTS.

I'm currently focusing on just a few Motoko types (primarily Blob, Array, tuples, and numeric primitives). It's relatively simple to expand support by implementing FromValue and IntoValue for other Rust types.
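As a rough illustration of what adding support for another type might look like, here is a minimal sketch for `char`. The `Value` struct, the trait signatures, and the `round_trip` helper are stand-ins assumed for this example; the actual definitions in the RTS may differ (for instance, by taking a memory argument for heap allocation).

```rust
// Stand-in for the RTS `Value` type (assumption: the real type is a tagged
// word that may point into the Motoko heap; here it is just a plain u32).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Value(pub u32);

// Hypothetical trait shapes, simplified for illustration.
pub trait FromValue: Sized {
    fn from_value(value: Value) -> Self;
}

pub trait IntoValue {
    fn into_value(self) -> Value;
}

// An already "supported" primitive.
impl FromValue for u32 {
    fn from_value(value: Value) -> Self {
        value.0
    }
}

impl IntoValue for u32 {
    fn into_value(self) -> Value {
        Value(self)
    }
}

// Extending support to another Rust type: `char`.
impl FromValue for char {
    fn from_value(value: Value) -> Self {
        // Fall back to U+FFFD if the word is not a valid Unicode scalar value.
        char::from_u32(value.0).unwrap_or(char::REPLACEMENT_CHARACTER)
    }
}

impl IntoValue for char {
    fn into_value(self) -> Value {
        Value(self as u32)
    }
}

// Hypothetical helper: convert to a `Value` and back again.
pub fn round_trip<T: FromValue + IntoValue>(x: T) -> T {
    T::from_value(x.into_value())
}
```

In this sketch, a wrapper generated by an attribute such as `#[motoko]` would call `from_value` on each argument and `into_value` on the result, so a new type only needs the two trait implementations.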

Rust functions (included in PR for now):

#[motoko]
unsafe fn empty() {}

#[motoko]
unsafe fn identity(value: Value) -> Value {
    value
}

#[motoko]
unsafe fn div_rem(a: u32, b: u32) -> (u32, u32) {
    // Note: Rust integer division panics when `b == 0`.
    (a / b, a % b)
}

#[motoko]
unsafe fn array_concat(a: Vec<Value>, b: Vec<Value>) -> Vec<Value> {
    [a, b].concat()
}

#[motoko]
unsafe fn blob_modify(mut blob: BlobVec) -> BlobVec {
    blob.0.push(b'!');
    blob
}

#[motoko]
unsafe fn manual_alloc(#[memory] mem: &mut impl Memory) -> Value {
    // Low-level access to memory allocation
    let value = alloc_blob(mem, Bytes(3 as u32));
    let blob = value.as_blob_mut();
    let mut dest = blob.payload_addr();
    for i in 0..3 {
        *dest = (i + 1) * 0x11;
        dest = dest.add(1);
    }
    allocation_barrier(value)
}

#[motoko]
unsafe fn bool_swap(a: bool, b: bool) -> (bool, bool) {
    (b, a)
}

#[motoko]
unsafe fn check_numbers(
    a: u8,
    b: i8,
    c: u16,
    d: i16,
    e: u32,
    f: i32,
    g: u64,
    h: i64,
) -> (u8, i8, u16, i16, u32, i32, u64, i64) {
    (a, b, c, d, e, f, g, h)
}

Motoko usage (ffi.mo):

import Prim "mo:prim";
import Array "mo:base/Array";
import Blob "mo:base/Blob";

// Rust bindings
func empty() : () = (prim "rts:empty" : () -> ())();
func identity<T>(value : T) : T = (prim "rts:identity" : T -> T)(value);
func blob_modify(value : Blob) : Blob = (prim "rts:blob_modify" : Blob -> Blob)(value);
func array_concat<T>(a : [T], b : [T]) : [T] = (prim "rts:array_concat" : ([T], [T]) -> [T])(a, b);
func manual_alloc() : Blob = (prim "rts:manual_alloc" : () -> Blob)();
func div_rem(a : Nat32, b : Nat32) : (Nat32, Nat32) = (prim "rts:div_rem" : (Nat32, Nat32) -> (Nat32, Nat32))(a, b);
func bool_swap(a : Bool, b : Bool) : (Bool, Bool) = (prim "rts:bool_swap" : (Bool, Bool) -> (Bool, Bool))(a, b);
type Numbers = (Nat8, Int8, Nat16, Int16, Nat32, Int32, Nat64, Int64);
func check_numbers(a : Nat8, b : Int8, c : Nat16, d : Int16, e : Nat32, f : Int32, g : Nat64, h : Int64) : Numbers = (prim "rts:check_numbers" : Numbers -> Numbers)(a, b, c, d, e, f, g, h);

// `empty`
assert empty() == ();

// `identity`
let echoValue = identity(5);
Prim.debugPrint(debug_show echoValue);
assert echoValue == 5;

// `div_rem`
let (div, rem) = div_rem(7, 2);
assert (div, rem) == (3, 1);

// `array_concat`
let a = Array.freeze(Array.init<Nat8>(10_000_000, 123 : Nat8));
let b = Array.freeze(Array.init<Nat8>(500_000, 234 : Nat8));
let concat = array_concat(a, b);
assert concat.size() == a.size() + b.size();
assert concat[0] == 123;
assert concat[concat.size() - 1] == 234;

// `blob_modify`
let inputBlob = Blob.fromArray(Array.freeze(Array.init<Nat8>(10_000_000, 123 : Nat8)));
let blob = blob_modify(inputBlob);
let array = Blob.toArray(blob);
assert array[0] == 123;
assert array[array.size() - 1] == 33; // '!'
assert blob.size() == inputBlob.size() + 1;

// `manual_alloc`
let allocValue = manual_alloc();
assert Blob.toArray(allocValue) == [0x11, 0x22, 0x33];

// `bool_swap`
for (a in [true, false].vals()) {
    for (b in [true, false].vals()) {
        assert bool_swap(a, b) == (b, a);
    }
};

// `check_numbers`
let numbers: Numbers = (1, -2, 3333, -4444, 5_000_000, -5_000_000, 0, -1_000_000_000_000_000);
assert check_numbers(numbers) == numbers;

To try this yourself, replace the placeholder paths ffi.mo and ../motoko-base with your own:

MOC_UNLOCK_PRIM=1 moc -c ffi.mo -wasi-system-api --package base ../motoko-base/src && wasmtime ffi.wasm

Changes:

[x] rts_sections in Wasm module decoder
[x] custom_rts_functions field in compilation environment
[x] "rts:*" primitive functions which refer to names in the custom section
[x] Bugfix for decoding custom sections with UTF-8 content
[x] FromValue and IntoValue traits in RTS
[x] #[motoko] procedural macro attribute which wraps #[ic_mem_fn] and generates a custom section
[x] Bump proc-macro2 and syn in the motoko-rts-macros crate
[x] Example RTS functions using #[motoko] attribute
[x] Macro to implement FromValue and IntoValue for tuples
[ ] Re-vendor Cargo dependencies in Nix (are these instructions up to date?)
[ ] Type checking or runtime error for unknown "rts:*" primitive expressions?
[ ] Convert examples into tests
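For readers unfamiliar with the tuple macro item above, the sketch below shows one common way to generate trait implementations for tuples with `macro_rules!`. It is not the macro from this PR: the `Value` enum, the trait signatures, and the `impl_tuple!` name are assumptions made so that the example is self-contained.

```rust
// Stand-in `Value` (assumption: the real RTS type is a tagged word, not an enum).
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Nat32(u32),
    Bool(bool),
    Tuple(Vec<Value>),
}

// Hypothetical trait shapes, simplified for illustration.
pub trait FromValue: Sized {
    fn from_value(value: Value) -> Self;
}

pub trait IntoValue {
    fn into_value(self) -> Value;
}

impl FromValue for u32 {
    fn from_value(value: Value) -> Self {
        match value {
            Value::Nat32(n) => n,
            other => panic!("expected Nat32, got {:?}", other),
        }
    }
}

impl IntoValue for u32 {
    fn into_value(self) -> Value {
        Value::Nat32(self)
    }
}

impl FromValue for bool {
    fn from_value(value: Value) -> Self {
        match value {
            Value::Bool(b) => b,
            other => panic!("expected Bool, got {:?}", other),
        }
    }
}

impl IntoValue for bool {
    fn into_value(self) -> Value {
        Value::Bool(self)
    }
}

// Generates both trait impls for a tuple whose elements implement them.
macro_rules! impl_tuple {
    ($($name:ident),+) => {
        impl<$($name: IntoValue),+> IntoValue for ($($name,)+) {
            #[allow(non_snake_case)]
            fn into_value(self) -> Value {
                // Destructure into locals that reuse the type parameter names.
                let ($($name,)+) = self;
                Value::Tuple(vec![$($name.into_value()),+])
            }
        }

        impl<$($name: FromValue),+> FromValue for ($($name,)+) {
            fn from_value(value: Value) -> Self {
                match value {
                    Value::Tuple(fields) => {
                        let mut fields = fields.into_iter();
                        // Tuple fields are evaluated left to right.
                        ($($name::from_value(fields.next().expect("tuple too short")),)+)
                    }
                    other => panic!("expected Tuple, got {:?}", other),
                }
            }
        }
    };
}

impl_tuple!(A, B);
impl_tuple!(A, B, C);
```

Each additional arity needs one more `impl_tuple!` invocation, which is why macros of this kind are usually invoked for a fixed range of tuple sizes.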

Mar 09 '24 01:03 rvanasa

Very nice PR, Ryan. Thanks a lot. This offers a well-structured, convenient small framework for FFI implementations. As you say, implementing FFI would still require advanced knowledge (especially of the GC aspects, e.g. keeping Rust pointers only temporarily, applying the right GC barriers, etc.). The only worry I have is that users could easily break memory safety and that we would then get issue reports of memory corruption in Motoko (which could be time-consuming to investigate and might also affect Motoko's reputation for safety). I guess people could still do this today by adjusting the Motoko compiler/RTS on their own, but I believe it would now be easier. I wonder if we could reduce this risk, i.e. instruct users about all the safety/security rules and aspects of FFI functions, require an explicit opt-in, and/or add steps when triaging issue reports so that we can filter out Motoko code that uses FFI functions (e.g. a question before reporting that asks whether FFI was used). Maybe my worry is exaggerated. I am interested in what other team colleagues think, @crusso, @ggreif, @chenyan-dfinity.

Mar 19 '24 09:03 luc-blaeser

PS: I believe we could add some more tests for the FFI. I could also do some more stress testing with the GC, e.g. composing an additional GC random test or benchmark case that makes use of FFI.

Mar 19 '24 09:03 luc-blaeser

The only worry I have is that users could easily break memory safety and that we would then get issue reports of memory corruption in Motoko (which could be time-consuming to investigate and might also affect Motoko's reputation for safety).

This is a really good point @luc-blaeser. Because safety is a key part of Motoko's brand, this by itself makes a fairly strong case for developers to avoid this FFI approach when possible. One possibility could be to repurpose this PR as an internal refactor (implementing the built-in RTS functions using the #[motoko] macro). We could potentially keep the Wasm custom section to give advanced developers the option to extend the RTS where it would otherwise be impossible to use Motoko for their use case.

While this functionality is currently opt-in via the MOC_*_RTS environment variables, I suppose we could also add a compiler flag that explicitly allows custom RTS functions (or maybe even switch back to the original logic in https://github.com/dfinity/motoko/pull/4413). I'm also interested to hear more opinions from the rest of the team about how we could address this.

Mar 19 '24 19:03 rvanasa