
Don't pass the state and current process as arguments

Open yorickpeterse opened this issue 1 year ago • 2 comments

For every compiled method, the first two arguments are the runtime state and the current process. This means that fn foo(a: Int) essentially translates to fn foo(state: Pointer[UInt8], process: Pointer[UInt8], a: Int).

This approach isn't great, as we're wasting up to two registers to pass this data around, and in many cases the data likely isn't used much.

To optimize this, I'm thinking of the following:

  • The state is the same for all methods and processes, so the compiler can generate a global variable and store the state in it. The runtime functions would still take an explicit state argument, so that the runtime doesn't depend on the global variable generated by the compiler.
  • The process could be stored as the last value on the stack (at the end we grow towards), with the usable stack range adjusted so nothing is allocated over that data. This way we can obtain the process easily. I'm not sure how feasible or cross-platform this is, though.
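
A rough sketch of the first idea, written in Rust rather than generated LLVM IR. Everything here is hypothetical: the State struct and its epoch field, the STATE global, and the set_state, state, runtime_epoch and foo names only illustrate the shape. The point is that compiled code reads the state from a global, while the runtime function keeps its explicit state parameter.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical stand-in for the runtime state; the real layout isn't shown here.
pub struct State {
    pub epoch: u32,
}

// The global the compiler would generate. It is written once at startup.
static STATE: AtomicPtr<State> = AtomicPtr::new(ptr::null_mut());

// Called once before any compiled code runs.
pub fn set_state(state: *mut State) {
    STATE.store(state, Ordering::Release);
}

// What compiled code would do instead of receiving `state` as an argument.
pub fn state() -> *mut State {
    STATE.load(Ordering::Acquire)
}

// A runtime function: it still takes the state explicitly, so the runtime
// never needs to know about the compiler-generated global.
// The caller must pass a valid, non-null pointer.
pub fn runtime_epoch(state: *const State) -> u32 {
    unsafe { (*state).epoch }
}

// Compiled form of a hypothetical `fn foo(a: Int)`: no state parameter,
// the state is loaded from the global only where it's needed.
pub fn foo(a: i64) -> i64 {
    a + runtime_epoch(state()) as i64
}
```

With this shape, a method that never touches the state pays nothing for it, and one that does pays a single global load.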

yorickpeterse avatar Oct 16 '23 22:10 yorickpeterse

Using external thread-local variables in Rust requires nightly, and probably will continue to require this for a long time: https://github.com/rust-lang/rust/issues/29594

yorickpeterse avatar Oct 16 '23 22:10 yorickpeterse

A tricky thing about using the stack is that LLVM doesn't seem to provide any intrinsics for obtaining any kind of stack information. This means we'd have to use raw assembly somehow to get the data from the stack.
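
If the stack pointer can be obtained somehow, finding the process slot could be plain address arithmetic. The sketch below assumes things the issue doesn't establish: stacks grow down, every stack has the same power-of-two size, and each stack starts at an address aligned to that size. The function names are made up, and reading the actual stack pointer (the part that needs assembly) is left out.

```rust
use std::mem::size_of;

// Address of the slot at the low end of the stack (the end we grow towards).
// Assumes `stack_size` is a power of two and the stack is aligned to it, so
// masking off the low bits of any address inside the stack yields its start.
pub fn process_slot_addr(stack_pointer: usize, stack_size: usize) -> usize {
    debug_assert!(stack_size.is_power_of_two());
    stack_pointer & !(stack_size - 1)
}

// Lowest address regular frames may use, leaving one pointer-sized slot
// for the process pointer so nothing gets allocated over it.
pub fn usable_stack_limit(stack_pointer: usize, stack_size: usize) -> usize {
    process_slot_addr(stack_pointer, stack_size) + size_of::<usize>()
}
```

Whether those alignment assumptions hold on every platform is exactly the open question.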

yorickpeterse avatar Oct 16 '23 22:10 yorickpeterse

For thread-locals, the following Rust code compiles to the same code as regular/raw thread-locals:

use std::cell::Cell;

thread_local! {
  // The const initializer avoids the lazy initialization path.
  static PTR1: Cell<*mut ()> = const { Cell::new(std::ptr::null_mut()) };
}

This can be seen at https://rust.godbolt.org/z/v16va86aq.

The problem is that I'm not sure if this is true for every platform. Some additional details are found at https://matklad.github.io/2020/10/03/fast-thread-locals-in-rust.html.

yorickpeterse avatar Feb 17 '24 02:02 yorickpeterse

A quick dive through the current code reveals we don't use the current process value in all that many places; mostly we just pass it along as an implicit argument to methods. The few runtime routines that require it could instead use a thread-local variable kept entirely on the runtime side of things.
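
A minimal sketch of what that runtime-side thread-local could look like, using the stable thread_local! macro with a const initializer. The CURRENT_PROCESS name and both helper functions are hypothetical. The scheduler thread would call set_current_process before running a process, and runtime routines would call current_process instead of taking a process argument.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Pointer to the process currently running on this thread, or null.
    static CURRENT_PROCESS: Cell<*mut ()> = const { Cell::new(ptr::null_mut()) };
}

// Called by the scheduler thread right before it resumes a process.
pub fn set_current_process(process: *mut ()) {
    CURRENT_PROCESS.with(|cell| cell.set(process));
}

// Called by runtime routines that need the process.
pub fn current_process() -> *mut () {
    CURRENT_PROCESS.with(|cell| cell.get())
}
```

Since each thread has its own copy, no synchronization is needed, and compiled code never has to see the variable at all.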

The only instruction that really needs it is the Preempt instruction as it checks the process-local epoch against the global epoch. We could probably make that epoch counter a thread-local variable as well, as we only write to it when resuming the process. This would probably also reduce the process size a little bit.
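
The epoch idea could look roughly like this. It is only a sketch: the names GLOBAL_EPOCH, LOCAL_EPOCH, resume_process, should_preempt and advance_epoch are made up, and how the existing Preempt instruction is actually implemented isn't shown here. The thread-local copy is written only when a process is resumed, and the check compares it against the global counter.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU32, Ordering};

// Global epoch, bumped by whatever decides processes have run long enough.
static GLOBAL_EPOCH: AtomicU32 = AtomicU32::new(0);

thread_local! {
    // Epoch observed when the current process was last resumed on this thread.
    static LOCAL_EPOCH: Cell<u32> = const { Cell::new(0) };
}

// Called when a process is resumed: the only place the local epoch is written.
pub fn resume_process() {
    LOCAL_EPOCH.with(|epoch| epoch.set(GLOBAL_EPOCH.load(Ordering::Relaxed)));
}

// The check a Preempt-style instruction would perform.
pub fn should_preempt() -> bool {
    LOCAL_EPOCH.with(|epoch| epoch.get()) != GLOBAL_EPOCH.load(Ordering::Relaxed)
}

// Advances the global epoch, signalling running processes to yield.
pub fn advance_epoch() {
    GLOBAL_EPOCH.fetch_add(1, Ordering::Relaxed);
}
```

With the epoch stored per thread, the process structure no longer needs its own epoch field.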

yorickpeterse avatar Feb 17 '24 03:02 yorickpeterse

It seems that when one uses #[no_mangle] inside the thread_local! macro, mangling is still applied to the generated static. This can be seen in https://rust.godbolt.org/z/qWzaq8qze, where PTR1 is mangled as example::PTR1::__getit::VAL.0 but PTR2 is just PTR2.

yorickpeterse avatar Feb 18 '24 04:02 yorickpeterse

Looking at the assembly, it also seems Rust uses LLVM's localdynamic for the thread_local! variable, while using generaldynamic for the #[thread_local] version.

yorickpeterse avatar Feb 18 '24 04:02 yorickpeterse