oakc

Implement foreign functions to have their own representation in the different IR levels

Open kevinramharak opened this issue 3 years ago • 2 comments

I was banging my head against the current implementation of extern fns, which wraps every external function declaration in an oak function. The wrapper generates code to build up and tear down the stack frame and to pass all the variables to the actual foreign function.

This PR works towards more flexibility by letting extern fn declarations be represented as their own thing inside each IR level. The old code generated an oak function whose body is a single (return) statement containing the foreign function call. I moved that logic from the Tir level to the Mir level, as this currently does not change anything in the generated code.

I would like to propose implementing hooks that let the Target implementations generate the following code:

  • setup logic to pass the parameters to the foreign function in a manner that does not require the foreign function to know how to manage the oak vm.
  • call the foreign function (this is already done with Target::call_foreign_function(name: String))
  • setup logic to pass the return value back to the vm
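The three steps above could be sketched as hooks on a target trait. This is only an illustration: `ForeignCallHooks`, `pass_foreign_args`, `return_foreign_value`, and `ToyC` are hypothetical names, and the emitted strings are placeholders rather than real oakc output. Only `call_foreign_function` is named in the proposal, and its exact signature here is an assumption.

```rust
/// Hypothetical hook set for this sketch; not oakc's actual `Target` trait.
trait ForeignCallHooks {
    /// Step 1: move `arg_count` arguments from the VM to the native side.
    fn pass_foreign_args(&self, arg_count: usize) -> String;
    /// Step 2: call the foreign function (signature assumed).
    fn call_foreign_function(&self, name: String) -> String;
    /// Step 3: move the native return value back into the VM.
    fn return_foreign_value(&self) -> String;
}

/// Toy backend that emits C-like placeholder text.
struct ToyC;

impl ForeignCallHooks for ToyC {
    fn pass_foreign_args(&self, arg_count: usize) -> String {
        // One placeholder "pop from the VM" line per argument.
        (0..arg_count)
            .map(|i| format!("a{} = vm_pop(vm);\n", i))
            .collect()
    }

    fn call_foreign_function(&self, name: String) -> String {
        format!("r = {}();\n", name)
    }

    fn return_foreign_value(&self) -> String {
        "vm_push(vm, r);\n".to_string()
    }
}

/// Concatenates the three steps in order for one foreign call.
fn emit_foreign_call<T: ForeignCallHooks>(target: &T, name: &str, arg_count: usize) -> String {
    let mut out = target.pass_foreign_args(arg_count);
    out.push_str(&target.call_foreign_function(name.to_string()));
    out.push_str(&target.return_foreign_value());
    out
}
```

Calling `emit_foreign_call(&ToyC, "memchr", 2)` would produce two argument lines, the call, and the return-value line, so the foreign function itself never has to touch the VM.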

Currently you have to write code in the target language to wrap any foreign functions before you can use them. These changes would generate all the needed code for you.

The challenges I'm facing now:

  • Make it simple. Generating code to pass a dynamic number of variables for each function call requires the Target implementation to know a bit about the Asm representation.
  • Make it generic. Preferably the current behaviour would be the default (set up stack/variables and tear them down after the foreign function call). If a Target implements certain hooks, the default would be ignored and the compiler would rely on the implementation of the target backend.
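One way to get "default unless the target opts in" is a trait method with a default body that returns `None`. The sketch below assumes that shape; `SketchTarget`, `foreign_call_override`, `compile_foreign_call`, and both example targets are invented for illustration, and the emitted text is placeholder output.

```rust
/// Hypothetical target trait; the hook name is invented for this sketch.
trait SketchTarget {
    /// Existing-style call emission (signature assumed).
    fn call_foreign_function(&self, name: String) -> String;

    /// Optional hook. Returning `None` means "use the compiler's default".
    fn foreign_call_override(&self, _name: &str, _arg_count: usize) -> Option<String> {
        None
    }
}

/// Uses the target's hook when present, else wraps the call in a frame.
fn compile_foreign_call(target: &dyn SketchTarget, name: &str, arg_count: usize) -> String {
    if let Some(custom) = target.foreign_call_override(name, arg_count) {
        return custom;
    }
    // Default path: placeholder frame setup and teardown around the call.
    format!(
        "establish_frame({});\n{}end_frame({});\n",
        arg_count,
        target.call_foreign_function(name.to_string()),
        arg_count
    )
}

/// Target that keeps the default behaviour.
struct DefaultTarget;

impl SketchTarget for DefaultTarget {
    fn call_foreign_function(&self, name: String) -> String {
        format!("{}(vm);\n", name)
    }
}

/// Target that takes over the whole calling sequence.
struct HookedTarget;

impl SketchTarget for HookedTarget {
    fn call_foreign_function(&self, name: String) -> String {
        format!("{}(vm);\n", name)
    }

    fn foreign_call_override(&self, name: &str, arg_count: usize) -> Option<String> {
        Some(format!("native_call_{}_args({});\n", arg_count, name))
    }
}
```

With this shape, existing targets compile unchanged, and only a backend that wants its own calling convention has to implement the extra method.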

An added bonus is that this PR also lets the documentation generator distinguish between native and foreign functions.

Any ideas?

kevinramharak Dec 28 '20 21:12

Is it just the generated code that builds and tears down the stack that you don't like? In that case I think the best course of action would be to find a way to eliminate establishing and destructing the stack frame in the code generator.

There is a way I can imagine to do this, but I can see some problems with how it could be implemented.

Here's my idea: in the asm.rs stage of compilation, optimize functions that

  1. Have a single statement which is a return statement.
  2. That return statement returns a function call (or foreign function call).
  3. The function call passes all of the arguments of the overall function definition (the function being optimized) in this call, in order.

In this specific case, no stack establishment / destruction is necessary. If we optimize this specific case away, then we don't need to change the Target implementation. Preferably, I want to avoid changing the Target impl as much as possible.
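The three conditions can be checked with a small pattern match. The sketch below uses a toy AST made up for illustration (`Expr`, `Stmt`, `Function`, and `forwarded_callee` are not oakc types), so it shows the shape of the check rather than how asm.rs would actually implement it.

```rust
/// Toy AST invented for this sketch; oakc's real IR types differ.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Num(f64),
    Call { name: String, args: Vec<Expr> },
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Return(Expr),
    Expr(Expr),
}

struct Function {
    name: String,
    params: Vec<String>,
    body: Vec<Stmt>,
}

/// Returns the callee's name if `f` only forwards its parameters to one call.
fn forwarded_callee(f: &Function) -> Option<&str> {
    // Conditions 1 and 2: exactly one statement, and it returns a call.
    let (callee, args) = match f.body.as_slice() {
        [Stmt::Return(Expr::Call { name, args })] => (name, args),
        _ => return None,
    };
    // Condition 3: the call passes every parameter, in declaration order.
    let forwards_all = args.len() == f.params.len()
        && args
            .iter()
            .zip(&f.params)
            .all(|(arg, param)| matches!(arg, Expr::Var(v) if v == param));
    if forwards_all {
        Some(callee.as_str())
    } else {
        None
    }
}
```

A function for which `forwarded_callee` returns `Some(name)` is a candidate for skipping stack establishment and destruction; anything else keeps the normal code path.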

I'm very wary of introducing new functionality, because I prefer robustness and stability over features / better code generation, but I think that could be a valid patch.

Is there a particular reason the stack establishments / destructions are causing problems? Or are they just slowing your programs down / too many CPU cycles for good performance?

adam-mcdaniel Dec 30 '20 02:12

Maybe I'm making this more complicated than needed. The problem I'm trying to solve is the following:

I used to be able to write something like this:

#[doc("Finds the first occurrence of `needle` in `haystack` up till `length` amount of words. Returns a pointer to the found `needle` location or `NULL` if not found.")]
fn C::memchr(haystack: &void, needle: char, length: num) -> &void {
    __ffi_pass_arg_by_value!(length);
    __ffi_pass_arg_by_value!(needle);
    __ffi_pass_arg_by_reference!(haystack);
    // this external function uses the native stack, which has been prepared by the calls above, and has no knowledge of the oak vm
    __c_memchr!();
    return __ffi_pass_return_value_as_reference!() as &void;
}

Here the __ffi_* functions would be the bridge between the Oak memory tape and the target calling convention. They would retrieve a value from the oak tape and push it on the native stack, or vice versa. Writing this boilerplate takes a decent amount of effort, but it could be generated if needed.
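To make the bridge idea concrete, here is a toy model in Rust. Everything in it is an assumption made for illustration: `ToyVm` stands in for the Oak tape, `NativeArgs` for the native stack, and `native_sub` for a foreign function that knows nothing about the VM. It covers passing by value only, not by reference.

```rust
/// Toy stand-in for the Oak VM: a tape of cells plus a stack pointer.
struct ToyVm {
    tape: Vec<f64>,
    sp: usize, // index of the next free cell
}

impl ToyVm {
    fn push(&mut self, value: f64) {
        self.tape[self.sp] = value;
        self.sp += 1;
    }

    fn pop(&mut self) -> f64 {
        self.sp -= 1;
        self.tape[self.sp]
    }
}

/// Stand-in for the native calling convention: a plain argument stack.
#[derive(Default)]
struct NativeArgs {
    values: Vec<f64>,
}

/// Bridge: moves the top tape value onto the native argument stack.
fn ffi_pass_arg_by_value(vm: &mut ToyVm, native: &mut NativeArgs) {
    native.values.push(vm.pop());
}

/// Bridge: moves a native result back onto the tape.
fn ffi_pass_return_value(vm: &mut ToyVm, result: f64) {
    vm.push(result);
}

/// A "foreign" function with no knowledge of the VM.
fn native_sub(a: f64, b: f64) -> f64 {
    a - b
}

/// The wrapper a compiler could generate instead of hand-written boilerplate.
fn call_native_sub(vm: &mut ToyVm) {
    let mut native = NativeArgs::default();
    ffi_pass_arg_by_value(vm, &mut native); // b, the last argument pushed
    ffi_pass_arg_by_value(vm, &mut native); // a
    let result = native_sub(native.values[1], native.values[0]);
    ffi_pass_return_value(vm, result);
}
```

Arguments are popped in reverse order, mirroring the order of the `__ffi_pass_arg_*` calls in the memchr example, and the result lands back on the tape where the caller expects it.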

With the extern syntax this no longer seems possible:

// when you declare a foreign function
extern fn __internal as native(value: num) -> num;

// Tir will generate the following function
fn native(value: num) -> num {
  return __internal!(value);
}

So whenever you call it like native(2), it sets up the arguments for native() and drops them afterwards, then sets up the same arguments for __internal!(value) and tears them down again. Because those __ffi_* functions were also foreign functions, the previous idea gets really complicated.

This PR aims to solve that problem by having the Target define how the calling convention should be translated. Maybe there is an easier way to solve this, but my idea was to be able to call extern fns from user code without having to do this translation step yourself.

kevinramharak Dec 30 '20 12:12