Requesting a response regarding the implementation of the Orc engine.
Hope I'm not bothering you too much.
I'm just wrapping up work on the Orc V1 implementation (it's taking longer than I thought), but I've come across some issues that I would like your feedback on before continuing with my implementation.
In the Orc V1 engine, the JIT stack (basically the execution engine) takes ownership of modules that are added to it. At first glance this doesn't seem like a problem: just pass the entire module to the Orc Engine and let it handle ownership. The real problem is the lifetime of the Context, because the module remains tied to the context that created it. That's why I think it would be a good idea to turn the context into a reference-counted smart pointer wrapper and give a clone of it to the module to keep the context alive. Then, when the module is added to the Orc Engine, the smart pointer to the context is moved into the Orc Engine's "keep alive" system. This could be done by placing ContextImpl into an Arc and forcing Context to be !Send and !Sync with a PhantomData<*const ()> field. That restriction could be relaxed later if Context ever needs to be thread-safe.
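To make that concrete, here's a rough sketch of the ownership layout I have in mind (the type names match existing inkwell internals, but the fields and the OrcEngine shape are illustrative, not the actual code):

```rust
use std::marker::PhantomData;
use std::sync::Arc;

struct ContextImpl(/* raw LLVMContextRef lives here */);

pub struct Context {
    inner: Arc<ContextImpl>,
    // Keeps Context !Send and !Sync for now; could be relaxed later if
    // contexts ever need to become thread-safe.
    _not_thread_safe: PhantomData<*const ()>,
}

pub struct Module {
    // A clone of the context's Arc keeps the context alive as long as the module.
    context: Arc<ContextImpl>,
    /* raw LLVMModuleRef lives here */
}

pub struct OrcEngine {
    // When a module is added, its Arc<ContextImpl> moves into this "keep alive"
    // list so the context can't be dropped before the engine is done with it.
    keep_alive: Vec<Arc<ContextImpl>>,
    /* JIT stack handle lives here */
}
```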
Another issue that I came across has to do with symbol resolution in the Orc Engine. As far as I'm aware, resolution goes like this: external symbols are searched across modules in reverse order of addition (LIFO), and when the symbol isn't found in any of those modules, each module's user-supplied symbol resolver is consulted, also in LIFO order. My goal is to have a final fallback global symbol table that symbols resolve against when all else fails. The solution I've come up with is for the Orc Engine to create a dummy module whose resolver points at the global table, for the express purpose of providing that fallback lookup table. This dummy module would be added as soon as the engine is created.
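To illustrate, here's a plain-Rust mock-up of the lookup order as I currently understand it (purely conceptual; the real resolution happens inside LLVM's Orc APIs, and the hash maps here are just stand-ins):

```rust
use std::collections::HashMap;

type Symbols = HashMap<String, u64>;

fn resolve(
    name: &str,
    modules: &[Symbols],   // per-module exported symbols, in insertion order
    resolvers: &[Symbols], // per-module user-supplied resolvers, in insertion order
    global: &Symbols,      // the fallback table behind the dummy module
) -> Option<u64> {
    // 1. Search module exports, most recently added first (LIFO).
    if let Some(addr) = modules.iter().rev().find_map(|m| m.get(name)) {
        return Some(*addr);
    }
    // 2. Search the user-supplied resolvers, also LIFO.
    if let Some(addr) = resolvers.iter().rev().find_map(|r| r.get(name)) {
        return Some(*addr);
    }
    // 3. Fall back to the global symbol table.
    global.get(name).copied()
}
```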
The remaining items aren't issues I need feedback on, but I figured I'd give you an update on them anyway.
I added lockfree_linked_list.rs to /src/. It's, as the name implies, a lock-free linked list. I use it in the Orc Engine implementation as a thread-safe, append-only list that keeps lazy compile callbacks alive in memory. I added it to the crate-level namespace because I figured it might be useful in other parts of the crate, especially Orc2.
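The core idea is roughly the following (a minimal sketch rather than the actual contents of lockfree_linked_list.rs): an append-only list where nodes are pushed with a CAS loop and only reclaimed when the whole list is dropped, which is all the keep-alive use case needs.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

struct Node<T> {
    value: T,
    next: *mut Node<T>,
}

pub struct KeepAliveList<T> {
    head: AtomicPtr<Node<T>>,
}

impl<T> KeepAliveList<T> {
    pub fn new() -> Self {
        Self { head: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Prepend a value; safe to call from multiple threads concurrently.
    pub fn push(&self, value: T) {
        let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut() }));
        loop {
            let head = self.head.load(Ordering::Acquire);
            unsafe { (*node).next = head };
            if self
                .head
                .compare_exchange(head, node, Ordering::Release, Ordering::Acquire)
                .is_ok()
            {
                break;
            }
        }
    }
}

impl<T> Drop for KeepAliveList<T> {
    fn drop(&mut self) {
        // Nodes are never removed individually, so they are only freed here.
        let mut cur = *self.head.get_mut();
        while !cur.is_null() {
            let boxed = unsafe { Box::from_raw(cur) };
            cur = boxed.next;
        }
    }
}
```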
I also added listeners.rs, which is for JITEventListeners. I don't know if these event listeners are exclusive to Orc or not, so I just added the module to the crate-level namespace as a precaution. They didn't seem Orc specific since they aren't in the Orc namespace.
I created a neat FunctionAddress abstraction as well as a fn_addr! macro. This simplifies creating LLVMOrcTargetAddresses from extern functions.
let add_addr0 = fn_addr!(jit_functions::add : (i32, i32) -> i32);
// optionally, you can include `fn`, `extern "C" fn`, or `unsafe extern "C" fn` before the parameter list:
let add_addr1 = fn_addr!(jit_functions::add : fn(i32, i32) -> i32);
let add_addr2 = fn_addr!(jit_functions::add : extern "C" fn(i32, i32) -> i32);
let add_addr3 = fn_addr!(jit_functions::add : unsafe extern "C" fn(i32, i32) -> i32);
// When the function has no arguments and no return type, you can omit the signature.
let say_hello_addr = fn_addr!(jit_functions::say_hello);
The fn_addr macro makes it much easier to safely get a function address without allowing users to pass in raw values.
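For a rough idea of how the macro works, here's a simplified sketch (not the literal implementation; FunctionAddress here is a stand-in newtype over the raw u64 target address, and this sketch only handles the bare-signature and no-signature forms, not the fn / extern "C" fn / unsafe extern "C" fn prefixes shown above):

```rust
pub struct FunctionAddress(pub u64);

macro_rules! fn_addr {
    // fn_addr!(path : (ArgTypes) -> Ret)
    ($func:path : ($($arg:ty),* $(,)?) $(-> $ret:ty)?) => {{
        // Coercing to a typed fn pointer checks the signature at compile time,
        // so callers can't smuggle in an arbitrary raw value.
        let f: extern "C" fn($($arg),*) $(-> $ret)? = $func;
        FunctionAddress(f as usize as u64)
    }};
    // fn_addr!(path) for functions with no arguments and no return value
    ($func:path) => {{
        let f: extern "C" fn() = $func;
        FunctionAddress(f as usize as u64)
    }};
}
```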
Here's another example from my sandbox:
extern "C" fn bar(num: i32) {
println!("extern \"C\" fn bar({num})");
}
let func_addr = engine.create_lazy_compile_callback(|engine| {
println!("Lazy Compile Callback was called.");
fn_addr!(bar : (i32))
}).unwrap();
Here you can see fn_addr in action, but you can also see how creating a lazy compile callback works. The lazy compile callback is a trampoline function internal to the JIT stack: when it's hit, it calls your provided function so you can lazily compile whatever is needed, and then the trampoline's address is replaced with the address you return. It's pretty neat, and it was a little complicated to make it work with a closure rather than a raw extern "C" function.
Lastly, I added an UnsafeOrcFn trait similar to the UnsafeFunctionPointer trait, and implemented it in a similar way. I wanted it to support up to 32 arguments for the Orc Engine so that the safe API has fewer limitations. Ideally, I would like a custom proc macro for generating these parameterized implementations; that way the argument names for the call functions would not be PascalCase and could instead be arg0: T0, arg1: T1, arg2: T2, arg3: T3, .... I didn't want to add a barely used proc macro to the crate without your blessing, though, so I didn't. And just in case you're curious, I tested compile times both with and without my 32-parameter macro, and there was no discernible difference.
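For context, the macro_rules! approach looks roughly like this (a trimmed-down sketch, not the actual UnsafeOrcFn code, which also generates call methods; it shows why the identifiers end up PascalCase, since the type parameters are the only names the macro has to reuse):

```rust
pub unsafe trait UnsafeOrcFn {}

macro_rules! impl_unsafe_orc_fn {
    ($($param:ident),*) => {
        // One impl per arity; the same PascalCase idents serve as the type
        // parameters (and, in the real trait, as the call-method argument names).
        unsafe impl<Output, $($param),*> UnsafeOrcFn
            for unsafe extern "C" fn($($param),*) -> Output {}
    };
}

impl_unsafe_orc_fn!();      // zero arguments
impl_unsafe_orc_fn!(A);     // one argument
impl_unsafe_orc_fn!(A, B);  // two arguments
// ...and so on, one invocation per arity, up to 32 parameters in my version.
```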
Oh, I almost forgot. There's also a problem with eager compilation that I believe is Windows-specific. External symbols from modules aren't exported for resolution, which means that if you add an eagerly compiled module, you can't look up functions inside it, making it functionally useless. When I looked into it, I recall reading something about the LLVM team not having had time to implement proper symbol export support for Windows, but I don't have a link at the moment. I think I mentioned this in the other thread. I didn't remove the ability to do eager compilation on Windows; I just added a disclaimer that it might not work. Until we can test in different environments, I have no way to know whether it's a Windows-exclusive issue or just something wrong with my build of LLVM.
Changing the entire crate to be Arc-based is less than ideal, as it means most types are no longer simple to copy and move around without explicit clones everywhere.
That can be a difficult thing to work around, since it means users would be expected to keep the context that created the module alive for as long as the Orc Engine is alive. It's undefined behavior if they destroy the context before the code has been compiled.
Can't we move the context into the orc engine? I suppose that may not be possible with the global context
I don't actually think it would be a problem to move ContextImpl into an Arc inside of Context, since Context isn't Clone or Copy. Module could then own an Arc<ContextImpl> to keep the context alive, and Module::get_context could just return a ContextRef derived from the Arc<ContextImpl>, with the ContextRef's lifetime tied to the Module to prevent use-after-free errors. I already tried swapping out ContextImpl with Arc<ContextImpl>, and there were only a few lines of code I needed to change to make it work. If this change were implemented, Module and Builder would no longer need lifetime parameters because they would keep the context alive themselves. Neither Module nor Builder is Clone or Copy, so adding the Arc wouldn't change anything there either.
One problem would be the create_builder and create_module functions on ContextImpl: they wouldn't have an Arc<ContextImpl> to pass to the Module or Builder.
Oh, and to answer your question, moving the context into the OrcEngine is one option, but then you wouldn't be able to use that context anymore. Another option is to create a thread-safe context for the OrcEngine.
Edit: Sorry for the excessive comments, I should have consolidated this all into one comment.
No worries. We can still dole out ContextRef which is a type we already provide
Or the orc module could take a ContextRef as input 🤔
Edit: as long as it's associated with the context lifetime, it doesn't even need a full context ref as a field
> as long as it's associated with the context lifetime, it doesn't even need a full context ref as a field
Yeah, that's a problem, really. You either restrict it to one context per orc engine, pass the context and module into the orc engine, or you make it so that you can keep the context alive. I think the easiest option is to make it so that you can keep the context alive, and it doesn't really break that much. It just means that you would have to clone the context instead of copying a ContextRef. Cloning is pretty cheap with Arc, so it's just a little extra syntax to create a new one.
I think none of the options are ideal, because no matter what, they restrict the user in some way or make software written with the orc engine unsafe. But the option that restricts the user the least is the Arc option.
Edit: It also removes the necessity for the lifetime in Module and Builder, but the lifetime could be kept anyway just so that you can still create a Module/Builder from a ContextRef.
I just discovered another interesting quirk. If you manually drop the Context before you drop the OrcEngine, there will be a status access violation. I think that the OrcEngine needs the context to be alive while it's being dropped. I added a mechanism to keep objects alive with the OrcEngine, but it's not an ideal solution.
One possible solution is to use a single context for the entire OrcEngine, then require users to make modules using that context, but that's not ideal because then you wouldn't be able to do multi-threaded compilation since the Context isn't thread-safe.
Yea, I expected as much, that's why I said we should tie it to the context lifetime somehow
It's not so simple to tie the OrcEngine to the context lifetime, though, because that would force all modules to be created with the same context, which isn't thread-safe, so you couldn't compile modules from multiple threads. Having lifetimes attached also makes the types more difficult to use. I'm really leaning toward the idea of putting ContextImpl into an Arc and having Module and Builder own their own copies; then, when the OrcEngine needs to keep the Context alive longer than the JIT stack, the Arc can be added to an internal list to prevent the Context from being dropped before the OrcEngine. That would also eliminate the need to attach a lifetime to Module and Builder, making them more versatile. And when the user requests the context from the module, instead of calling the internal LLVM function to create a ContextRef, it could clone the Arc<ContextImpl> it owns and create a new Context from that.
This would make it possible to compile Modules from multiple threads, and when some type needs to take ownership of a module and keep the context that created it alive, it can deconstruct the module and take its Arc<ContextImpl> to keep the context alive.
Context, Builder, and Module already do not implement Clone or Copy, so this doesn't really change much.
Edit: Also, I forgot to mention, we can use PhantomData<*const ()> to prevent types from being Send or Sync where we need to.
Side note: I was just looking at Context and ContextRef, and I think it would be much simpler if ContextImpl were made public and both Context and ContextRef deref'd to it, so that all the functions don't need to be mirrored. The current mirroring seems error-prone.
We already only minimally support multithreaded use cases. Full support has never been there because LLVM's complexity is too difficult to try and map fully, imo. Scoped threads are probably the best case if someone wants to do multithreading
Module also does impl Clone, because LLVM provides a fn for it at the LLVM level
> Module also does impl Clone, because LLVM provides a fn for it at the LLVM level
Adding an Arc into the module wouldn't prevent it from being Cloneable, though. I shouldn't have specified Clone, I should have specified just Copy, because Arc prevents a type from being Copy.
> Changing the entire crate to be Arc-based is less than ideal, as it means most types are no longer simple to copy and move around without explicit clones everywhere
I just re-read this, and I realized that you may have misunderstood me. I don't want to make the entire crate Arc based, only Context. Sorry for the confusion.
Hmm, I suppose that might work. But it'd need special handling for the global context
Why would it need special handling? Regarding the global context, I'm not familiar with its purpose or potential usage.
I'd like to continue work on the Orc Engine, but I can't until I understand how I should handle the global context. I'm not sure what you mean by special handling.
I think because it's behind a lock?
I haven't looked at the code in a while since I've been working on other stuff, but last time I checked, it didn't look like switching to Arc internally would really change anything. It doesn't change how Context is used, created, or destroyed. The only thing that it would really change is modules and maybe builders since they could own a handle to the context to keep it alive. That would mean that you couldn't create modules/builders with a ContextRef since the ContextRef doesn't have access to the original context object that the Module/Builder would hold a handle to.
Perhaps we could differentiate by having an "unowned" version of Module/Builder that could be created from a ContextRef while another version of Module/Builder would own a handle to the Context. This could easily be achieved through the use of the Deref trait, which means that functionality could be reused from module type to module type.
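Something like this, roughly (all names here are hypothetical, just to show the shape of the Deref-based reuse, not real inkwell types):

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::sync::Arc;

struct ContextImpl();

/// All shared module functionality lives here (illustrative stand-in).
pub struct ModuleCore(/* raw LLVMModuleRef lives here */);

impl ModuleCore {
    pub fn print_to_string(&self) -> String {
        /* would ask LLVM for the textual IR */
        String::new()
    }
}

/// "Unowned" module created from a ContextRef: it borrows, so it carries a lifetime.
pub struct UnownedModule<'ctx> {
    core: ModuleCore,
    _ctx: PhantomData<&'ctx ()>,
}

/// Owning module: keeps the context alive itself, so no lifetime parameter.
pub struct OwnedModule {
    core: ModuleCore,
    _context: Arc<ContextImpl>,
}

impl<'ctx> Deref for UnownedModule<'ctx> {
    type Target = ModuleCore;
    fn deref(&self) -> &ModuleCore { &self.core }
}

impl Deref for OwnedModule {
    type Target = ModuleCore;
    fn deref(&self) -> &ModuleCore { &self.core }
}
// Both module flavors get print_to_string() (and everything else) via Deref.
```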
@TheDan64, what exactly is the global context meant to be used for? I'd like to make the changes, but I just want to know how the global context is intended to be used so that I don't accidentally break its functionality. If it's meant to be used to make Modules, that could complicate things a bit if those Modules were used in a multi-threaded context. I think there could be problems with multi-threaded code that relies on a module built with the global context inside the Orc Engine.
But that could be documented. I don't really know how the Orc Engine interacts with the context that the module is built from.
I think it's just like a context llvm allocates by default, never dies, and is globally accessible
So yes, you could use it for anything a regular context is used for
Well, that certainly could complicate things with the Orc Engine, because the Orc Engine needs to be able to run multi-threaded code. But it only needs access to the Context while building the module. It would mean that only a single thread should be using the global context to build modules at any given time. But I suppose that's already a limitation of the global context.
I'll try to finish up the changes by Friday, just for Orc V1. Then I'll start working on Orc V2. I wanted to do both of them at the same time, but it took longer than I expected to finish the first one, and V2 is even more complex, so it will likely take me much longer.
Okay, I think I found a good middle ground between breaking backwards compatibility and keeping the context that the module was created from alive.
I've put ContextImpl inside of an Arc, which doesn't appear to break anything, and I have also made it so that you must pass the context that the module was made from into the add_module method of the OrcEngine.
pub fn add_module(
    &self,
    name: &str,
    module: Module<'_>,
    creation_context: &Context,
    compilation_mode: CompilationMode,
    locals: Option<SymbolTable<'_>>,
) -> Result<(), OrcError> {
    self.internal_add_module(
        name,
        module,
        creation_context,
        compilation_mode,
        locals.map(SymbolResolver::local).unwrap_or_else(SymbolResolver::none),
    )
}
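Usage would then look something like this (a hypothetical snippet; CompilationMode::Lazy is a stand-in for whatever the variant is actually called, and engine is an already-constructed OrcEngine):

```rust
let context = Context::create();
let module = context.create_module("sum");
// ...build the IR for `sum` with this context...
engine
    .add_module("sum", module, &context, CompilationMode::Lazy, None)
    .unwrap();
```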
This should work, but it requires diligence from the user not to pass in the wrong context, and it assumes they still have access to the creation context at all.
I believe that this is the best possible solution to the issue.
Another update: I said I would finish up today, but I guess that's not actually the case. It looks like I'll have to rework the symbol resolution API because I was wrong about how it works. I'm honestly regretting working on Orc V1, because I can tell that there are a lot of problems with it that are likely solved in Orc V2. It perhaps would have been better if inkwell never supported V1, but now it's too late because I've already written over ~~1000~~ 2000 lines of code for it.
On second thought, due to the quirks in Orc V1, it would perhaps be best if it weren't supported at all. I hate to say that because I've put weeks of work into this API, but at the same time I just don't think it's a good candidate for inkwell. It has too many problems, such as the symbol resolution quirks, or the context lifetime quirks. I think Orc V2 solves the context lifetime issue as well.
It's really up to you, though. If you want me to finish it up anyway, I can do so, but there will be inherent problems with the API that can't really be solved.
Global symbol resolution is broken, the symbol resolution order is backwards from what I thought (FIFO instead of LIFO), there's the context/module weirdness, and of course eager compilation is broken on Windows.
I believe that Orc V2 solves these issues.
In terms of maintenance, I believe it would be better if we simply didn't support Orc V1 at all. It would make it much easier to support Orc V2.
I'm sorry to hear that. I'm fine with just supporting v2 if you don't mind tossing your hard work. I suspect it'll eventually get deprecated anyway if it isn't already.
Re: being diligent with contexts, unfortunately this has always been a problem. I wish Rust had a way to generate unique types like it does with scoped lifetimes. But going that route would make the API too unwieldy, I think.