Crash when implementing a new function to get parent of a global value
Hi,
Describe the Bug
I'm currenlty implement a function get_parent for the GlobalValule in global_value.rs:
Here is my implementation:
pub fn get_parent(self) -> Module<'ctx> {
unsafe {
let m = LLVMGetGlobalParent(self.as_value_ref());
Module::new(m)
}
}
However, when running a code to call to this new function, the program always crashes with the error message:
(signal: 11, SIGSEGV: invalid memory reference)
The crash is done by the line: Module::new(m):
I suspect that the error is due to the data structure Module, which stores module in a Cell:
https://github.com/sbip-sg/llvm-rust/blob/70a425a2e5b56ab6dc2bfd75b9145028558c130f/inkwell/src/module.rs#L181-L201
pub struct Module<'ctx> {
data_layout: RefCell<oOption<DataLayout>>,
pub(crate) module: Cell<LLVMModuleRef>,
pub(crate) owned_by_ee: RefCell<Option<ExecutionEngine<'ctx>>>,
_marker: PhantomData<&'ctx Context>,
}
Can you advise how to implement such a get_parent function to avoid crashes?
To Reproduce
Not reproducible with the current code, since it's my own implementation.
Expected Behavior
We should allow Inkwell to return the parent Module of a GlobalValue.
LLVM Version (please complete the following information):
- LLVM Version: LLVM 14
- Inkwell Branch Used: llvm-14
Desktop (please complete the following information):
- OS: Linux Mint 21 (Ubuntu 22 - based)
I think this might be a much more complex problem than meets the eye since you'd need to know if the model is owned by an EE or not and I'm not sure that there's a good way to do that atm Furthermore, is LLVMGetGlobalParent returning a null ptr? You'd have to return None in that case. Not sure if that's possible here.
Thank you for the reply.
FYI, for the invalid memory reference error (SIGSEGV), I just added an assertion in the unit test function test_globals (line 952)
assert_eq!(global.get_parent(), module);
https://github.com/sbip-sg/llvm-rust/blob/main/inkwell/tests/all/test_values.rs#L952
This unit test doesn't involve any EE, but somehow the memory error can still occur.
It's likely that both the original module and the get_parent module are calling Drop leading to a double free
This is exactly the problem. If I I comment the code that check for empty EE before dropping the module, then all the unit tests can pass.
// Module owns the data layout string, so LLVMDisposeModule will deallocate it for us.
// which is why DataLayout must be called with `new_borrowed`
impl Drop for Module<'_> {
fn drop(&mut self) {
// if self.owned_by_ee.borrow_mut().take().is_none() {
// unsafe {
// LLVMDisposeModule(self.module.get());
// }
// }
// Context & EE will drop naturally if they are unique references at this point
}
}
Is it ok to remove that code?
No - that code needs to be there. I'm not sure that we can support LLVMGetGlobalParent at this time. Or we need to come up with a creative workaround.
Maybe one of solution is introducing generic that represents state into Module like:
struct Module<'ctx, State> {
/*omit*/
_phantom_state: PhantomData<State>,
}
then, we provide type-struct to toggle wether call LLVMDisposeModule or not like:
struct Origin;
struct Alias;
impl Drop for Module<'_, Origin> {
fn drop(&mut self) {
if self.owned_by_ee.borrow_mut().take().is_none() {
unsafe {
LLVMDisposeModule(self.module.get());
}
}
}
}
impl Drop for Module<'_, Alias> {
fn drop(&mut self) {}
}
This enables compile-time checking for these dropping behavior with:
impl Context {
pub fn create_module(&self, name: &str) -> Module<Origin> {
/* omit */
}
}
impl<'ctx> GlobalValue<'ctx>{
pub fn get_parent(self) -> Module<'ctx, Alias> {
/* omit */
}
}
It's better than introducing "dropping" flag into Module since compiler never allows us to mistakenly set invalid flag in methods like get_parent (but very API breaking).
This issue is closely related to #343 as it's also SEGV issue with already disposed Modules. Thus, maybe we also need lifetime refers to "original" Module to reduce chances where bad references are exposed to users, in compile-time.
When these two features introduced, API maybe like:
impl<'ctx, State> Module<'ctx, State>{
pub fn add_global(&self, /* omit */) -> GlobalValue<'ctx, 'module> {
/* omit */
}
}
impl<'ctx, 'module> GlobalValue<'ctx, 'module>{
pub fn get_parent(self) -> Module<'ctx, Alias<'module>> {
/* omit */
}
}
Then we can prevent (but I'm not confident it's enough):
- Double free in
DropofModule - Trying to look into already dropped memory (for example trying to print the content of
GlobalValueorFunctionFaluealive longer than originalModulelike #343)
@ytoml: Yes, I also encountered a similar issue when implementing the get_parent for FunctionValue on a customized LLVM. So the current problem is that the lifetime of FunctionValue and GlobalValue somehow not consistent with Module
I notice that from the BasicBlock struct, we can call a similar function get_parent without any issue:
https://github.com/taquangtrung/inkwell//blob/master/src/basic_block.rs#L73
But the BasicBlock data structure is much simpler than Module.
For your proposal, I'm not so sure, but it looks complicated when using 2 lifetime annotations.
Looking forward to discussing more with you and @TheDan64
Actually, BasicBlock can produce SEGV like:
let ctx = Context::create();
// basic_block lives longer than module.
let basic_block =
{
let module = ctx.create_module("test");
let fn_type = ctx.f64_type().fn_type(&[], false);
let fn_value = module.add_function("test_fn", fn_type, None);
ctx.append_basic_block(fn_value, "entry")
};
// Segmentation fault
let _fn_value = basic_block.get_parent();
Although BasicBlock::get_parent() indend to distinguish whether parent is exist or not, it seems to touch already deallocated memory...
Maybe it's same as this issue.
But dropping only FunctionValue where BasicBlock belongs to is OK:
let ctx = Context::create();
let module = ctx.create_module("test");
let basic_block =
{
let fn_type = ctx.f64_type().fn_type(&[], false);
let fn_value = module.add_function("test_fn", fn_type, None);
ctx.append_basic_block(fn_value, "entry")
};
let fn_value = basic_block.get_parent();
println!("{fn_value:?}"); // Some(...), because module is still alive.
drop(module);
println!("{fn_value:?}"); // Segmentation Fault
@taquangtrung I agree with you that multiple lifetime specifiers are complex. Actually single lifetime like this seems to work (I mistakenly think this cannot be compiled when writing comment above 😂 ):
impl<'ctx> Module<'ctx> {
pub fn add_function<'module>(&'module self, name: &str, ty: FunctionType<'ctx>, linkage: Option<Linkage>) -> FunctionValue<'module> {
/* omit */
let fn_value = unsafe { FunctionValue::new(/* omit */) }; // This FunctionValue can be any lifetime('static)
/* omit */
fn_value // limit lifetime to 'module rather than 'ctx
}
}
In this case, 'module guarantees lifetime equal to or shorter than original 'ctx (as long as implementation is correct).
Using the generic state vs origin types wont work. They'll solve the double free but not general invalid access if you drop the origin before state.
for FunctionValue/GlobalValue::get_parent to be safe at all, they'd have to keep a copy of the module around with them. I'm hesitant to do so for a number of reasons, including the fact that BasicBlock::get_parent would have no way to get a copy of the module since BasicBlocks are derived from Contexts not Modules. Personally I think we should just make the get_parent methods unsafe. I don't really see a great solution here atm
It's worth noting that this isn't totally a new issue. We've had the same problem with get_context which we tried to solve using a ContextRef<'ctx> type:
https://github.com/TheDan64/inkwell/blob/8221294448ad0810ae014c8f8e1aff96ddf47ecd/src/types/array_type.rs#L94
https://github.com/TheDan64/inkwell/blob/master/src/context.rs#L1043
While this does solve the lifetime issue for the generated context, it has some undesirable side effects, such as that types created with the context ref have a shorter than normal lifetime (connected to the contextref, not the original context). Creating a ModuleRef type would suffer from the same problem but might be a stop-gap solution if the calls it's used for are short lived
Actually, now that I'm thinking about it again, a way to fix ContextRef's problem would be to duplicate all of the Context methods onto ContextRef and tweak the lifetimes on them. This might work but would be a lot of code duplication..
Using the generic state vs origin types wont work. They'll solve the double free but not general invalid access if you drop the origin before state.
I think aliased Module lives no longer than original as it derives from FunctionValue/GlobalValue binded to original's lifetime (rather than context):
impl Context {
pub fn create_module(&self, name: &str) -> Module<Origin> {
/* omit */
}
}
impl<'ctx, State> Module<'ctx, State /* Origin or Alias */> {
pub fn add_function<'module>(&'module self, /* omit */) -> FunctionValue<'module> {
/* omit */
}
}
impl <'module> FunctionValue<'module> {
pub fn get_parent(self) -> Module<'module, Alias> {
/* omit */
}
}
Maybe it works? (even yes, it still leaves problem in BasicBlock...)
@TheDan64 @ytoml: I think BasicBlock has nothing to do with Module, since BasicBlock::get_parent() returns the parent FunctionValue
I notice the way Module and FunctionValue are defined are quite different: the field module is put into a Cell, while fn_valule is not (like below). Not sure if this difference makes handling Module more difficult?
#[derive(Debug, PartialEq, Eq)]
pub struct Module<'ctx> {
data_layout: RefCell<Option<DataLayout>>,
pub(crate) module: Cell<LLVMModuleRef>,
pub(crate) owned_by_ee: RefCell<Option<ExecutionEngine<'ctx>>>,
_marker: PhantomData<&'ctx Context>,
}
#[derive(PartialEq, Eq, Clone, Copy, Hash)]
pub struct FunctionValue<'ctx> {
fn_value: Value<'ctx>,
}
I think BasicBlock has nothing to do with Module, since BasicBlock::get_parent() returns the parent FunctionValue
yes but BasicBlock::get_parent would have to have module lifetime and/or pointer knowledge to create the FunctionValue, which it does not
I think aliased Module lives no longer than original as it derives from FunctionValue/GlobalValue binded to original's lifetime (rather than context):
The generic type is no longer helpful once you base the lifetime on the module. Also, a slight problem is that it'll have to keep track of both context and module lifetimes, to return context related data. But could work
The generic type is no longer helpful once you base the lifetime on the module.
Maybe it's still usable because the main reason I proposed status type generics was to prevent double freeing in Drop. Module lifetime just stands for invalidating GlobalValues or FunctionValues after original Module is dropped, and it still occurs double freeing problem when aliasing Modules are instanciated (even with proper lifetime, Drop will be called more than once).
I think I've sketched out a fix for this using two lifetimes but I'm considering punting it until 0.2.0 since it'll be a fairly big breaking change
WIP branch: https://github.com/TheDan64/inkwell/compare/module-lifetimes