inkwell
inkwell copied to clipboard
Segmentation fault when calling Builder::build_global_string
When playing around with inkwell before starting my project in full, I decided to try to make a hello world program using it. This is the code that I came up with
extern crate inkwell;
use inkwell::context::Context;
use inkwell::targets::{InitializationConfig, Target};
use inkwell::AddressSpace;
use inkwell::module::Linkage;
fn main() {
Target::initialize_native(&InitializationConfig::default()).unwrap();
let ctx = Context::create();
let module = ctx.create_module("program");
let builder = ctx.create_builder();
let str_type = ctx.i8_type().ptr_type(AddressSpace::Generic);
let i32_type = ctx.i32_type();
let global = builder.build_global_string("Hello, World!", "message");
let printf_type = i32_type.fn_type(&[&str_type], true);
let printf = module.add_function("printf", &printf_type, Some(&Linkage::ExternalLinkage));
let main_fn_type = i32_type.fn_type(&[&i32_type, &str_type.ptr_type(AddressSpace::Generic)], false);
let main_fn = module.add_function("main", &main_fn_type, None);
let block = ctx.append_basic_block(&main_fn, "entry");
builder.position_at_end(&block);
builder.build_call(&printf, &[&global], "", false);
builder.build_return(Some(&i32_type.const_int(0, false)));
println!("{:?}", module.print_to_string());
}
It seems like it should work, however there's a segmentation fault when I call builder.build_global_string. The same happens with builder.build_global_string_ptr
Environment
Operating system: Arch Linux Kernel version: 4.15.3-1-ARCH LLVM version: 5.0.1-2 Inkwell version: 0.1.0#eafca272 rustc -vV output:
rustc 1.25.0-nightly (3ec5a99aa 2018-02-14)
binary: rustc
commit-hash: 3ec5a99aaa0084d97a9e845b34fdf03d1462c475
commit-date: 2018-02-14
host: x86_64-unknown-linux-gnu
release: 1.25.0-nightly
LLVM version: 6.0
Hi, thanks for reporting this issue!
I'm not quite sure why this is happening - it seems like rustc doesn't allow for a name field at all and just passes in a null ptr but trying that didn't help me either. I will have to dig deeper into it.
Also worth noting that Inkwell only supports LLVM 3.7 officially at the moment (I have a PR to get multiple version support going, but it's turning out to be a massive pain), though regardless I am seeing this issue in 3.7 as well.
Sorry that this is blocking you! In the mean time I bet you could create a global i8 pointer to achieve the same effect? A C/LLVM string is just that (and maybe null terminated)
That would work, though how would I fill the global with my intended value? Glancing at the API docs it seems that there's a barrier between rust types like str, and values created from them, and anything in inkwell that has an impl of BasicValue. So I'm not sure how I would translate between the two
I think something like this should work:
let context = Context::create();
let module = context.create_module("my_mod");
let my_str = "Hello, World";
let i8_type = context.i8_type();
let i8_array_type = i8_type.array_type(my_str.len() as u32);
let global_string = module.add_global(&i8_array_type, Some(AddressSpace::Generic), "message");
let mut chars = Vec::with_capacity(my_str.len()); // Edit: Not sure if len is unicode chars or bytes, but I'm assuming it's bytes here
for chr in my_str.bytes() {
chars.push(i8_type.const_int(chr as u64, false));
}
let const_str_array = i8_array_type.const_array(chars.as_ref());
global_string.set_initializer(&const_str_array);
Alternatively, you could put it behind an i8_ptr_type
but then you should be sure to to have a null byte at the end so you know when the pointer ends. (This is essentially a C String - the above example is closer to what rust does)
It seems like I can't use an array because the signature of printf as I have it at the moment expects variadic i8*, and that type doesn't match [i8 x 13]*. I know how I can get a null PointerValue, but how can I fill that with the actual string I'm trying to work with?
In my edit to my previous comment, I mentioned you could just use almost identical code but with an i8_pointer_type, and as long as you be sure to add a null byte('\0'
) it should work. Might need to do a store or something, though. I'll try and whip up an example tomorrow
Okay, so, I don't have a full example but I updated the build_cast method after reading some LLVM docs and you should be able to cast an array pointer to an i8 pointer:
let i8_ptr_type = builder.build_cast(InstructionOpcode::BitCast, &const_str_array, &i8_ptr_type);
Though I think const_str_array
is a [n x i8]
and needs to be a [n x i8]*
in which case you might need to go one pointer level higher when creating it
@Redrield This seems to be an LLVM specific issue. This happens because the global string needs to be created in a function (I don't really get why, since it's global, not local...) but see this SO post which seems to showcase an identical problem: https://stackoverflow.com/questions/8966890/createglobalstringptr-crashes-when-i-dont-create-method-basic-block-for-main
I'm going to spend some more time thinking about whether or not there's anything in inkwell we can do about it... but I might just have to end up closing this issue as it's LLVM specific (Maybe I'll end up collecting a list of LLVM gotchas...). To get the code in your initial post you'd have to move the global string creation to after the main function is generated like so:
fn main() {
Target::initialize_native(&InitializationConfig::default()).unwrap();
let ctx = Context::create();
let module = ctx.create_module("program");
let builder = ctx.create_builder();
let str_type = ctx.i8_type().ptr_type(AddressSpace::Generic);
let i32_type = ctx.i32_type();
let printf_type = i32_type.fn_type(&[&str_type], true);
let printf = module.add_function("printf", &printf_type, Some(&Linkage::ExternalLinkage));
let main_fn_type = i32_type.fn_type(&[&i32_type, &str_type.ptr_type(AddressSpace::Generic)], false);
let main_fn = module.add_function("main", &main_fn_type, None);
let block = ctx.append_basic_block(&main_fn, "entry");
builder.position_at_end(&block);
let global = builder.build_global_string("Hello, World!", "message");
builder.build_call(&printf, &[&global], "", false);
builder.build_return(Some(&i32_type.const_int(0, false)));
println!("{:?}", module.print_to_string());
}
Maybe Inkwell should require a function value to be passed into the create string global method even if it isn't actually used... Definitely would need to document that
I ended up just stealing the stuff from librustc_trans which works. Seems there they associate builders with a block directly. The way a builder is constructed makes it linked with the block it is building for, and they make new builders per function. Maybe an architectural change to that would be of use here?
Dunno, seems odd since you can have multiple blocks per function, but maybe. I'll have to take a look at librustc_trans.
It looks like librustc_trans also supports creating a builder without a function, hmm: https://github.com/rust-lang/rust/blob/master/src/librustc_codegen_llvm/builder.rs#L61-L84
@TheDan64 "This seems to be an LLVM specific issue. This happens because the global string needs to be created in a function" The global needs to be created in a Module, though not a Function. The way the IRBuilder finds the module is by looking at the IRBuilder's current position and walking parents from BasicBlock->Function->Module. When you haven't set the position yet, it will perform undefined behaviour in C++ land as it starts with no basic block.
IMO it's pretty uncommon to use an IRBuilder without setting its position first. If you want to support it, the builder will create detached instructions (see #93) and only some of its methods are safe for that. All the type and constant (non-global) creation are. Some, like, build_malloc and build_global_string won't be. Or you could just mandate that IRBuilders must have a position at all times instead of trying to split the functions by whether they support creating detached IR.
Unfortunately, either way, you can't use the IRBuilder's utility functions like creating a global string until after you've created a basic block somewhere in the module you want to create your string in. (Really, LLVM should split this up and expose the utility for creating a string in a module, and then have IRBuilder use that.)