rust-bindgen

Avoid using \u{1} prefixes in symbol names where possible

Open bjorn3 opened this issue 1 year ago • 9 comments

This prefix disables name mangling in LLVM, but other codegen backends don't support this. We also don't guarantee that it will remain working. As discussed in https://rust-lang.zulipchat.com/#narrow/stream/238009-t-compiler.2Fmeetings/topic/.5Bweekly.5D.202024-09-19/near/471499857 we may choose to add a #[link_name("foo", verbatim)] flag or similar to get the same effect and then deny all usage of \u{1} over an edition boundary.
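For readers unfamiliar with the marker: it is the single byte 0x01 at the start of the symbol string, so `\u{1}` and `\x01` in a Rust string literal denote the same thing. A minimal sketch of detecting and stripping it follows; the helper name `split_verbatim_marker` is made up for illustration and is not part of rust-bindgen or rustc.

```rust
/// Hypothetical helper (not a rust-bindgen or rustc API): splits a link name
/// into `(has_marker, name_without_marker)`.
fn split_verbatim_marker(link_name: &str) -> (bool, &str) {
    match link_name.strip_prefix('\u{1}') {
        // The marker was present; LLVM would emit `rest` without mangling it.
        Some(rest) => (true, rest),
        // No marker; the name goes through the usual platform mangling.
        None => (false, link_name),
    }
}
```

For example, `split_verbatim_marker("\u{1}_foo")` yields `(true, "_foo")`, while a plain `"foo"` comes back unchanged with `false`.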

bjorn3 avatar Sep 19 '24 14:09 bjorn3

See also https://github.com/rust-lang/rustc_codegen_cranelift/issues/1520 for an example where this causes linking to fail with cg_clif as the \u{1} prefix gets passed straight to the linker.

bjorn3 avatar Sep 19 '24 14:09 bjorn3

hey Bjorn :)

I'm all in on replacing this hacky solution with something less brittle. Is there something we could do before #[link_name("foo", verbatim)] is actually implemented?

pvdrz avatar Sep 19 '24 20:09 pvdrz

Most of the time it should be possible to reverse the symbol mangling in rust-bindgen back to whatever would have been written in C before passing it to #[link_name], and thus avoid \u{1} in those cases. For the remaining cases, where the symbol name doesn't match something that regular C mangling would produce, keeping \u{1} until #[link_name("foo", verbatim)] is implemented is the best you can do, I think.
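A rough sketch of that idea, assuming only one decoration rule: some targets (Mach-O, for instance) prepend an underscore to C symbols, while others (ELF) leave them alone. The names `CDecoration`, `link_name_for`, and `is_c_identifier` are hypothetical and not bindgen internals; real code would also have to handle calling-convention suffixes and similar decorations.

```rust
/// Assumed, simplified model of how a target decorates plain C symbols.
#[derive(Clone, Copy)]
enum CDecoration {
    /// e.g. ELF targets: `foo` stays `foo`.
    None,
    /// e.g. Mach-O targets: `foo` becomes `_foo`.
    LeadingUnderscore,
}

/// Hypothetical check: is `s` something that could be written as a C identifier?
fn is_c_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c == '_' || c.is_ascii_alphabetic())
        && chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
}

/// Hypothetical helper: returns the string to put in `#[link_name = "..."]`
/// for a symbol name as reported by the C/C++ front end.
fn link_name_for(symbol: &str, decoration: CDecoration) -> String {
    // Undo the decoration that the backend will re-apply on its own.
    let undecorated = match decoration {
        CDecoration::None => Some(symbol),
        CDecoration::LeadingUnderscore => symbol.strip_prefix('_'),
    };
    match undecorated {
        // Regular C mangling reproduces the symbol, so no marker is needed.
        Some(name) if is_c_identifier(name) => name.to_owned(),
        // Otherwise fall back to the verbatim marker.
        _ => format!("\u{1}{}", symbol),
    }
}
```

With this model, `link_name_for("_foo", CDecoration::LeadingUnderscore)` gives `"foo"`, while a symbol that regular C mangling cannot produce, such as `"?f@@YAHXZ"`, keeps the `\u{1}` prefix. The identifier check is deliberately conservative.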

bjorn3 avatar Sep 19 '24 20:09 bjorn3

> Most of the time it should be possible to reverse the symbol mangling in rust-bindgen back to whatever would have been written in C

That's true for C, but not so easy for C++, right?

emilio avatar Sep 25 '24 19:09 emilio

Outside of Windows it is possible to unmangle C++ names into something that can be written in C (except for ABI tags, as those need a dot, but that already works fine with #[link_name] without \x01 anyway). And for Windows C++ it is fine by me to keep using \x01 for now if necessary.

bjorn3 avatar Sep 25 '24 19:09 bjorn3

This seems to be caused by aws-lc-rs using generated_name_override/--prefix-link-name, which makes names_will_be_identical_after_mangling return false because the item name and the symbol name it got from libclang are different.

bjorn3 avatar Mar 18 '25 15:03 bjorn3

This seems to fix it:

```diff
diff --git a/bindgen/codegen/mod.rs b/bindgen/codegen/mod.rs
index dd1486df..8603aaf5 100644
--- a/bindgen/codegen/mod.rs
+++ b/bindgen/codegen/mod.rs
@@ -4242,7 +4242,16 @@ impl CodeGenerator for Function {
                     Some(abi),
                 )
             {
-                attributes.push(attributes::link_name::<false>(link_name));
+                if utils::names_will_be_identical_after_mangling(
+                    self.mangled_name().unwrap(),
+                    link_name,
+                    Some(abi),
+                ) {
+                    attributes.push(attributes::link_name::<true>(link_name));
+                } else {
+                    attributes.push(attributes::link_name::<false>(link_name));
+                }
+
                 has_link_name_attr = true;
             }
         }
```

bjorn3 avatar Mar 18 '25 15:03 bjorn3

Any progress on this?

Leandros avatar Nov 21 '25 08:11 Leandros

I haven't continued working on this yet.

bjorn3 avatar Nov 21 '25 10:11 bjorn3