rust-bindgen

Generate bindings to explicitly instantiated function templates

Open fitzgen opened this issue 7 years ago • 9 comments

We can't invoke the C++ compiler to generate new instantiations of templates, so I don't think we even try to keep track of them currently.

However, we could track explicit instantiations of template functions (section 14.7.2, [temp.explicit], in the C++11/C++14 standard) and make bindings to those.

For example, if given

template<class T> void foo(T t) { /* ... */ }
template void foo(char);
template void foo(int);

We could generate something like:

extern "C" {
    #[link_name = "..."]
    pub fn foo_char(t: ::std::os::raw::c_char);

    #[link_name = "..."]
    pub fn foo_int(t: ::std::os::raw::c_int);
}

But either way, we can't create new template instantiations, so we shouldn't even try to generate bindings to generic functions based on the template, or to instantiations for which we haven't seen an explicit instantiation.

Feb 07 '17 23:02 fitzgen

Perhaps foo1, foo2, ... would be better than trying to mangle the template argument type name into the symbol. As the author of a C++ symbol demangling library, I can attest that mangling is more complicated than it seems at first blush.

Feb 08 '17 17:02 fitzgen

Perhaps foo1, foo2, ... would be better than trying to mangle the template argument type name into the symbol

So long as foo2 consistently refers to foo<int> (or whatever), even if the source gets rearranged or foo1/foo<char> gets deleted or replaced.

I suppose it doesn't matter that much if the rearrangement always causes a compile failure on the Rust side, since editing the code should be straightforward if we assume that the C++ FFI code is constrained to a small area which has stable interfaces to everything else in Rustland.
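
To make the naming trade-off concrete, here is a minimal sketch of deriving the binding name from the template arguments rather than from declaration order. The helper binding_name is hypothetical and not part of bindgen; it only illustrates why a type-derived name stays stable when the header is rearranged.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of bindgen): build a binding name from the
// template name and the spelling of each template argument, so the result
// does not depend on the order of instantiations in the header.
std::string binding_name(const std::string& base,
                         const std::vector<std::string>& template_args) {
    std::string name = base;
    for (const std::string& arg : template_args) {
        name += '_';
        for (char c : arg) {
            // Keep letters and digits; replace everything else
            // (spaces, '*', '&', '<', ...) with '_'.
            name += std::isalnum(static_cast<unsigned char>(c)) ? c : '_';
        }
    }
    return name;
}
```

With this sketch, binding_name("foo", {"int"}) gives foo_int no matter where the instantiation appears. The weakness is also visible: "int*" and "int&" both collapse to the same name, which is the kind of complication that makes a full mangling scheme harder than it first looks.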

Mar 17 '17 18:03 jsgf

I'm running into a situation where a library declares some template functions in .h without defining them, then defines and instantiates the known monomorphizations in .cpp (one that uses float and another using double). Since other .cpp files in that same library can refer to the header file without needing them to be instantiated, I would assume that Rust-usable bindings could be generated for them.

Things I've tried:

  • Forward declaring specific functions in wrapper.hpp like: void myFun(float* arr); void myFun(double* arr); (without template keyword): Generates 2 bindings per myFun with specific function names, linker error running tests - undefined reference to (name of function) - implies that the link_name generated in bindings.rs is wrong.
  • Same thing, but prefixed with template keyword: Fails generating bindings: multiple wrapper.hpp:47:15: error: explicit instantiation of undefined function template 'myFun', err: true
  • Same thing, but now prefixed with extern template - fails to generate any bindings for those functions, and subsequently fails rust compilation due to unresolved imports.

Aug 08 '19 13:08 jeffvandyke

Forward declaring specific functions in wrapper.hpp like: void myFun(float* arr); void myFun(double* arr); (without template keyword): Generates 2 bindings per myFun with specific function names, linker error running tests - undefined reference to (name of function) - implies that the link_name generated in bindings.rs is wrong.

That doesn't work: a plain function declaration is not the same entity as a template instantiation, and since those functions aren't defined anywhere, the linker errors are expected.

Same thing, but prefixed with template keyword: Fails generating bindings: multiple wrapper.hpp:47:15: error: explicit instantiation of undefined function template 'myFun', err: true

That's a clang error, which means that it's not valid C++. You should be able to explicitly instantiate some of them, but it seems that by the time you get there myFun is not defined, so you also need to include the file that defines the template.

That would still not work, but is fixable. That's what this issue is about. I'm happy to mentor it.
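
As a sketch of the ordering that clang accepts, assuming a myFun that takes a pointer as in the report above (the body is a made-up stand-in, since the real one is not shown in this thread):

```cpp
// What the library header would contain: a declaration only.
template <class T>
void myFun(T* arr);

// What the library .cpp would contain: the definition, which must be
// visible before any explicit instantiation definition.
template <class T>
void myFun(T* arr) {
    arr[0] = arr[0] + arr[0];  // hypothetical body, for illustration only
}

// Explicit instantiation definitions. Placing these where only the
// declaration is visible is what triggers the "explicit instantiation of
// undefined function template" error.
template void myFun<float>(float* arr);
template void myFun<double>(double* arr);
```

So a wrapper.hpp that uses the template keyword would also have to pull in the file that defines myFun, not just the header that declares it.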

Same thing, but now prefixed with extern template - fails to generate any bindings for those functions, and subsequently fails rust compilation due to unresolved imports.

I think that would also work with this issue fixed. It just needs to be fixed.

Aug 09 '19 11:08 emilio

Thanks, that sounds like the cleanest solution. It looks like (and it seems reasonable that) instantiated function templates just have a different linker name than an otherwise identical free function with the same name and type signature, which is why my option 1 failed.
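
A small sketch of why the two are different symbols: C++ allows a free function and a template instantiation with the same name and parameter list to coexist, so they cannot share a linker name. The bodies here are made up so the two can be told apart. Under the Itanium C++ ABI (used by GCC and Clang on most non-Windows targets) I would expect the names noted in the comments; I have not checked MSVC.

```cpp
// Function template plus an explicit instantiation for float.
template <class T>
void myFun(T* arr) { arr[0] = T(1); }   // hypothetical body: writes 1

template void myFun<float>(float* arr); // expected Itanium name: _Z5myFunIfEvPT_

// Free function with the same name and parameter type. This is a separate
// entity from myFun<float>, so it gets its own symbol.
void myFun(float* arr) { arr[0] = 2.0f; } // expected Itanium name: _Z5myFunPf
```

Calling myFun(a) with a float array picks the free function (overload resolution prefers non-templates), while myFun<float>(a) picks the instantiation. A binding whose link_name is computed for the free function therefore points at a symbol the library never defines.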

I'm not sure (yet) exactly how bindgen deals with different compilers, but while we could inspect source code for how gcc and clang mangle instantiated template functions (with extern template without definition, or template with definition), is it known how MSVC mangles them?

Aug 09 '19 12:08 jeffvandyke

I suppose that in the meantime, including a .cpp file that defines C-bindable wrapper functions which call the template functions might even result in the same library code, given inlining, and have no overhead when used from Rust.
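
A minimal sketch of that workaround. The wrapper names my_fun_f32 and my_fun_f64 are made up, and the template body is a stand-in so the example is self-contained; in the real setup the shim would include the library header and link against the library's own instantiations.

```cpp
// Stand-in for the library's template (the real definition lives in the
// library's .cpp file; this body is hypothetical).
template <class T>
void myFun(T* arr) {
    arr[0] = arr[0] + arr[0];
}

// C-linkage wrappers: one unmangled symbol per instantiation that Rust
// needs. bindgen can generate ordinary extern "C" bindings for these.
extern "C" void my_fun_f32(float* arr) { myFun<float>(arr); }
extern "C" void my_fun_f64(double* arr) { myFun<double>(arr); }
```

On the Rust side these would show up as plain functions taking *mut f32 and *mut f64, with no template handling needed.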

Aug 13 '19 19:08 jeffvandyke

Correct me if I'm wrong, but I think this can't be implemented in bindgen at the moment due to a limitation of libclang.

I believe there is currently no cursor kind for template instantiations in libclang. When looking at the clang AST, you get the following for a function template declaration:

`-FunctionTemplateDecl 0x540ce0 <line:14:1, col:35> col:24 foo
  |-TemplateTypeParmDecl 0x540aa0 <col:10, col:16> col:16 referenced class depth 0 index 0 T
  |-FunctionDecl 0x540c38 <col:19, col:35> col:24 foo 'void (T)'
  | |-ParmVarDecl 0x540b40 <col:28, col:30> col:30 t 'T'
  | `-CompoundStmt 0x540e00 <col:33, col:35>
  |-FunctionDecl 0x55e528 <col:19, col:35> col:24 foo 'void (char)'
  | |-TemplateArgument type 'char'
  | | `-BuiltinType 0x4f8fb0 'char'
  | `-ParmVarDecl 0x55e460 <col:28, col:30> col:30 t 'char':'char'
  `-FunctionDecl 0x55e868 <col:19, col:35> col:24 foo 'void (int)'
    |-TemplateArgument type 'int'
    | `-BuiltinType 0x4f9010 'int'
    `-ParmVarDecl 0x55e7a8 <col:28, col:30> col:30 t 'int':'int'

This includes the explicit instantiations of the foo function as FunctionDecls. However, when dumping the libclang CXCursor, we get something that looks like this:

(FunctionTemplate
    (TemplateTypeParameter)
    (ParmDecl
        (TypeRef)
    )
    (CompoundStmt)
)

The template instantiations are not included in the children, in contrast to the clang AST.

If I'm right about this, we would need to propose changes to libclang before support for explicitly instantiated function templates can be implemented in bindgen.

Perhaps slightly off-topic, but would it theoretically be possible to support explicitly instantiated method templates?

For example, given this:

template <typename T>
class Test {
public:
  int bar(T x);
};

extern template int Test<char>::bar(char x);
extern template int Test<int>::bar(int x);

I think we might be able to generate something like this (although it might require a more sophisticated mangling scheme):

use std::os::raw::{c_char, c_int};

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct test<T> {
    // The C++ class has no data members, so only a placeholder byte is needed.
    pub _address: u8,
    pub _phantom_0: ::std::marker::PhantomData<::std::cell::UnsafeCell<T>>,
}

extern "C" {
    // `bar` returns `int` in C++, so both bindings return `c_int`.
    #[link_name = "<test char bar mangled name>"]
    pub fn test_char_bar(
        this: *mut test<c_char>,
        x: c_char,
    ) -> c_int;

    #[link_name = "<test int bar mangled name>"]
    pub fn test_int_bar(
        this: *mut test<c_int>,
        x: c_int,
    ) -> c_int;
}

impl test<c_char> {
    pub unsafe fn bar(
        &mut self,
        x: c_char,
    ) -> c_int {
        test_char_bar(self, x)
    }
}

impl test<c_int> {
    pub unsafe fn bar(
        &mut self,
        x: c_int,
    ) -> c_int {
        test_int_bar(self, x)
    }
}

This would require similar support from libclang. A proposal for this has already been made: https://reviews.llvm.org/D43763. Internally this is considered template specialization in clang, but despite the lack of such specialization in Rust, I don't think there is a problem as long as there are no specializations that change the fields of the class, like in https://github.com/rust-lang/rust-bindgen/issues/24.

Sep 26 '23 14:09 Danacus

From the Rust side there's a limitation due to the lack of specialization, but we should be able to support a small subset of it, as you mention. Sadly, I'm not that knowledgeable about C++ templates or about what would be needed to properly support this in bindgen.

Sep 27 '23 19:09 pvdrz

I can't say I'm very knowledgeable about C++ templates myself either. I'm mostly just curious about what's possible.

I do wonder now if the lack of specialization is necessarily a problem for function/method templates. Consider the example from #24 with some methods added:

template <typename T>
class Test {
  int bar(char x);
};

template<>
class Test<int> {
  int foo;

public:
  int bar_a(int x);
};

template<>
class Test<float> {
  float foo;

public:
  int bar_b(float x);
};

bindgen would currently generate a single type that represents the generic one, and the fields from the specializations would not be accessible. However, I don't think that's a problem for the methods, since you could still generate something like this, unless I'm mistaken:

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Test<T> {
    pub _address: u8,
    pub _phantom_0: ::std::marker::PhantomData<::std::cell::UnsafeCell<T>>,
}

extern "C" {
    // bindings for methods of all instantiations with mangled link names
}

impl<T> Test<T> {
    fn bar(&mut self, x: u8) -> i32 { todo!() /* call the matching binding */ }
}

impl Test<i32> {
    fn bar_a(&mut self, x: i32) -> i32 { todo!() /* call the matching binding */ }
}

impl Test<f32> {
    fn bar_b(&mut self, x: f32) -> i32 { todo!() /* call the matching binding */ }
}

But I can tell that this only really works in limited cases. If bar, bar_a and bar_b all had the same signature, int bar(int x);, there would be duplicate definitions. Maybe it would make sense to use a trait for the bindings, e.g.

pub trait Bar {
    fn bar(&mut self, x: i32) -> i32;
}

impl<T> Bar for Test<T> { /* ... */ }
impl Bar for Test<i32> { /* ... */ }
impl Bar for Test<f32> { /* ... */ }

which would not be valid, because the impls overlap and Rust lacks impl specialization, and it's probably not something bindgen should ever generate anyway.

I'm sorry if my brainstorming here is a bit off-topic. I do feel like there is indeed a subset that could be supported, which might be interesting, but I can't really define what this subset would be exactly due to how complex C++ templates are and how little I know about them. Perhaps any non-specialized template class, method or function could be supported in theory?

Sep 28 '23 10:09 Danacus