Support non-extern-C symbols

Open eyalroz opened this issue 3 years ago • 16 comments

This is a C++ library for working with shared objects, but it only supports unmangled, C-style functions. That means it doesn't serve its primary function. The library must support any C++ function one can load from a shared object. Naturally, this is ABI-specific, but that's either for the user to configure and build accordingly, or potentially a case for multi-ABI support. The latter is much more complicated, and would be a feature request in itself, but function symbols should definitely be looked up by their mangled name, if they're not extern-C.

eyalroz avatar May 13 '22 12:05 eyalroz

Current status

Hello, I'm currently working on this. The goal is to add a feature to dylib for loading C++ symbols.

Linux and MacOS

Variables

I can now access a mangled variable within a namespace:

dylib lib("lib.so");
auto ver = lib.get_variable<double>("driver::infos::version");

Functions

To mangle functions within a namespace and/or in an overload situation, I need access to each function parameter type. But currently, the template argument you need to pass to get_function is the following:

dylib lib("lib.so");

// get_function<T> for T = [module *(const char *)]
auto mod = lib.get_function<module *(const char *)>("driver::factory");

To be able to iterate over variadic template arguments, I temporarily replaced the current syntax with the following one:

// old syntax
// get_function<T>
auto mod = lib.get_function<module *(const char *)>("driver::factory");

// temporary new syntax
// get_function<Ret, Args...>
auto mod = lib.get_function<module *, const char *>("driver::factory");

Do you know if there is a way to "decompose" a function template argument to get its return value as Ret and its arguments as Args... ?

Windows

TODO

martin-olivier avatar May 30 '22 09:05 martin-olivier

Update

Linux and MacOS

Variables

I can now access a mangled variable within a namespace:

dylib lib("lib");
auto ver = lib.get_variable<double>("driver::infos::version");

Functions

I can now access a mangled function within a namespace with any types of arguments:

dylib lib("lib");

auto mod = lib.get_function<module *, const char *>("driver::factory");
auto set_inst = lib.get_function<void, module &&>("driver::instance::set");
auto print = lib.get_function<void, std::ostream &, const std::string &>("driver::tools::print");

Windows

TODO (Next step)

Question

Do you know if there is a way to "decompose" a function template argument to get its return value as Ret and its arguments as Args... ?

martin-olivier avatar Jun 03 '22 10:06 martin-olivier

Do you know if there is a way to "decompose" a function template argument to get its return value as Ret and its arguments as Args... ?

Well, std::result_of for the return type; and you can use this hack for the parameters.
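The decomposition asked about here can also be done with a partial specialization on the function type, which yields both Ret and Args... in one step. The following is a minimal sketch, not dylib code: function_traits, module_t (a stand-in for the module type in the examples above) and factory_traits are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical helper (not part of dylib). The primary template is left
// undefined so that non-function types fail to compile.
template <typename T>
struct function_traits;

// Partial specialization: matches a plain function type Ret(Args...).
template <typename Ret, typename... Args>
struct function_traits<Ret(Args...)> {
    using return_type = Ret;
    using argument_types = std::tuple<Args...>;
    static constexpr std::size_t arity = sizeof...(Args);
};

struct module_t;  // stand-in for the `module` type used in the thread

// The "old syntax" template argument, decomposed at compile time.
using factory_traits = function_traits<module_t *(const char *)>;
```

With something like this, get_function<T> could keep its single function-type parameter and still forward Ret and Args... to the mangling code internally.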

But are you sure you're not going about this the wrong way? I mean, take the function's proper type, then apply name mangling (not yourself - there's an ABI library for that), then look for the symbol.

eyalroz avatar Jun 03 '22 20:06 eyalroz

But are you sure you're not going about this the wrong way? I mean, take the function's proper type, then apply name mangling (not yourself - there's an ABI library for that), then look for the symbol.

This is what I'm doing, but I'm not sure there is an ABI library to mangle names (I'm currently using typeid(T).name() to do the mangling).

There is this abi function to demangle a symbol, but I didn't see anything about re-mangling:

char *demangledName = abi::__cxa_demangle(av[i], NULL, NULL, &status);
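For context, abi::__cxa_demangle comes from <cxxabi.h> on GCC and Clang (Itanium ABI); it returns a malloc-allocated buffer that the caller must free, and reports failure through status. Below is a small wrapper sketch; the demangle helper name is an assumption made for illustration.

```cpp
#include <cxxabi.h>

#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper: demangle an Itanium-ABI symbol name.
// Returns the input unchanged if demangling fails.
inline std::string demangle(const char *mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, void (*)(void *)> buf(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && buf) ? std::string(buf.get()) : std::string(mangled);
}
```

On an Itanium-ABI toolchain, demangle("_ZN6driver5infos7versionE") gives driver::infos::version, which makes this a convenient way to check a hand-built mangled name.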

martin-olivier avatar Jun 03 '22 22:06 martin-olivier

Ah, right, abi:: is just for demangling. typeid(T).name() doesn't need an extra library; but then, it doesn't mangle names in the sense of getting you the symbol name to look for in an object.

eyalroz avatar Jun 04 '22 07:06 eyalroz

Also, this may be relevant for Windows.

eyalroz avatar Jun 04 '22 07:06 eyalroz

typeid(T).name() doesn't need an extra library; but then, it doesn't mangle names in the sense of getting you the symbol name to look for in an object.

You are right. To do so, I wrote the following code, which produces the accurate mangled function symbol name in all situations (except pointers and namespaces) on Unix:

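    // Variadic case: mangle the first type, then recurse on the rest.
    template <typename T, typename U, typename... Args>
    static std::string TemplateMangle()
    {
        return TemplateMangle<T>() + TemplateMangle<U, Args...>();
    }

    // Base case: mangle a single parameter type.
    template <typename T>
    static std::string TemplateMangle()
    {
        // typeid ignores references and top-level cv-qualifiers, so the
        // reference prefix ("R" or "O") and "K" for const are added by hand.
        // On GCC and Clang, name() returns the Itanium encoding of the type.
        std::string t = typeid(T).name();
        if (std::is_lvalue_reference<T>::value) {
            std::string tmp = "R";
            if (std::is_const<typename std::remove_reference<T>::type>::value)
                tmp += 'K';
            t = tmp + t;
        }
        else if (std::is_rvalue_reference<T>::value) {
            std::string tmp = "O";
            if (std::is_const<typename std::remove_reference<T>::type>::value)
                tmp += 'K';
            t = tmp + t;
        }
        return t;
    }

    // Function with parameters: "_Z" + <length><name> + encoded parameter types.
    // ReturnType is unused: the Itanium ABI does not encode the return type
    // of a non-template function in its mangled name.
    template<typename ReturnType, typename Arg1, typename ...Args>
    static std::string mangle_function(const std::string &name) {
        return "_Z" + std::to_string(name.size()) + name + TemplateMangle<Arg1, Args...>();
    }

    // Function without parameters: the empty parameter list is encoded as void.
    template<typename ReturnType>
    static std::string mangle_function(const std::string &name) {
        return "_Z" + std::to_string(name.size()) + name + typeid(void).name();
    }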

martin-olivier avatar Jun 04 '22 09:06 martin-olivier

Let me first note I've asked about this at StackOverflow.

Now, for your implementation.

  • I think all of this code should be made constexpr - since it's all information that we know at compile-time.
  • The T and U in one of your function variants are ambiguous. Give them more specific names?
  • I suggest we don't use std::strings, but rather a string_view (or a char* and size_t pair in C++11) as the target buffer. We have rather expensive string concatenations in our code, that's true - but that only happens when handling errors.
  • ... actually, we may want to have a "poor man's span" structure with just those two fields
  • Same point about the inputs. So, something like:
    template <typename Function>
    mangle_function(dylib::detail_::poor_span<char> mangled_name, dylib::detail_::poor_span<char> function_name)
    
  • TemplateMangle - what exactly does it mangle? It seems like it mangles the name of a type, right? Then better call it mangle_type(). Or perhaps just mangle().
  • Don't use the same string literal in multiple places.
  • You can probably have an outer mangle() function with a single template parameter, like I suggested above - since it can distinguish at compile-time between whether it was asked to mangle a function or a variable, and call inner code - possibly with a different function name - as necessary.
  • Have you checked this against the Itanium ABI document to make sure it's valid? That should also get you going with namespace and pointers.
  • What about mangling a variable?
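
The "poor man's span" suggested in the list above could look roughly like the sketch below. This is only an illustration under assumptions: the poor_span layout and the append helper are hypothetical, not taken from dylib, and the sketch stays within C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical "poor man's span": just a pointer and a length, no ownership.
template <typename T>
struct poor_span {
    T *data;
    std::size_t size;
};

// Copy `text` into `out` starting at `offset` and return the new offset.
// The caller provides the buffer, so nothing is allocated here.
inline std::size_t append(poor_span<char> out, std::size_t offset,
                          poor_span<const char> text) {
    assert(offset + text.size <= out.size);  // caller must size the buffer
    std::memcpy(out.data + offset, text.data, text.size);
    return offset + text.size;
}
```

A mangling routine written this way would build the symbol name piece by piece in a caller-owned buffer instead of concatenating std::string temporaries.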

eyalroz avatar Jun 04 '22 17:06 eyalroz

  • I think all of this code should be made constexpr - since it's all information that we know at compile-time.
  • The T and U in one of your function variants are ambiguous. Give them more specific names?
  • I suggest we don't use std::string's, but rather a string_view (or a char* and size_t pair in C++11) as the target buffer. We have rather expensive string concatenations in our code, that's true - but that only happens when handling errors.
  • ... actually, we may want to have a "poor man's span" structure with just those two fields
  • Same point about the inputs. So, something like:
    template <typename Function>
    mangle_function(dylib::detail_::poor_span<char> mangled_name, dylib::detail_::poor_span<char> function_name)
  • TemplateMangle - what exactly does it mangle? It seems like it mangles the name of a type, right? Then better call it mangle_type(). Or perhaps just mangle().
  • Don't use the same string literal in multiple places.

You're right, but for now I prefer to focus on making the proof of concept work.

Have you checked this against the Itanium ABI document to make sure it's valid? That should also get you going with namespace and pointers.

Yes, I'm using this document to implement the feature.

What about mangling a variable?

The following code mangles namespaced variables on Unix:

class dylib { 
private:
    static std::vector<std::string> string_to_vector(const std::string &str, const char *delimiters) {
        std::vector<std::string> tokens;
        std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        std::string::size_type pos = str.find_first_of(delimiters, lastPos);
        while (std::string::npos != pos || std::string::npos != lastPos) {
            tokens.push_back(str.substr(lastPos, pos - lastPos));
            lastPos = str.find_first_not_of(delimiters, pos);
            pos = str.find_first_of(delimiters, lastPos);
        }
        return tokens;
    }

    static std::string mangle_variable(const std::string &name) {
        if (name.find("::") == std::string::npos)
            return name;
        auto ns_list = string_to_vector(name, "::");
        if (ns_list.size() == 1)
            return ns_list.front();
        std::string mangled = "_ZN";
        for (auto &ns : ns_list)
            mangled += std::to_string(ns.size()) + ns;
        return mangled + 'E';
    }
};

martin-olivier avatar Jun 05 '22 16:06 martin-olivier

I think you may be misusing the delimiters parameter... it takes several chars, each of which is a delimiter.
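
To illustrate: std::string::find_first_of treats its argument as a set of characters, so passing "::" splits on any single ':'. Splitting on the whole "::" token needs std::string::find instead. Here is a sketch, where split_on is a hypothetical helper name and not part of dylib:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split `str` on every occurrence of the whole
// `separator` substring (not on each of its characters).
inline std::vector<std::string> split_on(const std::string &str,
                                         const std::string &separator) {
    assert(!separator.empty());  // an empty separator would never advance
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = str.find(separator, start)) != std::string::npos) {
        tokens.push_back(str.substr(start, pos - start));
        start = pos + separator.size();
    }
    tokens.push_back(str.substr(start));  // trailing piece after the last separator
    return tokens;
}
```

Unlike a find_first_of based split, this leaves a lone ':' inside a token untouched, so "a:b" stays a single token.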

eyalroz avatar Jun 05 '22 17:06 eyalroz

I'm going to release 2.0.0 without this remangling feature, since I don't have much time to work on it at the moment.

martin-olivier avatar Jun 11 '22 15:06 martin-olivier

@martin-olivier : There's always version 3.0...

eyalroz avatar Jun 11 '22 21:06 eyalroz

Good news - here's MSVC mangling code for you:

https://godbolt.org/z/nnW19qzYE

Right now, that code requires C++20, but with a little work you can bring that down to C++11 and integrate it into your own code.

eyalroz avatar Jun 21 '22 11:06 eyalroz

Good news - here's MSVC mangling code for you:

https://godbolt.org/z/nnW19qzYE

Right now, that code requires C++20, but with a little work you can bring that down to C++11 and integrate it into your own code.

Hey, did anyone do that "little work"? I kinda need it to build with some old compilers. :/

ericoporto avatar May 21 '24 09:05 ericoporto