dylib
Support non-extern-C symbols
This is a C++ library for working with shared objects, but it only supports unmangled, C-style functions. That means it doesn't serve its primary function. The library must support any C++ function one can load from a shared object. Naturally, this is ABI-specific, but that's either for the user to configure and build accordingly, or potentially a case for multi-ABI support. The latter is much more complicated, and would be a feature request in itself, but function symbols should definitely be looked up by their mangled name, if they're not extern-C.
Current status
Hello,
I'm currently working on that.
The goal is to add a feature to dylib that makes it possible to load C++ symbols.
Linux and MacOS
Variables
I can now access a mangled variable within a namespace:
dylib lib("lib.so");
auto ver = lib.get_variable<double>("driver::infos::version");
Functions
To mangle functions within a namespace and/or with overloads, I need access to each function's parameter types. But currently, the template argument you need to pass to get_function is the following:
dylib lib("lib.so");
// get_function<T> for T = [module *(const char *)]
auto mod = lib.get_function<module *(const char *)>("driver::factory");
To be able to iterate over variadic template arguments, I temporarily replaced the current syntax with the following one:
// old syntax
// get_function<T>
auto mod = lib.get_function<module *(const char *)>("driver::factory");
// temporary new syntax
// get_function<Ret, Args...>
auto mod = lib.get_function<module *, const char *>("driver::factory");
Do you know if there is a way to "decompose" a function template argument to get its return value as Ret and its arguments as Args... ?
Windows
TODO
Update
Linux and MacOS
Variables
I can now access a mangled variable within a namespace:
dylib lib("lib");
auto ver = lib.get_variable<double>("driver::infos::version");
Functions
I can now access a mangled function within a namespace, with arguments of any type:
dylib lib("lib");
auto mod = lib.get_function<module *, const char *>("driver::factory");
auto set_inst = lib.get_function<void, module &&>("driver::instance::set");
auto print = lib.get_function<void, std::ostream &, const std::string &>("driver::tools::print");
Windows
TODO (Next step)
Question
Do you know if there is a way to "decompose" a function template argument to get its return value as Ret and its arguments as Args... ?
Well, std::result_of for the return type; and you can use this hack for the parameters.
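One common way to do this decomposition without any hack is a partial specialization that pattern-matches the function type. The sketch below is only an illustration: function_traits and driver_module are hypothetical names, not part of dylib.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Primary template, left undefined on purpose: only function types match.
template <typename Function>
struct function_traits;

// Partial specialization that splits Ret(Args...) into its components.
template <typename Ret, typename... Args>
struct function_traits<Ret(Args...)> {
    using return_type = Ret;
    using argument_types = std::tuple<Args...>;
    static constexpr std::size_t arity = sizeof...(Args);
};

struct driver_module;  // hypothetical stand-in for the user's module type

using factory_traits = function_traits<driver_module *(const char *)>;

static_assert(std::is_same<factory_traits::return_type, driver_module *>::value,
              "return type is recovered as Ret");
static_assert(std::is_same<std::tuple_element<0, factory_traits::argument_types>::type,
                           const char *>::value,
              "first parameter is recovered from Args...");
static_assert(factory_traits::arity == 1, "one parameter");
```

With such a trait, get_function could keep the original single-parameter syntax and still forward Ret and Args... to the mangling code internally.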
But are you sure you're not going about this the wrong way? I mean, take the function's proper type, then apply name mangling (not yourself - there's an ABI library for that), then look for the symbol.
This is what I'm doing, but I'm not sure there is an ABI library to mangle names (I'm currently using typeid(T).name() to do the mangling).
There is this ABI function to demangle a symbol, but I didn't see anything about re-mangling:
char *demangledName = abi::__cxa_demangle(av[i], NULL, NULL, &status);
Ah, right, abi:: is just for demangling. typeid(T).name() doesn't need an extra library; but then again, it doesn't mangle names in the sense of getting you the symbol name to look for in an object.
Also, this may be relevant for Windows.
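To make the distinction concrete, here is a small sketch for GCC and Clang (Itanium ABI) only. It shows that typeid(T).name() yields the encoding of a type, while abi::__cxa_demangle turns a full symbol name back into a readable declaration. The helper names type_encoding and demangle are made up for this example.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang only)
#include <cstdlib>    // std::free
#include <memory>
#include <string>
#include <typeinfo>

// Returns the ABI encoding of a type. This is not a symbol name: it has no
// "_Z" prefix and carries no function or variable name.
template <typename T>
std::string type_encoding() {
    return typeid(T).name();
}

// Demangles a symbol name; returns an empty string if it cannot be demangled.
inline std::string demangle(const char *symbol) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, void (*)(void *)> text(
        abi::__cxa_demangle(symbol, nullptr, nullptr, &status), std::free);
    return (status == 0 && text) ? std::string(text.get()) : std::string();
}
```

For instance, type_encoding<const char *>() gives "PKc" with these compilers, whereas the symbol that would have to be looked up for a function driver::factory(const char *) is "_ZN6driver7factoryEPKc".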
typeid(T).name() doesn't need an extra library; but then again, it doesn't mangle names in the sense of getting you the symbol name to look for in an object.
You are right. To work around that, I wrote the following code, which produces the accurate mangled function symbol name in all situations (except pointers and namespaces) on Unix:
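#include <string>
#include <type_traits>
#include <typeinfo>

// Mangle a single parameter type. typeid() drops references and top-level
// cv-qualifiers, so the reference markers are prepended by hand.
template <typename T>
static std::string TemplateMangle()
{
    std::string t = typeid(T).name();
    if (std::is_lvalue_reference<T>::value) {
        std::string tmp = "R";
        if (std::is_const<typename std::remove_reference<T>::type>::value)
            tmp += 'K';
        t = tmp + t;
    }
    else if (std::is_rvalue_reference<T>::value) {
        std::string tmp = "O";
        if (std::is_const<typename std::remove_reference<T>::type>::value)
            tmp += 'K';
        t = tmp + t;
    }
    return t;
}

// Mangle two or more parameter types by concatenating their encodings.
// Declared after the single-type overload so that overload is visible here
// even when these are free functions.
template <typename T, typename U, typename... Args>
static std::string TemplateMangle()
{
    return TemplateMangle<T>() + TemplateMangle<U, Args...>();
}

// Function with at least one parameter: _Z <length> <name> <parameter types>
template<typename ReturnType, typename Arg1, typename ...Args>
static std::string mangle_function(const std::string &name) {
    return "_Z" + std::to_string(name.size()) + name + TemplateMangle<Arg1, Args...>();
}

// Function without parameters: the parameter list is encoded as void ("v")
template<typename ReturnType>
static std::string mangle_function(const std::string &name) {
    return "_Z" + std::to_string(name.size()) + name + typeid(void).name();
}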
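As a rough illustration of how the approach above might be extended to namespaces, the sketch below encodes a qualified name as an Itanium nested name and appends the parameter encodings. It assumes GCC or Clang, and it ignores the ABI's substitution compression (the S_ abbreviations), so it will produce wrong symbols whenever a parameter type repeats or shares a namespace with the function. The helper names are hypothetical, and the return type is left out of the template arguments because it is not part of the symbol of a non-template function.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <typeinfo>

// Encodes "a::b::c" as "N1a1b1cE", and an unqualified "c" as "1c".
inline std::string encode_name(const std::string &qualified) {
    std::string encoded;
    std::size_t begin = 0, parts = 0;
    while (true) {
        const std::size_t end = qualified.find("::", begin);
        const std::string part = qualified.substr(
            begin, end == std::string::npos ? std::string::npos : end - begin);
        encoded += std::to_string(part.size()) + part;
        ++parts;
        if (end == std::string::npos)
            break;
        begin = end + 2;  // skip the "::" separator
    }
    return parts > 1 ? "N" + encoded + "E" : encoded;
}

// Encodes one parameter type, adding the reference markers typeid() drops.
template <typename T>
std::string encode_parameter() {
    using referred = typename std::remove_reference<T>::type;
    std::string prefix;
    if (std::is_lvalue_reference<T>::value)
        prefix = "R";
    else if (std::is_rvalue_reference<T>::value)
        prefix = "O";
    if (!prefix.empty() && std::is_const<referred>::value)
        prefix += 'K';
    return prefix + typeid(referred).name();
}

// Concatenates all parameter encodings; an empty list is encoded as "v".
template <typename... Args>
std::string encode_parameters() {
    // The trailing empty string keeps the array non-empty for zero Args.
    const std::string each[] = {encode_parameter<Args>()..., std::string()};
    std::string all;
    for (const std::string &one : each)
        all += one;
    return all.empty() ? "v" : all;
}

template <typename... Args>
std::string mangle_function(const std::string &qualified) {
    return "_Z" + encode_name(qualified) + encode_parameters<Args...>();
}
```

For example, mangle_function<const char *>("driver::factory") yields "_ZN6driver7factoryEPKc" under these assumptions.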
Let me first note I've asked about this at StackOverflow.
Now, for your implementation.
- I think all of this code should be made constexpr - since it's all information that we know at compile-time.
- The T and U in one of your function variants are ambiguous. Give them more specific names?
- I suggest we don't use std::string's, but rather a string_view (or a char* and size_t pair in C++11) as the target buffer. We have rather expensive string concatenations in our code, that's true - but that only happens when handling errors.
- ... actually, we may want to have a "poor man's span" structure with just those two fields
- Same point about the inputs. So, something like:
template <typename Function> mangle_function(dylib::detail_::poor_span<char> mangled_name, dylib::detail_::poor_span<char> function_name)
- TemplateMangle - what exactly does it mangle? It seems like it mangles the name of a type, right? Then better call it mangle_type(). Or perhaps just mangle().
- Don't use the same string literal in multiple places.
- You can probably have an outer mangle() function with a single template parameter, like I suggested above - since it can distinguish at compile-time between whether it was asked to mangle a function or a variable, and call inner code - possibly with a different function name - as necessary.
- Have you checked this against the Itanium ABI document to make sure it's valid? That should also get you going with namespace and pointers.
- What about mangling a variable?
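The "poor man's span" suggestion could look roughly like the sketch below: a pointer-and-length pair, plus a mangling routine that writes into a caller-provided buffer instead of building std::string objects. Everything here is an assumption made for illustration (the poor_span layout, the buffer_writer helper, and the choice to return 0 when the buffer is too small). It only handles namespaced variables without a leading "::", and it does not write a terminating NUL.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Non-owning view: a pointer and an element count, usable in C++11.
template <typename T>
struct poor_span {
    T *data;
    std::size_t size;
};

// Appends text to a fixed buffer and remembers whether it ever overflowed.
class buffer_writer {
    poor_span<char> out_;
    std::size_t used_ = 0;
    bool ok_ = true;

public:
    explicit buffer_writer(poor_span<char> out) : out_(out) {}

    void put(const char *text, std::size_t length) {
        if (!ok_ || length > out_.size - used_) {
            ok_ = false;
            return;
        }
        std::memcpy(out_.data + used_, text, length);
        used_ += length;
    }

    void put_number(std::size_t value) {
        char digits[24];
        const int length = std::snprintf(digits, sizeof digits, "%zu", value);
        put(digits, static_cast<std::size_t>(length));
    }

    // Number of characters written, or 0 if the buffer was too small.
    std::size_t finish() const { return ok_ ? used_ : 0; }
};

inline bool separator_at(poor_span<const char> name, std::size_t i) {
    return i + 1 < name.size && name.data[i] == ':' && name.data[i + 1] == ':';
}

// Writes the symbol name of a variable: "a::b" becomes "_ZN1a1bE", while an
// unqualified name is copied unchanged.
inline std::size_t mangle_variable(poor_span<char> out, poor_span<const char> name) {
    buffer_writer writer(out);
    bool qualified = false;
    for (std::size_t i = 0; i < name.size && !qualified; ++i)
        qualified = separator_at(name, i);
    if (!qualified) {
        writer.put(name.data, name.size);
        return writer.finish();
    }
    writer.put("_ZN", 3);
    std::size_t begin = 0;
    for (std::size_t i = 0; i <= name.size; ++i) {
        const bool at_separator = separator_at(name, i);
        if (i == name.size || at_separator) {
            writer.put_number(i - begin);
            writer.put(name.data + begin, i - begin);
            begin = i + 2;
            if (at_separator)
                ++i;  // skip the second ':'
        }
    }
    writer.put("E", 1);
    return writer.finish();
}
```

A caller would pass a stack buffer, for example char buffer[64], and use the returned length to build the string handed to the symbol lookup.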
I think all of this code should be made constexpr - since it's all information that we know at compile-time. The T and U in one of your function variants are ambiguous. Give them more specific names? I suggest we don't use std::string's, but rather a string_view (or a char* and size_t pair in C++11) as the target buffer. We have rather expensive string concatenations in our code, that's true - but that only happens when handling errors. ... actually, we may want to have a "poor man's span" structure with just those two fields. Same point about the inputs. So, something like: template <typename Function> mangle_function(dylib::detail_::poor_span<char> mangled_name, dylib::detail_::poor_span<char> function_name). TemplateMangle - what exactly does it mangle? It seems like it mangles the name of a type, right? Then better call it mangle_type(). Or perhaps just mangle(). Don't use the same string literal in multiple places.
You're right, but currently I prefer to focus on making the proof of concept work
Have you checked this against the Itanium ABI document to make sure it's valid? That should also get you going with namespace and pointers.
Yes, I'm using this document to implement the feature.
What about mangling a variable?
The following code mangles namespaced variables on Unix:
#include <string>
#include <vector>

class dylib {
private:
    // Split str into tokens separated by any of the characters in delimiters
    static std::vector<std::string> string_to_vector(const std::string &str, const char *delimiters) {
        std::vector<std::string> tokens;
        std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        std::string::size_type pos = str.find_first_of(delimiters, lastPos);
        while (std::string::npos != pos || std::string::npos != lastPos) {
            tokens.push_back(str.substr(lastPos, pos - lastPos));
            lastPos = str.find_first_not_of(delimiters, pos);
            pos = str.find_first_of(delimiters, lastPos);
        }
        return tokens;
    }

    // "a::b::c" becomes "_ZN1a1b1cE"; an unqualified name is returned unchanged
    static std::string mangle_variable(const std::string &name) {
        if (name.find("::") == std::string::npos)
            return name;
        auto ns_list = string_to_vector(name, "::");
        if (ns_list.size() == 1)
            return ns_list.front();
        std::string mangled = "_ZN";
        for (auto &ns : ns_list)
            mangled += std::to_string(ns.size()) + ns;
        return mangled + 'E';
    }
};
I think you may be misusing the delimiters parameter... it takes several chars, each of which is a delimiter.
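To illustrate that remark: find_first_of treats every character of its argument as a separate delimiter, so passing "::" behaves the same as passing ":". A splitter that matches the separator as a whole could be sketched as follows. The name split_on and its skipping of empty tokens are choices made for this example, not something taken from the posted code.

```cpp
#include <string>
#include <vector>

// Splits text on every occurrence of the complete separator string.
// Empty tokens are skipped. An empty separator yields the text unsplit.
inline std::vector<std::string> split_on(const std::string &text,
                                         const std::string &separator) {
    std::vector<std::string> tokens;
    if (separator.empty()) {
        if (!text.empty())
            tokens.push_back(text);
        return tokens;
    }
    std::string::size_type begin = 0;
    while (true) {
        const std::string::size_type end = text.find(separator, begin);
        const std::string token = text.substr(
            begin, end == std::string::npos ? std::string::npos : end - begin);
        if (!token.empty())
            tokens.push_back(token);
        if (end == std::string::npos)
            break;
        begin = end + separator.size();
    }
    return tokens;
}
```

The difference only shows up for inputs containing a lone ':': split_on("a:b::c", "::") returns the two tokens "a:b" and "c", whereas a find_first_of-based splitter returns three.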
I'm going to release 2.0.0 without this remangling feature, since I don't have much time to work on it at the moment.
@martin-olivier : There's always version 3.0...
Here are some outstanding SO questions about doing this:
Good news - here's MSVC mangling code for you:
https://godbolt.org/z/nnW19qzYE
Right now, that code requires C++20, but with a little work you can bring that down to C++11 and integrate it into your own code.
Hey, did anyone do that "little work"? I kinda need it to build with some old compilers. :/