Generating Rust bindings which include a type named String doesn't compile
Describe the bug When trying to generate bindings to a type with the name String I get the error
Error: × cxx couldn't handle our generated bindings - could be a bug in autocxx:
│ unsupported unique_ptr target type
In my particular case the String type is being incidentally generated because it is being returned from another function. But it will reproduce when directly generating the type named String as well.
To Reproduce
I created a minimized example which reproduces this behaviour with a header named test.h
#pragma once
#include <memory>
#include <stdint.h>
namespace tester {
struct String {
char *data;
uint32_t len;
};
}
class Indirect {
public:
tester::String data;
std::unique_ptr<tester::String> datum() {
return std::make_unique<tester::String>(data);
}
};
use autocxx::prelude::*;
include_cpp! {
#include "test.h"
safety!(unsafe)
generate!("Indirect")
}
fn main() {}
Additional notes
I did some initial digging and found that the cxxbridge that autocxx is generating contains
impl UniquePtr<String> {}
impl SharedPtr<String> {}
impl WeakPtr<String> {}
impl CxxVector<String> {}
without the corresponding type String = super::bindgen::root::tester::String; declaration (which appears later). If these lines are removed then compilation will fail with cxx0.h containing a function conversion which attempts to return a rust::String instead of tester::String.
I am happy to continue to explore/fix this problem, but I would like some guidance on where I should look to find the String -> rust::String conversion machinery.
Please file a test case PR following these instructions.
Unfortunately all the stuff around names in autocxx is very fragile so this might not be an easy fix.
Added a test in https://github.com/google/autocxx/pull/1402
I was able to create a simpler reproducible test case, I believe it's just name conflicts with the default prelude, using a different struct name other than String should generate bindings without problems
Is there a workaround for this if it's not possible to rename the String class we're binding to?
This is severely blocking progress for me and I haven't found a great workaround. I was able to write manual bindings for the String class in question, but I'm still having issues binding to classes which take or return Strings as arguments or return types.
I'm wondering if making it possible to specify an alternate name in the generate! macro call might be a workaround for cases like this, e.g. generate!("myns::String", "MyString"). Then, in the generated cxx::bridge, the original name could be specified with #[cxx_name = "String"].
I'd accept a PR to fix this. To be honest just having a PR which I can use to reproduce the problem in a test case might be enough for me to take a look. #1402 is the right sort of thing but I can't merge it due to the contributor not signing the (admittedly annoying) Google CLA.
There's a 50/50 chance that this is an easy fix, versus a bit of a nightmare which requires revisiting all the fragile naming stuff that I've been failing to rework for years.
I've actually just come up with a decent workaround by combining the following: manual bindings to my myns::String class using cxx, blocking autocxx from generating bindings with block!("myns::String"), and for some reason I needed to manually implement cxx::ExternType for one of my other autocxx-generated classes.
Now I'm able to generate other classes which have a few methods that accept/return myns::String. The other methods which accept/return "innocent" types are autogenerated properly, and I can manually bind to the myns::String-related methods using the cpp crate.
So, this is no longer urgent for me, although it would be nice it were fixed.
P.s. thank you Adrian for all of your hard work on this crate, in most cases it works spectacularly well and has saved me a ton of work!
Thanks - I'm glad you found a workaround.
As I mentioned, the naming stuff is annoyingly fragile, and unfortunately I think providing custom names for stuff falls into the category of "stuff I'm not quite brave enough to touch" until more work is done to clean it up - specifically a bunch of newtype wrappers for the different kinds of name we hold, which is roughly the thing tracked in #520.
I've now dug into #520 enough to refresh my memory of the naming stuff, and what it would take to fix this.
I'm not likely to tackle this myself, but here's what it would take. Each of these steps can be a separate PR. This would improve the maintainability of the codebase so I'm in favour of it; it's just not near the top of my priority list for the limited time I can personally spend on autocxx.
- A lot of ground work I already did in a couple of PRs on #520.
- Each
Apiinside the autocxx engine has aQualifiedNameassociated. This is used for two things: first, to contain the actual name of the struct/function/etc. which is passed to cxx, and used in generated Rust code, and used in generated C++ code. Secondly, it's used to indicate dependency relationships between APIs (e.g. function X has a parameter of type Y). We need to split those usages, because fixing this bug is going to require renaming structs. The first step is to cease to use the name when doing codegen for functions. Any usages of the API name forApi::Functionwithincodegen_rsorcodegen_cppneed to be replaced with new fields onFnAnalysis. - Similarly, we shouldn't rely on the existing name of any other type of API during codegen. Replace the QualifiedName with new fields which indicate the name to be used during codegen. At this point, take the opportunity to introduce new newtype wrappers for the different kinds of name; e.g. a
CxxBridgeNametype for any name that is to be used in thecxx::bridge. - Retrofit those new newtype wrappers into the
FnAnalysis. - We can now replace the
QualifiedNamewith a numeric ID, to track the dependency relationships between APIs. - We can now set about renaming APIs to avoid conflicts. There's already a lot of code like this for functions - due to overloads. The function analysis stage already separately tracks the
cxx::bridgename relative to the original C++ name and Rust name. We need to extract all that so that it applies to all types of API. We should ideally do this on the creation of anyCxxBridgeNameso that the existence of such a name guarantees uniqueness. Most APIs will need a separateCxxBridgeName,RustNameandCppName. It might be desirable to lump them together into a newNameInfostructure which can be repeated across all APIs. This is a little bit like the oldQualifiedName. As far as I can tell, onlyCxxBridgeNamewill need to be uniquified, because that's the only space of names which is flat. Rust names and C++ names are already in hierarchic structures of namespaces and we can reasonably assume that the C++ code, and bindgen, have respectively ensured uniqueness there. - Creation of a
CxxBridgeNameshould also pass the checks invalidate_ident_ok_for_cxx, which should be expanded to reject things likeString. - We may need to introduce a new analysis phase to do some renaming. For example if a function B depends on a parameter of type A, and we've renamed A, we will need to rewrite the signature of function B within the
cxx::bridge. However I'm moderately hopeful that all uniquifiication ofCxxBridgeNameswill happen early enough that this will be taken care of by small changes withinTypeConverter.
This is all pretty major surgery but I have a reasonable amount of faith that the hundreds of integration tests will catch most of the corner cases, so it should be achievable.
Meanwhile it's quite possible that I'll raise a PR to simply reject APIs called String etc.
The PR to reject these names didn't turn out simple. More significant progress on #520 is required here.