cxx icon indicating copy to clipboard operation
cxx copied to clipboard

Add string_view API for c++17

Open roligugus opened this issue 5 years ago • 5 comments

Follow-up of WIP: https://github.com/dtolnay/cxx/pull/80 to add std::string_view support.

std::string_view is enabled if cxx is compiled with c++17 or later. One will need to set the c++17 (or later) feature on the cxx crate for cargo-based builds. Also added a "CXX Features" section to the cargo build info if that helps.

Notes:

  • This only replaces the std::string ctor, not the ptr and ptr/len ctors as in the WIP
    • We could replace the latter ctors as well, but it can break client code. It would require changes for rust::str and rust::String parameters and return values in the client code. Not sure you'd want to break the API.
  • Tests are only enabled for c++17 (or later)
    • One needs to set the c++17 (or later) feature on cxx in tests/ffi/Cargo.toml at this time. Not ideal, but a start.
  • Added a "CXX Features" section to the documentation if that helps

roligugus avatar Nov 03 '20 19:11 roligugus

That will also provide a better place to document the features situation -- I think it's not good for a readme to go into this level of detail.

No pressure to look at this. Just had some downtime at work to quickly run after this.

Rebased changes and I've moved the "features" documentation into the book/build/cargo.md. Not sure if you want to have it there or somewhere else. Just a placeholder, really.

roligugus avatar Nov 24 '20 20:11 roligugus

I am concerned that in this implementation, simply adding a new dependency to a Cargo build graph can cause existing working code to fail to link. If the newly added crates somewhere contain a dependency on "cxx/c++17", that leads to removing various symbols from the cxx rlib without doing anything to eliminate use of those symbols in elsewhere in the build graph.

Yes, that is a downside of tying that functionality to a C++ standard switch and automatically enabling it in the header if it's available while at the same time removing existing functionality. The nice thing is that you get link errors due to missing symbols that your particular client compilation requires.

UPDATE: Removed paragraph as I think I understood wrongly what you meant. The other stuff here still applies, imho.

The current approach would not work for these cases as it breaks the cxx lib ABI as well as the API.

In order to always be able to handle this case, one must never remove current functionality to be backwards compatible from an ABI point-of-view. One could be somewhat flexible with APIs depending on the change - more on that later. Yeah, there's also the "let's version APIs and lib" approach that one would do in C++/C land (not sure about rust), but I don't think that worth considering as that's always a pain and I typically like avoiding that like the pest.

Possible approach to follow...

roligugus avatar Nov 26 '20 23:11 roligugus

Relatedly it seems like this implementation violates ODR which is UB (https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule), because the class definitions of Str and String don't necessarily "consists of the same sequence of tokens" each place they appear.

Don't think this is an ODR problem in this instance. The cxx rlib itself is fine if compiled e.g. with C++17. No ODR there. Your client code is also fine even though your included cxx header might declare different things. No ODR there. However, you will run into link errors when trying to link the cxx rlib into your client lib/exec conveniently avoiding ODR.

It only would become an ODR issue if you were compiling cxx and your client TUs directly into the same executable or lib without going through a separate cxx lib. However, in that case you will be using your own build system and you better be compiling everything linked together into the same lib/exec using the same compilation settings.

roligugus avatar Nov 26 '20 23:11 roligugus

Thoughts? It seems like more design work is still required on how to integrate this from the build side.

Agree. So what can we do to avoid that?

The problem is that this PR breaks the API and the ABI by adding a new string_view ctor API (with implementation) and removing another string ctor API (with implementation). In addition it does this silently depending on how the client using cxx is compiled. Such an API and ABI change is only workable if you control all code and are willing to set the same compilation options used to toggle APIs in cxx.

A way of handling that is not to automatically enable the string_view APIs Using a #define to enable the string_view API on demand while using it in your client code has the same issue as automatically enabling it upon a certain c++ standard. Different clients might have different defines resulting in the same issues we try to avoid.

The only way to handle that would be to do something similar to what e.g. autotools do. The idea is to bake the functionality into cxx. E.g. have the cxx crate write a config.h equivalent that #defines what features are enabled during the cxx lib build and then have cxx.h include that generated header. This way, the functionality that cxx supports (defined by its features) is defined by cxx itself and not by the dependencies using it.

However, you still would run into the same issue for breaking changes where client A needs cxx features X and client B needs cxx feature Y. So that would not be solved as the cxx lib compilation would define the supported functionality depending on the features defined.

A better way to fix that for this change is to "only add new APIs". Essentially chosing the approach of not breaking the cxx rlib ABI. APIs still can be changed in ways that don't break the existing ABI. In this case, we can add the string_view APIs, while keeping the string APIs around.

Since we're currently keeping the char*-parameter constructors, we should be able to simply add the string_view-parameter constructor a la:

#ifdef CXXBRIDGE_HAS_STRING_VIEW
  String(std::string_view);
#endif
  String(const std::string &);
  String(const char *);
  String(const char *, size_t);

Ditto with the conversion operator.

This will keep the ABI backwards-compatible, i.e. client code compiled with e.g. c++14 will be able to compile and link against a cxx rlib compiled with c++17. This allows to compile the cxx lib with whatever highest c++ standard required to support both c++14 (using string ctor) clients and c++17 clients (using string_view ctor).

We can go further than that. By inlining the new string_view APIs in the cxx.h header, we can even support the "other direction", i.e. C++17 clients using the string_view APIs compiled against a cxx lib compiled with c++14. The trick is in the inlined "C++17 string_view API" code that essentially gets injected into the client compilation via cxx.h due to inlining and internally only calling existing C++14/11 cxx APIs.

We can keep the automatic enabling of the string_view ctor and conversion operators to support string_view APIs.

Sounds like a better approach? I can update this PR to give a better picture.

roligugus avatar Nov 27 '20 00:11 roligugus

Hi there! I went looking for this feature earlier today, seems like it's stalled out. Is there any interest in revisiting this?

hallfox avatar Jul 08 '23 02:07 hallfox