encoding_c
encoding_c copied to clipboard
The define ENCODING_RS_ENCODING is unusual way of wrapping
Noticed that this library does some tricks that are unidiomatic for C++. I'll post a minimal example.
// library.h
#ifdef __cplusplus
#define ENCODING_RS_ENCODING CppEncoding
class CppEncoding {
//no data members
public:
CppEncoding() = delete; // this object can not be created by value
};
#else
#define ENCODING_RS_ENCODING struct CEncoding
struct CEncoding;
#endif
ENCODING_RS_ENCODING * library_function(const char * name);
The effect is that if library_function()
is called from C code it returns pointer to CEncoding
, and from C++ code it returns pointer to CppEncoding
. This is unusual and it gets very close to the land of undefined behaviour (breaking strict aliasing), although probably is not as long as you don't dereference those pointers.
The greater concern to me is that the default C++ wrapper exposes pointer semantics, and not value semantics. The way I would write the C++ wrapper is the usual way by wrapping the (pointer to) C-struct inside the C++ class.
// C part
struct CEncoding;
struct CEncoding * library_function(const char * name);
// C++ part
class CppEncoding {
struct CEncoding * enc = nullptr;
public:
CppEncoding() = default;
CppEncoding(const char * name) { ... }
};
Why was this done the way it is? Maybe to avoid name clashes? Would you consider to update to the idiomatic C++?
Yeah, this gets messy if the same app 1) uses it both from C and C++ and 2) it passes what's returned from Rust to C further from C to C++.
although probably is not as long as you don't dereference those pointers
The dereferences happen on the Rust side, so the dereferencing isn't a problem.
The greater concern to me is that the default C++ wrapper exposes pointer semantics, and not value semantics.
Exposing value semantics without actually wrapping a pointer would require passing size and alignment information from the Rust compiler to the C++ compiler, which would mean using a nightly-only Rust feature and parsing its output in the C++ build system.
Exposing value semantics in the way you suggest seems like making CppEncoding
a special-purpose unique_ptr
for CEncoding
.
Why was this done the way it is?
It was done this way to allow the pointer to be a pointer to a namespaced thing in C++ in a C++ codebase 1) that doesn't care about also using the raw C API directly and 2) doesn't treat heap-allocated objects pointed to by unique pointer as unidiomatic and to allow the pointer to be directly a pointer to the underlying Rust object.
I'd like to understand more about your use case that ends up wishing to call the API both via the raw C API and the C++ wrapper.
Exposing value semantics without actually wrapping a pointer would require passing size and alignment information from the Rust compiler to the C++ compiler, which would mean using a nightly-only Rust feature and parsing its output in the C++ build system.
I understand, but I don't require that.
Exposing value semantics in the way you suggest seems like making
CppEncoding
a special-purposeunique_ptr
forCEncoding
.
You may say that, it is a good point of view, but I would say that that CppEncoding is a proper C++ class that follows the RAII pattern, like any other class in the C++ standard library, e.g. std::string
and std::vector
. Such classes are expected by C++ developers. In my viewpoint I wouldn't mention unique_ptr
. Instead I'm mentioning RAII because that exists before unique_ptr
exists. Thus for me vector
is a better example than unique_ptr
.
Now, the struct/class Encoding does not need RAII because it doesn't need to be destroyed (from C/C++ side), but Encoder
and Decoder
need destroying. I'll use Decoder
as my example.
To achieve RAII with the Decoder
you jump through hoops and do confusing acrobatics. You have:
- Deleted constructor of Decoder
- Empty destructor of Decoder
- Overloaded operator delete where you do the destruction!!!
- Force the client to use the Decoder pointer via std::unique_ptr.
That way, you still can circumvent RAII by calling unique_ptr::release()
.
Oops. Sorry, my previous reply was bogus. Let's try again. In my previous reply, I what thinking of Encoder
and Decoder
, which are returned by value in Rust and are inside a unique_ptr
in C++.
The reason why Encoding
is behind a plain pointer is that it is statically allocated and must not be copied or destroyed, but I still want it to act is if it was a C++ class.
For Decoder
/Encoder
, the key point is that unless the C++ build system obtains size and alignment information from the Rust compiler such that C++ code actually ends up moving Rust-initialized Decoder
/Encoder
by value, Decoder
and Encoder
need to be allocated on the heap on the Rust side in order not to require size/alignment information on the C++ side.
Once they are heap-allocated, wrapping them in C++ such that the wrappers would be smart pointers in disguise would be a lie. Letting these look like heap-allocated objects managed by std::unique_ptr
is honest about the point of these being heap-allocated.
That way, you still can circumvent RAII by calling
unique_ptr::release()
.
In C++, you can always circumvent things if you try hard enough.