autocxx
autocxx copied to clipboard
Improve 'unsafe' for optimal code review
Versions of cxx prior to 1.0 and versions of autocxx prior to 0.5 do not require an unsafe
keyword. That policy changes with 1.0 and 0.5 respectively, such that at least one unsafe
keyword is required to indicate that all C++ interop is inherently unsafe. It is. Mistakes in C++ can and will result in UB elsewhere in Rust code.
I've not yet made an autocxx 0.5 release with this policy change, because I want to achieve more goals with respect to the unsafe
keyword. These goals only make sense in the context of a significant existing C++ codebase which aims to adopt small amounts of Rust.
Goals
-
Make
unsafe
Rust code scarier than FFI. When code reviewing a Rust change, reviewers should be able to distinguishunsafe
pure-Rust code (which is a truly special situation and requires the highest level of code review scrutiny) from function calls into existing C++ (which isunsafe
from a UB perspective, but pragmatically requires only similar code review to existing C++ code changes). Specifically, I'm very afraid that "real"unsafe
blocks - which will likely be extremely rare - will be overlooked amongst the thousands of C++ FFIunsafe
blocks.unsafe
will lose all meaning at code review time. We can't have that. - (related) Allow lints for
unsafe
which don't trigger on FFI. For the same reason. Maybe some organizations want allunsafe
blocks to get additional sign-off from two senior engineers, but they want to allow free use of existing C++ APIs without those kinds of costs. At the very least, these policies would (rightly) discourage use ofunsafe
but we still need to allow C++ FFI in existing codebases. -
Distinguish scarier C++ constructs. If a
std::unique_ptr<T>
passes from C++ to Rust, it's theoretically possible that another pointer exists in C++ and therefore there could be concurrent mutation, race conditions, buffer overflows, etc. But if astd::shared_ptr<T>
passes from C++ to Rust, it's very likely that these mistakes have been made. It feels that pragmatically some C++ FFI constructs are much scarier than others.
Design
I propose:
- Like
autocxx
, theunsafe
keyword becomes optional, but if it's omitted then all the actual functions becomeunsafe
. - The
unsafe
keyword currently required by (head)autocxx
can optionally instead be spelledunsafe_ffi
, with no functional difference. The intention here is that, at code review, if a reviewer finds a genuineunsafe
keyword then they give deep scrutiny (and maybe the committer needs to ask for special approval). If they encounterunsafe_ffi
then they know that this is "only" C++ FFI, and whilst it absolutely could cause action-at-a-distance UB anywhere in the Rust code, it's no scarier than the existing C++ code. - The
unsafe
orunsafe_ffi
keywords can optionally take a policy which state that some function calls remainunsafe
. Users might choose between these policies:- All C++ FFI calls are unsafe. (Equivalent to omitting the
unsafe
/unsafe_ffi
keyword) - All C++ FFI calls are unsafe, except those passing only primitives (chars, ints etc.)
- All C++ FFI calls are unsafe, except those passing only primitives and
UniquePtr
. (And maybe other types which typically imply sole ownership of a C++ thing). - No C++ FFI calls are unsafe (the current behavior).
- All C++ FFI calls are unsafe. (Equivalent to omitting the
Syntax
I'm not sure how best syntactically to specify this policy. There are not really any good options:
unsafe(primitives,UniquePtr)
unsafe=primitives,UniquePtr
etc. Also, the specified list is probably those things which are deemed safe rather than unsafe, so it's all very confusing. Thoughts welcome.
Timing
I plan to achieve step 1 and 2 before releasing 0.5. This issue will remain open for the more difficult job of figuring out step 3.
WIP here: https://github.com/google/autocxx/compare/unsafe_ffi
It isn't any good right now because the macro_rules!
wrapper can't cope with unsafe
being optional. I need to work on that.
I was a bit overambitious even trying to achieve point 1 of the design.
macro_rules!
don't implement lookahead, so Rust can't work out whether unsafe
is a keyword or the beginning of a directive!(...)
.
So the PR above instead requires a safety!(...)
directive which can contain the unsafe
keyword. I'm not entirely happy about this because it's nastily verbose. I will think about it...