autocxx Improve 'unsafe' for optimal code review

Versions of cxx prior to 1.0 and versions of autocxx prior to 0.5 do not require an unsafe keyword. That policy changes with 1.0 and 0.5 respectively, such that at least one unsafe keyword is required to indicate that all C++ interop is inherently unsafe. It is. Mistakes in C++ can and will result in UB elsewhere in Rust code.

I've not yet made an autocxx 0.5 release with this policy change, because I want to achieve more goals with respect to the unsafe keyword. These goals only make sense in the context of a significant existing C++ codebase which aims to adopt small amounts of Rust.

Goals

Make unsafe Rust code scarier than FFI. When code reviewing a Rust change, reviewers should be able to distinguish unsafe pure-Rust code (which is a truly special situation and requires the highest level of code review scrutiny) from function calls into existing C++ (which is unsafe from a UB perspective, but pragmatically requires only similar code review to existing C++ code changes). Specifically, I'm very afraid that "real" unsafe blocks - which will likely be extremely rare - will be overlooked amongst the thousands of C++ FFI unsafe blocks. unsafe will lose all meaning at code review time. We can't have that.
(related) Allow lints for unsafe which don't trigger on FFI. For the same reason. Maybe some organizations want all unsafe blocks to get additional sign-off from two senior engineers, but they want to allow free use of existing C++ APIs without those kinds of costs. At the very least, these policies would (rightly) discourage use of unsafe but we still need to allow C++ FFI in existing codebases.
Distinguish scarier C++ constructs. If a std::unique_ptr<T> passes from C++ to Rust, it's theoretically possible that another pointer exists in C++ and therefore there could be concurrent mutation, race conditions, buffer overflows, etc. But if a std::shared_ptr<T> passes from C++ to Rust, it's very likely that these mistakes have been made. It feels that pragmatically some C++ FFI constructs are much scarier than others.

Design

I propose:

Like autocxx, the unsafe keyword becomes optional, but if it's omitted then all the actual functions become unsafe.
The unsafe keyword currently required by (head) autocxx can optionally instead be spelled unsafe_ffi, with no functional difference. The intention here is that, at code review, if a reviewer finds a genuine unsafe keyword then they give deep scrutiny (and maybe the committer needs to ask for special approval). If they encounter unsafe_ffi then they know that this is "only" C++ FFI, and whilst it absolutely could cause action-at-a-distance UB anywhere in the Rust code, it's no scarier than the existing C++ code.
The unsafe or unsafe_ffi keywords can optionally take a policy which state that some function calls remain unsafe. Users might choose between these policies:
- All C++ FFI calls are unsafe. (Equivalent to omitting the unsafe/unsafe_ffi keyword)
- All C++ FFI calls are unsafe, except those passing only primitives (chars, ints etc.)
- All C++ FFI calls are unsafe, except those passing only primitives and UniquePtr. (And maybe other types which typically imply sole ownership of a C++ thing).
- No C++ FFI calls are unsafe (the current behavior).

Syntax

I'm not sure how best syntactically to specify this policy. There are not really any good options:

unsafe(primitives,UniquePtr)

unsafe=primitives,UniquePtr

etc. Also, the specified list is probably those things which are deemed safe rather than unsafe, so it's all very confusing. Thoughts welcome.

Timing

I plan to achieve step 1 and 2 before releasing 0.5. This issue will remain open for the more difficult job of figuring out step 3.

Jan 05 '21 00:01 adetaylor

WIP here: https://github.com/google/autocxx/compare/unsafe_ffi

It isn't any good right now because the macro_rules! wrapper can't cope with unsafe being optional. I need to work on that.

Jan 05 '21 01:01 adetaylor

I was a bit overambitious even trying to achieve point 1 of the design.

macro_rules! don't implement lookahead, so Rust can't work out whether unsafe is a keyword or the beginning of a directive!(...).

So the PR above instead requires a safety!(...) directive which can contain the unsafe keyword. I'm not entirely happy about this because it's nastily verbose. I will think about it...

Jan 05 '21 16:01 adetaylor

autocxx autocxx copied to clipboard

Improve 'unsafe' for optimal code review

Goals

Design

Syntax

Timing

autocxx
autocxx copied to clipboard