libffi-rs icon indicating copy to clipboard operation
libffi-rs copied to clipboard

`high` interface soundness

Open Monadic-Cat opened this issue 3 years ago • 19 comments

These are the soundness issues I'm aware of in the high interface:

  • [ ] It erases all auto trait information attached to closure captures (read: Send and Sync)
  • [x] It erases mutability required to call closures (read: ClosureMutN can be called by shared reference)

There may be more that I'm not aware of, though the pattern seems to be that the high interface erases too much information to be able to effectively lean on the compiler checking things for us.

I'd like to work on fixing these, though I'm having trouble forming a solid idea of how to fix the auto trait problem.

Monadic-Cat avatar Nov 07 '21 21:11 Monadic-Cat

@xacrimon I received your Email about the soundness issue, but hadn't had time to reply yet. Since this issue exists here now, let's keep the conversation here. That way the same things won't be discussed in two different places :smile:

@Monadic-Cat You're right the API has some soundness issues. We had them in the past as well (just noticed you also reported that one).

The challenge here is that I don't use the high or medium level APIs myself, nor do I know how many others use them. Personally I really want to get rid of them, as maintaining the low-level API is hard enough already. Now the low-level API might have its own issues, but at least the code is largely unsafe, so it's to be expected.

Which brings me to the following: I'm not sure how to fix this in the high-level API. I also don't have the time for this in the coming weeks unfortunately. So if anybody can come up with any fixes that would be great. With that said, if this is going to require extensive changes I'd much rather mark the high and mid APIs as deprecated and remove them entirely in a new major version.

yorickpeterse avatar Nov 07 '21 23:11 yorickpeterse

To add to that: my rough long term idea would be to nuke the high/mid APIs, make "low" the only API, then gradually try to make the APIs safe where it makes sense. 100% safety probably never will be achieved due to how libffi works, but there are probably places where we can make improvements.

yorickpeterse avatar Nov 07 '21 23:11 yorickpeterse

@ry Which API are you folks using for Deno? If this isn't the low-level API I'd like to know more about this, before I toss away something Deno depends on :smiley:

@timfish Which API are you using? I think you mentioned using the low-level API in the past, but I might be mixing up GitHub users.

yorickpeterse avatar Nov 07 '21 23:11 yorickpeterse

I used to use libffi::high but not now!

timfish avatar Nov 08 '21 11:11 timfish

@YorickPeterse Great! When we do land fixes, we should probably yank the affected versions as per crates.io recommendations. I cannot get fixes done at the moment however. I'll go file an advisory for this.

xacrimon avatar Nov 08 '21 15:11 xacrimon

A quick search of GitHub shows these public usages of libffi::high: https://github.com/manyuanrong/deno-ffi https://github.com/sisungo/euola-vm https://github.com/mofeing/ossia-rs https://github.com/tungli/lsode-rust

And these use libffi::middle: https://github.com/rsmpi/rsmpi https://github.com/denoland/deno https://github.com/RustPython/RustPython https://github.com/rustrat/rustrat

timfish avatar Nov 08 '21 15:11 timfish

@YorickPeterse Great! When we do land fixes, we should probably yank the affected versions as per crates.io recommendations. I cannot get fixes done at the moment however. I'll go file an advisory for this.

@xacrimon I won't be yanking anything, because:

  1. It may break builds for people using said versions
  2. We'd basically have to yank everything, since all versions are affected

Given the risk factor being small, it simply isn't worth it.

yorickpeterse avatar Nov 08 '21 19:11 yorickpeterse

@YorickPeterse Fair, your call. That’s said yanking shouldn’t break existing builds.

xacrimon avatar Nov 08 '21 19:11 xacrimon

@YorickPeterse Fair, your call. That’s said yanking shouldn’t break existing builds.

That's only the case for projects keeping their Cargo.lock under version control. Libary projects (e.g. like libffi-rs itself) typically don't track Cargo.lock, and will likely start running into trouble the moment you yank a crate.

yorickpeterse avatar Nov 09 '21 01:11 yorickpeterse

You could release a new semver-compatible release for all the to-be-yanked / unsound releases out there, with a #![deprecated…] annotation on it nudging users to upgrade, which, coupled with the cargo audit-visible advisory, would give users a loud enough warning and time for them to upgrade (only those without cargo-audit and with a Cargo.lock would be oblivious to this, but then those wouldn't be affected by the yanking). Then, after a few months, you could finally yank all of them.

danielhenrymantilla avatar Nov 09 '21 10:11 danielhenrymantilla

As a first step, may I suggest documenting these soundness problems in a prominent place?

tov avatar Dec 01 '21 22:12 tov

Also, sorry about the soundness problems. Eek!

tov avatar Dec 01 '21 22:12 tov

From what I can tell, the ClosureMutN problem can be solved by making ClosureMutN::code_ptr() take &mut self.

The Send/Sync problem looks harder, but I have a few ideas.

tov avatar Dec 03 '21 18:12 tov

Actually, I’m a bit confused by

It erases all auto trait information attached to closure captures (read: Send and Sync)

because currently the Closure…N types are !Send and !Sync. Unless I misunderstand, this means that it isn’t a soundness problem but expressiveness problem: We can’t make Send and/or Sync closures. Am I missing something?

Update: Is this about the Send/Syncness of FnPtrN, not of closures?

tov avatar Dec 03 '21 18:12 tov

Yep, FnPtrN is not !Sync, and yet one can get a &FnPtrN from a &ClosureN, which seems fishy / likely UB (unless some external synchronization primitive comes into play: I haven't looked at the implementation, just the signatures):

impl Fn() + !Sync // parallel calls may cause UB
  -> Closure0
  --take a ref--> &Closure0
  --.code_ptr()--> &FnPtr0
  --ref can be sent across threads--> call it in parallel

And similarly, for an impl FnMut(), the & -> &FnPtr3 which is callable is the part allowing concurrent calls to something that only promised to be soundly callable in a sequential manner.

danielhenrymantilla avatar Dec 04 '21 00:12 danielhenrymantilla

Yep, FnPtrN is not !Sync, and yet one can get a &FnPtrN from a &ClosureN, which seems fishy / likely UB

I’m wondering whether the auto traits can be threaded through from the original Fn* to the FnPtrN as a phantom parameter.

unless some external synchronization primitive comes into play: I haven't looked at the implementation, just the signatures

No external synchronization afaik.

And similarly, for an impl FnMut(), the & -> &FnPtr3 which is callable is the part allowing concurrent calls to something that only promised to be soundly callable in a sequential manner.

I believe that changing that code_ptr() method to take &mut self will solve this. Am I missing something?

tov avatar Dec 05 '21 20:12 tov

Regarding threading the auto-traits, feel free to look at https://docs.rs/stackbox/latest/src/stackbox/dyn_traits/any.rs.html#31-40 for a not-so-far example;

Once that's done, the auto-trait issue will be fixed, and indeed if code_ptr() takes a &mut it would then be fine. Was there something else, @Monadic-Cat?

danielhenrymantilla avatar Dec 06 '21 00:12 danielhenrymantilla

Once that's done, the auto-trait issue will be fixed, and indeed if code_ptr() takes a &mut it would then be fine. Was there something else, @Monadic-Cat?

That's all I'm aware of, but I don't have high confidence that that's all. (I'm the one who missed these problems just long enough to not fix them in #27, after all.)

Monadic-Cat avatar Dec 06 '21 16:12 Monadic-Cat

So, I believe #40 fixes the mutability erasure problem. I think it should be merged, but I’d like a second opinion first.

The auto trait problem is tougher. #42 makes FnPtrN !Send and !Sync, which I believe fixes the soundness problem but severely limits expressiveness. It should be equally straightforward to make everything (FnPtrN and Closure*N) Send + Sync, if we require closures passed to Closure*N::new to be Send + Sync. But that’s not a great solution for the same reason.

Two possibilities:

  • Split each type into four types, one for each combination of Send and Sync. So for example, ClosureMut2 would turn into ClosureMut2, SendClosureMut2, SyncClosureMut2, and SendSyncClosureMut2. That’s pretty unwieldy, though, and it isn’t complete, because we might also want to handle UnwindSafe and RefUnwindSafe, which means splitting 16 ways.

  • Add a (defaulted) type parameter to FnPtrN and Closure*N that would carry auto trait information. I’ve been playing with this to see if I can make it 1) work and 2) reasonable.

Any thoughts?

tov avatar Dec 22 '21 20:12 tov