wg icon indicating copy to clipboard operation
wg copied to clipboard

Tracking Issue for Mutex Trait RFC [#377]

Open korken89 opened this issue 5 years ago • 36 comments

This is the tracking issue for the Mutex trait RFC #377, the following questions need to be ironed out before the RFC is implemented:

    • [x] Decide on crate name (core-mutex was proposed, mutex-trait was selected)
    • [x] Decide on crate location (freestanding crate under the HAL team was proposed)
    • [x] Add and implement crate (proposed reference implementation: https://github.com/korken89/core-mutex)
    • [x] Release crate
    • [ ] Create PRs for the ecosystem
      • [ ] Identify where the crate needs to be re-exported
    • [ ] Add a book chapter on it?
    • [ ] (optional) Provide reference implementations for different mutexes
    • [ ] (optional) Add to the 1.0 stabilization pipeline

Lets start with 1 and 2.

CC @rust-embedded/all

korken89 avatar Nov 20 '19 13:11 korken89

As for the name, I suggest mutex-trait or mutex_trait (like stable_deref_trait)

Disasm avatar Nov 20 '19 16:11 Disasm

@Disasm Thanks!


I think we should have a time period for proposing names and then we go for a vote. Maybe 2 weeks now for proposing names, and 1 week to vote after that. What does the rest of @rust-embedded/all think?

korken89 avatar Nov 21 '19 08:11 korken89

Having already had a full RFC for the concept I don't think we really need a vote on the name; I'd be happy for you to go ahead with whatever you think is sensible.

adamgreig avatar Nov 21 '19 14:11 adamgreig

I also don't think we need a vote. It's bikeshedding really...

therealprof avatar Nov 21 '19 14:11 therealprof

I agree with @adamgreig and @therealprof. We shouldn't watershed the name choice, it's not that important.

almindor avatar Nov 21 '19 16:11 almindor

As an outside view: given that

  • in the RFC discussion and chat there has been confusion of this being an implementation, and
  • this being called the "Mutex Trait RFC"

calling it mutex-trait sounds very reasonable :)

nickray avatar Nov 21 '19 17:11 nickray

Hi @korken89 Link 'this part of the RTFM book' is broken.

geomatsi avatar Nov 23 '19 20:11 geomatsi

Thanks for the input, let's have a small discussion (yay/nay) on mutex-trait and use that. :)

@geomasi, I'll fix it now.

korken89 avatar Nov 26 '19 18:11 korken89

In today's WG meeting we decided to go with mutex-trait, this will not be set in stone until the first release but lets have it as the current name.

korken89 avatar Nov 26 '19 19:11 korken89

Lets continue, @rust-embedded/hal do you agree that this fits under HAL or would you propose another group?

korken89 avatar Nov 29 '19 08:11 korken89

@korken89 HAL is maybe not ideal, how do we feel about @rust-embedded/resources ?

therealprof avatar Nov 29 '19 08:11 therealprof

IMO HAL team (focused on abstractions and code portability) makes more sense than Resources (focused on docs, PR, and guides).

jamesmunns avatar Nov 29 '19 15:11 jamesmunns

I don't quite see a Mutex trait as a Hardware abstraction but more of a implementation convention which seems closer to a resource. But hey, just throwing it out there, I'm not too attached to it. ;)

therealprof avatar Nov 29 '19 15:11 therealprof

You guys are using a different meaning of resource here, where the "resources" that James speaks of is what the WG team refers to ;)

Theres lots of bikeshedding potential here. IMO a mutex is not really abstracting over hardware. You can implement it with intermediary crates that already abstract over e.g. interrupt masking and atomic operations.

What we have here is a classical SW interface, like for example pthreads. First sentence of phtreads manpage:

POSIX.1 specifies a set of interfaces (functions, header files) for threaded programming commonly known as POSIX threads, or Pthreads.

So following the pthreads example, what we are cooking up here is an API.

Edit: On second thought, the A in API does not make much sense if the trait is used inside a kernel for example, because application doesn't make too much sense there. Still think we are talking about interface, not abstraction or resource here.

andre-richter avatar Nov 29 '19 17:11 andre-richter

You guys are using a different meaning of resource here, where the "resources" that James speaks of is what the WG team refers to ;)

I'm certainly aware of the role of the team, especially since I happen to be a member... ;)

Still think we are talking about interface, not abstraction or resource here.

As James said, the role of the team is managing documentation and guidelines and to me the mutex traits feels a lot closer to a guideline for writing embedded software in Rust than hardware abstraction.

therealprof avatar Nov 29 '19 18:11 therealprof

It feels like both HAL and Resources do not match perfectly with the purpose of this mutex crate. Since

  1. we do not want to create another Team for this project
  2. @korken89 is already in the Resources team, but not in the HAL team

I suggest placing this crate inside the Resources team, just to save time required for this to happen.

Disasm avatar Nov 30 '19 12:11 Disasm

Resources seems like a good placement, but HAL team should be closely involved because they are the ones that can provide the best "hands on" requirements.

almindor avatar Nov 30 '19 17:11 almindor

I was just made aware of this RFC and was quite surprised by parts of it. I hope this is an appropriate channel to ask such questions -- if not, please direct me elsewhere. :)

In particular, I am quite surprised by the argument of lock being &mut. According to my understanding of mutable references, that makes lock basically useless because it requires exclusive access to even call lock, at which point there cannot be any races anyway. You later show that the intent is to implement the Mutex API for a Copy type, so things are not actually exclusive, but that still leaves me puzzled.

  1. For example, I think with this interface it is impossible to write a generic piece of code that takes a Mutex, spawns two threads, and uses the Mutex in both of them -- something like the following libstd+rayon code:
fn bump_concurrently(x: &Mutex<i32>) {
  rayon::join(
    || *x.lock().unwrap() += 1,
    || *x.lock().unwrap() += 1,
  );
}

Is that correct?

  1. The RFC then shows that the Mutex trait is implemented for shared references, meaning the &mut Self is actually &mut &Mutex<...>. But then, why does that not entirely break the deadlock prevention scheme? Shouldn't the following work? (Code adjusted from the RFC)
static MUT: Mutex<RefCell<i32>> = Mutex::new(RefCell::new(0));

#[entry]
fn init() -> ! {
    let mut r = &MUT; // Note that r is mutable

    r.lock(|data| {
        let mut r2 = &MUT;
        r2.lock(|data| {/* oops, acquired the same lock twice. deadlock? */} );
    });
}

It seems deadlocks are only avoided for generic code that doesn't know that the type implementing core_mutex::Mutex is actually Copy. But there will always be some code that has to know the concrete type and hands out mutable references to all concurrently running operations.

  1. And finally, even in generic code, there could be deadlocks due to lock ordering issues:
fn nest(
  mtx1: &mut impl core_mutex::Mutex<Data = i32>,
  mtx2: &mut impl core_mutex::Mutex<Data = i32>,
) {
  mtx1.lock(|data| mtx2.lock(|data| {}) )
}

// And now concurrently call `nest(mtx1, mtx2)` and `nest(mtx2, mtx1)`.

So, "deadlock-safe by construction" seems to be a best-effort property, but not a full guarantee? (With "by construction" I am expecting guarantees, like Rust code being "memory safe by construction".)

RalfJung avatar Dec 07 '19 10:12 RalfJung

For example, I think with this interface it is impossible to write a generic piece of code that takes a Mutex, spawns two threads, and uses the Mutex in both of them -- something like the following libstd+rayon code: [...] Is that correct?

Yes. This is a byproduct of the decisions which lead to this mutex design. For the full history, take a look at my original MR to embedded-hal and the RFC MR.

To summarize, the mutex design tries to abstract over a very broad range of 'things' which could be considered a mutex. This includes the more traditional mutex types (like the one in std) which is what you expected to find. But it also includes RTFM's mutex type which has entirely different semantics (can't be copied or moved across threads).

I tried to describe the differences in more detail in this comment.

But then, why does that not entirely break the deadlock prevention scheme?

It does. The deadlock prevention only works as long as the types are not Copy, as you noted. To battle this, I suggested adding a newtype wrapper (MutexRef) for this purpose (see this comment).

But IMO, the trait design does not help with deadlock prevention too much, it just needs to be this way to allow implementation for types like RTFM's mutex which do deadlock prevention.

And finally, even in generic code, there could be deadlocks due to lock ordering issues: [...] So, "deadlock-safe by construction" seems to be a best-effort property, but not a full guarantee?

Yes. As with your previous point, I don't think we gain deadlock prevention just from the mutex design, but from certain implementations and the traits need to be compatible with those ...

Rahix avatar Dec 07 '19 11:12 Rahix

But IMO, the trait design does not help with deadlock prevention too much, it just needs to be this way to allow implementation for types like RTFM's mutex which do deadlock prevention.

Ah, that makes a lot of sense! Thanks for explaining that.

RalfJung avatar Dec 09 '19 20:12 RalfJung

So, resources team for now then seems to be the consensus :)

korken89 avatar Dec 10 '19 18:12 korken89

@korken89 So how do we get this repository started? Do you want me to create one?

therealprof avatar Dec 11 '19 09:12 therealprof

@therealprof That would be great! I will then add implementation and then we can start reviewing it.

korken89 avatar Dec 11 '19 10:12 korken89

https://github.com/rust-embedded/mutex-trait

And I assume you hit the close button by accident?

therealprof avatar Dec 11 '19 10:12 therealprof

Ops, thanks for fixing!

korken89 avatar Dec 11 '19 11:12 korken89

I have added implementation and fixed the repo settings: https://github.com/rust-embedded/mutex-trait Can people have a look so I did not goof something up (code and repo settings)? :)

korken89 avatar Dec 28 '19 12:12 korken89

Friendly reminder for review

korken89 avatar Jan 07 '20 21:01 korken89

I took a brief look at the crate:

  • Do we want to commit to the 1.31+ MSRV policy? Which other WG crates use this policy? I'd be more in favor of a policy of supporting the last N Rust releases (usually N=3), since that allows using newer features without having to issue a breaking change.
  • There's no impl for 1-tuples (this can be useful for macros, and the standard library does provide these)
  • There's tuple impls for up to 16 elements, while the standard library only provides them for up to 12 elements. Why choose 16 here? Do we really need more than 12?
  • There are no impls for types that are only in libstd, do we want to provide them behind a use-std feature? They can't be added by downstream crates (unless they use a newtype wrapper).
  • It seems like the trait might make sense as part of bare-metal too, was this discussed anywhere? In general I think having fewer repos/crates reduces maintenance overhead. We can also reexport it there, which I think would be useful, but it's also the worst of both worlds (maintenance overhead is high, and now different teams have to coordinate as well).
    • This is also a question of whether we want to block publishing bare-metal 1.0 on a 1.0-quality Mutex trait. If not we can iterate in the mutex-trait crate until the trait is ready, then think about merging/reexporting from bare-metal.
  • Do we want an Exclusive(&mut T) type that implements Mutex as a no-op, like RTFM has? It seems to me that using drivers that are generic over T: Mutex<Data = Bus> would be pretty painful without that.

Except the possible merge into bare-metal (if we decide to do that) I think it's time to release 0.1 of this crate ASAP, or nobody will start building things with it, which is really what we need to further evaluate the trait.

jonas-schievink avatar Jan 13 '20 18:01 jonas-schievink

The "deadlock safe by construction" thing doesn't work out when the trait is implemented on immutable references by the way.

let a = core::cell::RefCell::new(0);

(&a).lock(|av| {
    *av += 1;

    (&a).lock(|_no| {});
});

It also forces one to use the (&a).lock syntax, which looks fairly alien.

jonas-schievink avatar Mar 10 '20 20:03 jonas-schievink

You can avoid writing (&a) by simply changing it to take the reference earlier, e.g.

let a = core::cell::RefCell::new(0);
let a = &a;

a.lock(|av| {
    *av += 1;

    a.lock(|_no| {});
});

roblabla avatar Mar 10 '20 21:03 roblabla