miri icon indicating copy to clipboard operation
miri copied to clipboard

Make file descriptors into refcount references to "file descriptions"

Open RalfJung opened this issue 1 year ago • 3 comments

Currently, every file descriptor has to implement dup, and some do that incorrectly. The issue is that file descriptors are more like Arc than like Box, so all state should be shared among all the duplicates, which typically requires an extra level of indirection -- e.g. in eventfd or epoll, where duplicate file descriptors should reference the same underlying state.

Instead of making each FD type handle that, we should remove dup from the FileDescriptor trait and handle this ourselves. I see two ways to do that:

  • Literally use Arc<dyn FileDescription> instead of Box<dyn FileDescriptor>, so Arc handles the refcounting for us for free. However this will require interior mutability if the file description tracks mutable state (as many of the "special" FDs will).
  • Add some sort of FxHashMap<FileDescriptionId, (usize, Box<dyn FileDescription>)> (where the usize is a refcount) and then make the FD table simply reference FileDescriptionId. This means we have to do refcounting ourselves but we can get exclusive access to the data inside a file description.

Interior mutability is annoying, but then there is some chance we'll need to give the read/write methods access to the machine (even mutable access!) and then it seems hard to avoid some form of interior mutability. So currently my gut feeling is that manual refcounting isn't worth it.

RalfJung avatar Apr 28 '24 07:04 RalfJung

@ZoeS17 you were looking for some reasonably sized contribution to get your feet wet in the codebase, weren't you?

RalfJung avatar Apr 28 '24 07:04 RalfJung

We could hide the interior mutability by providing a FileDescriptor struct that hides the Rc<RefCell<dyn Filedescription>> and provides the right mutability API and forwards it to the inner one

That would statically prevent us using writing APIs in read only code, while internally allowing read ops on FileDescription to take mutable references

oli-obk avatar Apr 28 '24 11:04 oli-obk

One downside of the Arc version is that we do not have access to the Machine in drop. So we'd have to carefully align things to make such access unnecessary.

But we can always try Arc first and then migrate to manual refcounting later if it becomes clear that Arc does not work.

RalfJung avatar Apr 28 '24 18:04 RalfJung

Hello! I'm new to miri and I'm eager to contribute. May I take on this issue?

Luv-Ray avatar May 01 '24 04:05 Luv-Ray

That's great to hear, thanks! Let's wait a bit to see if @ZoeS17 wants to take it, as they were already looking for something a good month ago.

We have a bunch of issues labeled "good first issue", if you want to take a look. They vary quite a bit in difficulty and our mentoring capacity is limited, but it's always worth asking.

RalfJung avatar May 01 '24 09:05 RalfJung

Actually we already offered ZoeS an issue here and they didn't reply, so -- for me it's fine if you take this one, @Luv-Ray. Unfortunately I won't be able to do primary mentoring currently.

RalfJung avatar May 01 '24 10:05 RalfJung

I don't mind if they take it. I've been a bit busy and kindof out of the net. Mae culpa.

ZoeS17 avatar May 01 '24 10:05 ZoeS17

No worries, this is not your job after all :)

On May 1, 2024 10:43:13 AM UTC, Zoe @.***> wrote:

I don't mind if they take it. I've been a bit busy and kindof out of the net. Mae culpa.

-- Reply to this email directly or view it on GitHub: https://github.com/rust-lang/miri/issues/3525#issuecomment-2088276293 You are receiving this because you authored the thread.

Message ID: @.***>

RalfJung avatar May 01 '24 10:05 RalfJung

I attempted to use Rc<RefCell<dyn FileDescriptor>>, but I encountered an issue with the close function. There is one dyn FileDescriptor with mutible reference, so we should ensure that the real close is only called once. However, the use of self: Rc<RefCell<Self>> as function parameter is not allowed.

fn close<'tcx>(
    self: Rc<RefCell<Self>>, // Error: type of `self` must be `Self` or a type that dereferences to it
    _communicate_allowed: bool,
) -> InterpResult<'tcx, io::Result<i32>> {
    // check reference count first
    throw_unsup_format!("cannot close {}", self.name());
}

I have come up with three possible solutions now:

  1. remove close and impl Drop instead, but it's hard to get the InterpResult
  2. close get a &mut self as parameter, and check Rc::strong_count(&fd) == 1 before call close
  3. use Rc<dyn FileDescriptor> so that we could check reference count in close, but this will result in the need for impl FileDescriptor for RefCell<FileHandle> to get interior mutability.

Before making a decision, I would like to hear your opinions. Thank you for any suggestions.

Luv-Ray avatar May 02 '24 04:05 Luv-Ray

close get a &mut self as parameter, and check Rc::strong_count(&fd) == 1 before call close

This seems like the best option to me.

RalfJung avatar May 02 '24 08:05 RalfJung

The issue is that file descriptors are more like Arc than like Box, so all state should be shared among all the duplicates

Note that there's at least one bit of state that's per file descriptor, not per file description. The O_CLOEXEC flag.

the8472 avatar May 28 '24 14:05 the8472

Fair. We don't track that at all currently as we don't have exec.

RalfJung avatar May 28 '24 14:05 RalfJung