ipc-channel icon indicating copy to clipboard operation
ipc-channel copied to clipboard

IpcReceiver hang, forking leaks file descriptors, problem/discussion

Open cehteh opened this issue 4 years ago • 1 comments

Related to #266 and possibly #201

I got a similar problem on forking with the 'daemonize' crate. The parent process hangs on recvmsg() when the child exits without sending any message. The expected outcome here would be that the recv() returns an `Err(Disconnected)', but it just hangs/blocks.

After some investigation I found following problem:

One does (with a hypothetical fork() that returns Child/Parent)

let (tx, rx) = ipc::channel::<u8>().unwrap();
match fork() {
    Child => std::process::exit(0);
    Parent => println!("{:?}", rx.recv(rx))  // This will hang instead returning Err(Disconnected)
}

With the result that we now have 2 processes each start with rx and tx open. Calling exit() on the child will close the descriptors in the child, but the sending side (tx) is still open in the parent itself, which causes it to hang in recv().

Actually a manual drop(tx); ... rx.recv() in the parent fixes the problem in this case (as one would do in C). Unfortunately this hack isn't always possible. Imagine your fork() is behind some API that takes an closure that takes ownership of tx.

Keeping 'tx' by cloning to drop it later wont help either. Internally it has an 'Arc' and thus it would leak the refcount only instead closing the fd.

A possible fix (immature, up for discussion) would be to add some facility to get an raw fd type that can be only closed.

let (tx, rx) = ipc::channel::<u8>().unwrap();
let closing_side = tx.closing_fd(); // fn closing_fd(&self) -> ClosingFD
spawn_process(move || { ...may exit... ; tx.send(...); } )   // this consumes tx and creates a child process
// parent continues here
closing_side.close();
let result = rx.recv();  // should not block now

ClosingFd stores only an RawFd and could have the methods

  • fn close(self) consuming self, closing the fd
  • fn forget(self) consuming self, but not closing it
  • may implement Drop .. as close()?
  • discussion: do these methods need to be unsafe? they don't violate rusts memory safety but there are certain contracts which must be manually ensured. Otherwise operation on an original descriptor may yield EBADF or even worse bad data because the closed fd got reused by the system. The whole issue here is about kernel resources which become out of sync with rusts view of the world because it is unaware about forking.
  • The whole thing will be needed for the other direction as well (parent sends to child)

cehteh avatar Sep 15 '21 22:09 cehteh

To the above I'd like to add the following Brainstorming (eventually, if approved, I can implement this and make a PR)

  1. Implement ClosingFD as above

  2. Let 'ipc::channel()' return a tuple-struct instead a plain tuple. This allows for an impl here and extending by foreign traits.

     struct TxRxPair<T>(pub IpcSender<T>, pub IpcReceiver<T>);
    
     pub fn channel<T>() -> Result<TxRxPair<T>, io::Error>
    

    Note that this is a breaking change any code that did something like

    let (tx, rx) = ipc::channel::<u8>().unwrap();
    

    needs to destructure the return:

    let TxRxPair(tx, rx) = ipc::channel::<u8>().unwrap();
    
  3. With this we can (schematic code)

    impl TxRxPair<T> {
        pub fn for_forking(self) -> TxRxForForking<T>
    }
    
    // ForForking includes the actual endpoint and the crossed ClosingFD
    struct TxRxForForking<T>(pub ForForking<T, F>, pub ForForking<T, F>);
    struct ForForking<T, F> { open_end: F<T>, closing_end: ClosingFD};
    
    impl ForForking<T,F> {
        // to be called after fork, closes the unneeded fd and returns the actual endpoint
        pub fn get(self) -> F {
            self.closing_end.close();
            self.open_end
        }
    }
    

    altogether this can then be used in a simple way:

      // or reverse TxRxForForking(parent, child)
      let TxRxForForking(tx_child, rx_parent) = ipc::channel::<u8>().unwrap().for_forking();
    
      match fork() {
          Child => {
              let tx = rx_child.get();
              tx.send(123);
          }
          Parent =>  {
              let rx = rx_parent.get();
              assert_eq!(rx.recv().unwrap(), 123);
          }
      }
    
  4. Since the datastructures are now real types other/foreign/extension crates may implement traits that extend these, for example to fork/daemonize a process.

cehteh avatar Sep 20 '21 17:09 cehteh