demikernel [inetstack] `dead_socket_tx` Causes Unit Test Regressions to Fail

Running test_connect currently fails with:

Failed to terminate connection: TrySendError { kind: Disconnected }
thread 'protocols::tcp::tests::test_connect' panicked at 'Failed to terminate connection: TrySendError { kind: Disconnected }', src/protocols/tcp/established/background/mod.rs:58:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/ca82264ec7556a6011b9d3f1b2fd4c7cd0bc8ae2/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt
             at /rustc/ca82264ec7556a6011b9d3f1b2fd4c7cd0bc8ae2/library/core/src/panicking.rs:92:14
   2: core::result::unwrap_failed
             at /rustc/ca82264ec7556a6011b9d3f1b2fd4c7cd0bc8ae2/library/core/src/result.rs:1355:5
   3: core::result::Result<T,E>::expect
             at /home/gatowololo/.rustup/toolchains/nightly-2021-05-10-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:997:23
   4: catnip::protocols::tcp::established::background::background::{{closure}}
             at ./src/protocols/tcp/established/background/mod.rs:56:9

This line:

        dead_socket_tx
            .unbounded_send(fd)
            .expect("Failed to terminate connection");

As the error message states, the unbound_send always fails because the receiver is disconnected. So we always panic here.

Looking at the code. It seems the channel pair is created here:

impl<RT: Runtime> Peer<RT> {
    pub fn new(rt: RT, arp: arp::Peer<RT>, file_table: FileTable) -> Self {
        let (tx, _rx) = mpsc::unbounded();
        let inner = Rc::new(RefCell::new(Inner::new(rt.clone(), arp, file_table, tx)));
        Self { inner }
    }
...
}

As you can see the receiver is immediately dropped. So I don't see how this code is every supposed to not fail. Or what is the point of it since it doesn't seem to do anything?

Jun 21 '21 19:06 gatoWololo

Ah, I think I probably just forgot to implement this logic...

So, I think the idea was to have the closer coroutine maintain a reference to this channel and send its FD on the channel when TCP closes the connection. then, the Peer could have a coroutine that reads off this channel and cleans up all of its associated state. For example, the Peer would remove the socket from its sockets map, the established (or whichever one is relevant) map, and return its FD to the file table.

Jun 22 '21 06:06 sujayakar

The motivation for having this coroutine at all is that the per-socket coroutines can't have access to the Peer's maps, so they can't clean up their own state directly.

Jun 22 '21 06:06 sujayakar

I think we need to restructure most of this to allow things to cleanup properly. And rename TcpPeer to something that makes sense.

Mar 29 '22 06:03 BrianZill

demikernel demikernel copied to clipboard

[inetstack] `dead_socket_tx` Causes Unit Test Regressions to Fail

demikernel
demikernel copied to clipboard