network-transport

Include outgoing connection events in receive queue

Open · avieth opened this issue 8 years ago · 3 comments

This is an alternative to, and intended to solve the same problem as, issue #33.

New Events

Three new event constructors are introduced:

data Event =
    ...
  | OutboundConnectionOpened ConnectionId Reliability EndPointAddress
  | OutboundConnectionClosed ConnectionId
  | OutboundConnectionSent ConnectionId [ByteString]

Or perhaps instead:

data Provenance = Local | Peer
  deriving (Show, Eq, Generic)

instance Binary Provenance

data Event =
    -- Replaces 'Received', which can now mean 'Sent'.
    Data Provenance ConnectionId [ByteString]
  | ConnectionOpened Provenance ConnectionId Reliability EndPointAddress
  | ConnectionClosed Provenance ConnectionId
  | ...
  deriving (Show, Eq, Generic)

pattern Received connid bss = Data Peer connid bss
pattern Sent connid bss = Data Local connid bss

and in any case:

data Connection = Connection {
    ...
  , connectionId :: ConnectionId
  }

Events with provenance Peer are the familiar incoming events, whereas events with provenance Local are generated in response to locally-initiated connection features: connect, send, and close.
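As a rough sketch of how a consumer could dispatch on provenance with the proposed pattern synonyms, using simplified stand-in types (the real Event carries [ByteString] and more constructors; all names here are illustrative, not the library's actual definitions):

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Simplified stand-ins for the network-transport types; illustrative only.
type ConnectionId    = Int
type EndPointAddress = String
type Payload         = String  -- stands in for [ByteString]
data Reliability     = ReliableOrdered deriving (Show, Eq)
data Provenance      = Local | Peer deriving (Show, Eq)

data Event
  = Data Provenance ConnectionId Payload
  | ConnectionOpened Provenance ConnectionId Reliability EndPointAddress
  | ConnectionClosed Provenance ConnectionId
  deriving (Show, Eq)

pattern Received connid bs = Data Peer connid bs
pattern Sent     connid bs = Data Local connid bs

-- A single event loop can handle both directions, dispatching on provenance.
describe :: Event -> String
describe ev = case ev of
  Received connid _                 -> "peer sent data on " ++ show connid
  Sent     connid _                 -> "we sent data on " ++ show connid
  ConnectionOpened Local connid _ a -> "we connected to " ++ a ++ " as " ++ show connid
  ConnectionOpened Peer  connid _ a -> a ++ " connected to us as " ++ show connid
  ConnectionClosed p connid         -> show p ++ " closed " ++ show connid

main :: IO ()
main = mapM_ (putStrLn . describe)
  [ ConnectionOpened Local 1 ReliableOrdered "peer:0"
  , Sent 1 "hello"
  , ConnectionClosed Local 1
  ]
```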

Semantics of local-provenance Events

When connect ep address reliability hints succeeds, a ConnectionOpened Local connid reliability address must be posted exactly once to ep's event queue, where connid is any suitable (unique) ConnectionId.

When close conn succeeds, a ConnectionClosed Local connid must be posted exactly once to the event queue of the EndPoint against which conn was created.

When send conn bs succeeds, a Data Local connid bs must be posted exactly once to the event queue of the EndPoint against which conn was created, even if the connection is not reliable (in which case the receiver may see different Data events than the sender for the same connection). If the connection is ordered, then Data Local events must be posted in the same order that the peer will observe the Data Peer events corresponding to these sends (it's assumed the concrete transport is capable of doing this, else it shouldn't allow an ordered connection). Regardless of reliability and ordering, a Data Local event must come after the ConnectionOpened for that connection, and before any event which terminates that connection (ConnectionClosed or error events which cause it to close).

Note that these rules imply that for self connections (from one EndPoint to itself), the EndPoint will see two of each posted event, one for each provenance (they are both Local and Peer).

The connectionId of a Connection must be unique among incoming and outgoing connections, but need not match the ConnectionId which the peer uses for that same connection.
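The ordering rules above can be stated as an executable check. The following is a hypothetical sketch over simplified event tags (not library types): the local-provenance ConnectionOpened appears exactly once, ConnectionClosed at most once, and every Data Local event falls strictly between them.

```haskell
import Data.List (elemIndex)

-- Simplified stand-in for the local-provenance events of one connection.
data LocalEvent = Opened | SentData | Closed deriving (Show, Eq)

-- A trace is well-ordered when Opened occurs exactly once, Closed at most
-- once, and every SentData sits strictly between them.
wellOrdered :: [LocalEvent] -> Bool
wellOrdered evs =
     count Opened == 1
  && count Closed <= 1
  && all dataBetween (zip [0 ..] evs)
  where
    count e = length (filter (== e) evs)
    openIx  = maybe maxBound id (elemIndex Opened evs)
    closeIx = maybe maxBound id (elemIndex Closed evs)
    dataBetween (i, SentData) = i > openIx && i < closeIx
    dataBetween _             = True

main :: IO ()
main = print ( wellOrdered [Opened, SentData, SentData, Closed]  -- legal
             , wellOrdered [SentData, Opened, Closed] )          -- data too early
```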

It solves the problem brought up in #33

There is now an ordering on outgoing connections. If we observe ErrorEvent (TransportError (EventConnectionLost addr) _) then every ConnectionOpened Local connid _ addr which is unmatched by a ConnectionClosed Local connid is known to be broken, and subsequent ConnectionOpened Local connid' _ addr are not broken by that error event.
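To make that concrete, here is a sketch of the bookkeeping these events enable. Types and names are simplified stand-ins chosen for illustration, not library definitions: fold over the event stream keeping the live outbound connections per peer; on a loss, exactly the still-open ones are broken, and connections opened afterwards are unaffected.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Simplified stand-ins, not library types.
type ConnectionId = Int
type Addr = String

data Ev
  = OutboundOpened ConnectionId Addr  -- ConnectionOpened Local
  | OutboundClosed ConnectionId       -- ConnectionClosed Local
  | Lost Addr                         -- EventConnectionLost
  deriving Show

-- Track live outbound connections per peer. On Lost, exactly the
-- currently-open connections to that peer are reported broken.
step :: Map ConnectionId Addr -> Ev -> (Map ConnectionId Addr, [ConnectionId])
step live (OutboundOpened c a) = (Map.insert c a live, [])
step live (OutboundClosed c)   = (Map.delete c live, [])
step live (Lost a)             =
  let broken = Map.keys (Map.filter (== a) live)
  in (foldr Map.delete live broken, broken)

main :: IO ()
main = do
  let evs = [ OutboundOpened 1 "peer", OutboundOpened 2 "peer"
            , OutboundClosed 1, Lost "peer", OutboundOpened 3 "peer" ]
      (final, broken) =
        foldl (\(m, bs) e -> let (m', b) = step m e in (m', bs ++ b))
              (Map.empty, []) evs
  print broken            -- [2]: open and unmatched when the loss occurred
  print (Map.keys final)  -- [3]: opened after the loss, not broken
```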

avieth · Feb 22 '17 19:02

In the TCP impl we will need to review carefully to make sure we can indeed guarantee the ordering of connection open/close events vs endpoint lost event. In the TCP impl I suspect that this means we need to hold the mvar for the state of the remote endpoint while we post a conn open/close or a endpoint lost event, and thus we will need to support a non-blocking qdisc enqueue operation.

dcoutts · Feb 23 '17 00:02

> In the TCP impl we will need to review carefully to make sure we can indeed guarantee the ordering of connection open/close events vs endpoint lost event. In the TCP impl I suspect that this means we need to hold the mvar for the state of the remote endpoint while we post a conn open/close or a endpoint lost event, and thus we will need to support a non-blocking qdisc enqueue operation.

I have a proposal for this. It's all about tcp but I'll describe it here in this network-transport ticket anyway. Currently we have handleIncomingMessages, which sources the socket and sinks into the QDisc. We can have another thread associated with that heavyweight connection (socket), handleOutgoingEvents, which sources a Chan (populated by calls to connect, send, and close) and sinks into the same QDisc. The threads are synchronised when the heavyweight connection goes down: once they both have finished and will no longer enqueue anything, the ConnectionLost event can be posted (or not, if it went down normally).

Sadly this means that the sent data will be kept in memory until the corresponding Data Local event is dequeued and thrown away.
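The shape of that proposal can be sketched as follows, with a Chan standing in for the QDisc and all names illustrative rather than nt-tcp internals: two producer threads per heavyweight connection sink into one queue, and the ConnectionLost event may only be posted once both have signalled completion.

```haskell
import Control.Concurrent

-- Illustrative stand-in events; not network-transport's real internals.
data Ev = Inbound String | OutboundLocal String | ConnectionLost
  deriving (Show, Eq)

-- Both handleIncomingMessages and handleOutgoingEvents would look roughly
-- like this: drain a source into the shared QDisc, then signal completion.
sinkInto :: Chan Ev -> [Ev] -> MVar () -> IO ()
sinkInto qdisc evs done = mapM_ (writeChan qdisc) evs >> putMVar done ()

main :: IO ()
main = do
  qdisc <- newChan
  d1 <- newEmptyMVar
  d2 <- newEmptyMVar
  _ <- forkIO (sinkInto qdisc [Inbound "a", Inbound "b"] d1)
  _ <- forkIO (sinkInto qdisc [OutboundLocal "x"] d2)
  -- Synchronisation point: only after both threads have finished and will
  -- enqueue nothing further may ConnectionLost be posted.
  takeMVar d1
  takeMVar d2
  writeChan qdisc ConnectionLost
  evs <- sequence (replicate 4 (readChan qdisc))
  print (last evs)  -- ConnectionLost always dequeues last
```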

Order of events

We haven't exactly specified what's required of a QDisc, but thinking about this demands a specification of event ordering so I'll talk about the two at once. My proposal for a law that characterises a legitimate QDisc:

Assuming: if A < B then B is not enqueued before A.
Prove that: if A < B then B is not dequeued before A.

Any FIFO QDisc is a legitimate QDisc: if A < B then B is not enqueued before A by assumption, therefore B is not dequeued before A by definition of FIFO.

But even non-FIFO QDiscs can be legitimate, because the order < is partial, the transitive closure of something like this:

1. EndPointClosed > EndPointFailed
2. EndPointFailed > TransportFailed
3. forall peer . TransportFailed > ConnectionLost peer
4. forall peer . TransportFailed > ReceivedMulticast peer

-- A connection stands for a peer, a connection identifier, and a provenance.
-- Outgoing and incoming connections are considered different; numbers 6
-- and 7 impose no order on two connections of different provenance.

5. forall connection . ConnectionLost (Peer connection) > ConnectionClosed (ConnectionId connection)
6. forall connection . ConnectionClosed (ConnectionId connection) > Data (ConnectionId connection)
7. forall connection . Data (ConnectionId connection) > ConnectionOpened (ConnectionId connection)

This isn't the full story: there's also an ordering on heavyweight connections to ensure that you can't jumble the order of a sequence of ConnectionLost events for the same peer.
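Rules 1-7 can be encoded as a small DAG and closed transitively. This is a hypothetical encoding over simplified event tags, with peer and connection indices elided; the names are illustrative, not library constructors.

```haskell
-- Simplified event tags; indices on peers and connections are elided.
data Tag
  = Opened | DataEv | ClosedConn | LostConn | Multicast
  | TransportFailed | EndPointFailed | EndPointClosed
  deriving (Show, Eq)

-- Each pair (a, b) reads "a must not be enqueued after b" (rules 1-7).
edges :: [(Tag, Tag)]
edges =
  [ (Opened, DataEv)                  -- rule 7
  , (DataEv, ClosedConn)              -- rule 6
  , (ClosedConn, LostConn)            -- rule 5
  , (LostConn, TransportFailed)       -- rule 3
  , (Multicast, TransportFailed)      -- rule 4
  , (TransportFailed, EndPointFailed) -- rule 2
  , (EndPointFailed, EndPointClosed)  -- rule 1
  ]

-- a < b in the transitive closure of the edges.
mustPrecede :: Tag -> Tag -> Bool
mustPrecede a b = b `elem` reach a
  where reach x = let next = [q | (p, q) <- edges, p == x]
                  in next ++ concatMap reach next

main :: IO ()
main = print ( mustPrecede Opened EndPointClosed   -- comparable: ordered
             , mustPrecede EndPointClosed DataEv ) -- not below it: unordered
```

Note that the order really is partial: Multicast and Opened, for example, are incomparable, which is what licenses non-FIFO QDiscs.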

It's nt-tcp's responsibility to ensure that it satisfies that antecedent in the QDisc law, i.e. that it posts events in network-transport-specified order. Since events for different heavyweight connections are not comparable, using a separate thread for each one and therefore not having a determined order on posting of these events is completely fine (but we knew that already). All events for a particular heavyweight connection do have a determinate order and I'm quite certain nt-tcp respects it. Adding another thread to enqueue the local-provenance events (as I suggested above) is also fine, because local- and remote-provenance events for the same peer are not comparable.

avieth · Feb 23 '17 19:02

Another, somewhat radical, option is to introduce a separate egress queue.

-- Same as is, just a renamed type.
receive :: EndPoint -> IO IngressEvent

-- Runs the continuation for the next egress event. The corresponding API call
-- (connect, close, send, multicast stuff) will unblock when the continuation
-- finishes.
deliver :: EndPoint -> (EgressEvent -> IO t) -> IO t

Just as a typical application has a single receiving thread, there would also be a single delivery thread:

-- The simplest delivery thread: just clear it as fast as possible and quit when it's
-- impossible to continue.
deliveryThread ep = deliver ep $ \egressEvent -> case egressEvent of
    EndPointClosed -> return ()
    ErrorEvent (TransportError EventEndPointFailed _) -> return ()
    ErrorEvent (TransportError EventTransportFailed _) -> return ()
    _ -> deliveryThread ep

It solves the original problem: the thread which consumes the egress queue will be able to determine which outgoing connections have been severed by an EventConnectionLost event (assuming this event comes out of both the ingress and egress queue).

That's what it's all about really: the EventConnectionLost doesn't concern only incoming connections, yet it's tucked away in the receive queue, temporally unrelated to the free-for-all of connects. In fact, all ErrorEvents are relevant to outgoing connections, yet appear in the types to be unrelated.

avieth · Apr 24 '17 23:04