[NT-4] Support `closeConnectionTo`

Open qnikst opened this issue 10 years ago • 8 comments

[Imported from JIRA. Reported by Edsko de Vries (@edsko) as NT-4 on 2012-09-24 11:54:55] Support `closeConnectionTo`, which closes the entire "bundle" of (outgoing and incoming) connections to another endpoint. Basically, "disconnect completely from this other endpoint" (a "heavyweight disconnect").

Once this is implemented we can resolve a TODO in Control.Distributed.Process.Node.

qnikst avatar Jun 17 '15 18:06 qnikst

We may want to write down precise semantics for this method if we want to proceed. In particular: should the connections be marked as Closed or Failed? Should we have a Closed event for each connection? What should happen if our side tries to open a new connection after closeConnectionTo? What should happen if the other side tries to open a new connection after closeConnectionTo?
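To make the "new connection after closeConnectionTo" question concrete, here is a minimal sketch of two candidate answers. Everything in it (`PeerState`, `Policy`, `connectTo`) is a hypothetical toy model made up for illustration, not part of network-transport, and it does not claim which answer is right.

```haskell
-- Toy model only: none of these names exist in network-transport.

-- | Local view of a remote endpoint: either we hold some number of
-- open lightweight connections to it, or it has been torn down.
data PeerState
  = Connected Int   -- number of open lightweight connections
  | Failed          -- closeConnectionTo was called for this peer
  deriving (Eq, Show)

-- | Two candidate answers to "what happens on connect after teardown?".
data Policy
  = RejectAfterFail     -- the peer stays dead; new connections are refused
  | ReconnectAfterFail  -- a new connection starts a fresh bundle
  deriving (Eq, Show)

-- | Attempt to open one more lightweight connection under a given policy.
connectTo :: Policy -> PeerState -> Either String PeerState
connectTo _                  (Connected n) = Right (Connected (n + 1))
connectTo RejectAfterFail    Failed        = Left "peer was failed; connection refused"
connectTo ReconnectAfterFail Failed        = Right (Connected 1)
```

Under `RejectAfterFail` the teardown is permanent from the local point of view; under `ReconnectAfterFail` it only resets state. The same choice has to be made for connection attempts coming from the other side.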

qnikst avatar Jul 01 '15 13:07 qnikst

What is the TODO? Looks like there is a related issue with d-p.

facundominguez avatar Jul 01 '15 15:07 facundominguez

seems like this one: https://github.com/haskell-distributed/distributed-process/blob/master/src/Control/Distributed/Process/Node.hs#L581-L589

qnikst avatar Jul 01 '15 22:07 qnikst

Notice that this feature request amounts to adding the same function as what d-p-tests expect in order to break connections between two endpoints for testing purposes. I guess this issue shows that a failRemoteEndPoint function could be useful not just for tests, and therefore should be part of the API proper.

I think all connections should get an ErrorEvent: basically the same semantics as whatever happens when testBreakConnection is called in the d-p-tests TCP tests.

mboes avatar Jul 02 '15 07:07 mboes

Whether it is the same function depends on the semantics. If all connections should get an ErrorEvent and be closed abnormally, then yes, it is the same function. However, if we want a "fast graceful teardown" function, then it is not.

I think the semantics proposed by @mboes are reasonable, and much simpler than a teardown. If everybody agrees that this should be part of the API, I could mark this task for inclusion in the next major release and fix it at some point.

qnikst avatar Jul 02 '15 07:07 qnikst

For the C.D.P.Node-specific use case, a failed connection, as opposed to a closed connection, is the only reasonable semantics, given that we want to consider the remote node "dead", as the comment says. A closed connection implies mutual agreement. A failed connection is a unilateral action. That is what we want here: fail the connections unilaterally, because we no longer trust the remote endpoint once something has gone awry.
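A rough sketch of the closed-versus-failed distinction, in terms of the events the local endpoint would observe. The types are toy stand-ins loosely modelled on the kinds of events network-transport reports; `Teardown` and `teardownEvents` are hypothetical names, and reporting a failure as a single event for the whole bundle is an assumption of this sketch, not settled semantics.

```haskell
-- Toy stand-ins; not the real Network.Transport types.
newtype Address = Address String deriving (Eq, Show)

-- | One lightweight connection, tagged with the peer it belongs to.
data Conn = Conn
  { connId   :: Int
  , connPeer :: Address
  } deriving (Eq, Show)

data Event
  = ConnectionClosed Int    -- graceful: one event per lightweight connection
  | ConnectionLost Address  -- failure: assumed to be reported once per peer
  deriving (Eq, Show)

-- | Hypothetical: the two kinds of teardown discussed in this thread.
data Teardown = Graceful | Unilateral deriving (Eq, Show)

-- | Events produced locally when tearing down every connection to a peer.
teardownEvents :: Teardown -> Address -> [Conn] -> [Event]
teardownEvents how peer conns =
  case how of
    Graceful   -> map (ConnectionClosed . connId) bundle
    Unilateral -> [ConnectionLost peer | not (null bundle)]
  where
    -- the "heavyweight" bundle: all lightweight connections with this peer
    bundle = filter ((== peer) . connPeer) conns
```

With connections 1 and 2 to peer A and connection 3 to peer C, a `Graceful` teardown of A yields `ConnectionClosed 1` and `ConnectionClosed 2`, while a `Unilateral` one yields a single `ConnectionLost` for A. Connection 3 is untouched in both cases.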

The point of this function is to provide a way to reset a large amount of state, in the hope that it will fix the situation. There is currently no way to do that. In light of this, instead of failRemoteEndPoint, we could alternatively provide a less bespoke but more blunt action: resetEndPoint, which fails all connections everywhere and starts afresh. It seems odd, however, that if A sends a bad request to B, it is B that undergoes a drastic cleanup, temporarily disappearing from the radar of all other nodes, when the problem probably comes from A.
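The difference in blast radius can be sketched with a toy model. The helpers below (`affectedByFail`, `affectedByReset`) are hypothetical and only compute which peers would see their connections fail; neither failRemoteEndPoint nor resetEndPoint exists in the API at this point.

```haskell
import Data.List (nub)

-- Toy stand-ins; not the real Network.Transport types.
newtype Address = Address String deriving (Eq, Show)

-- | One lightweight connection held by the local endpoint.
data Conn = Conn
  { connId   :: Int
  , connPeer :: Address
  } deriving (Eq, Show)

-- | Peers affected if we fail only the bundle shared with one peer.
affectedByFail :: Address -> [Conn] -> [Address]
affectedByFail peer conns = nub [p | Conn _ p <- conns, p == peer]

-- | Peers affected if we reset the whole local endpoint:
-- every peer we hold any connection with.
affectedByReset :: [Conn] -> [Address]
affectedByReset = nub . map connPeer
```

If B holds connections 1 and 2 with A and connection 3 with C, then `affectedByFail (Address "A")` lists only A, whereas `affectedByReset` lists both A and C, which is the "disappearing from the radar of all other nodes" concern.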

mboes avatar Jul 02 '15 08:07 mboes

Wouldn't it make more sense to provide a call to close a single incoming connection? When processes die, they close their outgoing connections. I don't see why they wouldn't want to close their incoming connections as well.

facundominguez avatar Jul 02 '15 11:07 facundominguez

This ticket is about tearing down an entire heavyweight connection (i.e. many lightweight connections in one go) when a problem was encountered. And tearing it down without the cooperation of the other end, since the other end is deemed to be in a weird state from the moment that it is violating invariants (see comment linked to by @qnikst).

mboes avatar Jul 02 '15 12:07 mboes