fbp-protocol
fbp-protocol copied to clipboard
Graph receipt message type and message_id
This pull request adds a top level id to every message. By adding unique ids to every message we can have a runtime or ui respond to graph message with a receipt instead of just echoing back the original message to make sure the message was received. Echoing the messages causes some infinite looping issues when both a runtime and it's client are trying to update each other.
Yeah, we were just talking about this. Version 4 sounds good.
As you know, right now there is no way to know which messages are replies to what. But there is also a usecase for having 'undirected' messages coming from runtime: Supporting multiple clients.
This is something that maybe-kinda-halfway works right now, at least with some of the more solid runtime implementations (like NoFlo).
So when the runtime receives (and accepts) a message from client A, it should reply to A, but also send messages about the state change to other the connected clients (B,C...). I propose that these state changes notifications
(which are not replies), then do not have the id
message identifier. This is basically the same way it is done in JSON-RPC.
I definitely want to make sure this is compatible with multiple clients, and that is exactly the use case I have in mind with this proposal.
An example of the flow I was envisioning goes like this: let's say a client first adds a node to a graph. The client then sends an addnode
event to the runtime. When the runtime receives the node, it would first send back a reply using the receipt
message proposed here, then sends addnode
messages, each with their own id, to any other clients listening for changes to the same graph. The other clients would then be responsible for replying to the server with receipts of their own. At this point the runtime could be sure that all the clients listening to the relevant graph have been updated.
I think it would be nice if there is a distinction between the addnode
event, which is a command, and the reply event that informs the sender that the addnode
event that it sent was received by its intended recipient. We have a few other places in the protocol where we send a command like start
, and get back a different message, started
, that lets the sender know what happened without the intention of the message being ambiguous. It would be great if we could likewise clarify the intent of replies to these graph messages. I understand that we want to avoid breaking backwards compatibility as much as possible, but is anything actually depending on these identical messages being sent as replies? If that is the case maybe we can achieve the same thing in a way that does not break backwards compatibility.
So when the runtime receives (and accepts) a message from client A, it should reply to A, but also send messages about the state change to other the connected clients (B,C...). I propose that these state changes
notifications
(which are not replies), then do not have the id message identifier. This is basically the same way it is done in JSON-RPC.
The protocol defines commands and responses between exactly two peers. If the peer receiving the graph
command is connected to other peers, it can initiate a new command-response cycle with each of them, to ensure that they are up-to-date. In other words, graph
commands are bi-directional: they may be sent by a client or a runtime, and the response is always the same: a receipt
message. In this model, I don't believe there is a need for the concept of an "undirected" message.
Regarding breaking changes: there will be a handful of them coming up (proper handling of types will likely be a breaking change). I think this is only to be expected since we've added an active outside perspective. I don't want to cause unnecessary work for anyone, but I think we all agree there are things that need to be improved and extended. We might need to make a 0.6 release that defines and requires a proper handshake between peers, so that there's a known protocol for dealing with incompatible protocol versions (i.e. a client using protocol 0.5 should abort if it encounters a runtime using 0.6, and vice versa). Then we can bundle up the bigger changes to the protocol in a 0.7 release.
Why should the runtime wait for receipts from clients? And what would it do if it did not receive them, or received error? I don't see the reason for ACKing in this direction. The runtime is the one that holds the canonical state representation, and clients must simply respect changes that occur. Hence a notification / no-reply type message.
If we introduce breaking changes, we need to maintain parallel support protocol in clients until all runtimes are updated. And I would expect those that argue for breaking changes to contribute significantly to this work, as otherwise one has misaligned incentives (the cost of the breaking change is carried by someone else).
Why should the runtime wait for receipts from clients? And what would it do if it did not receive them, or received error?
I can think of a few possibilities:
- log the error
- disconnect from the runtime
- use this information (e.g. lack of a response or error response) to make informed decisions on how to synchronize multiple clients, such as resolving conflicting commands, or rolling back the failed change from other clients.
If a client gets out of sync with the runtime, this can lead to very bad things, especially with multiple clients compounding the problem. At best the runtime state is simply broken and can't be executed, at worst, it can lead to unintended actions executed by the graph which are destructive. A runtime disconnecting from a client that responds with an error receipt
might be a valid strategy to avoid corruption. Ignoring the problem (by designing a system which excludes this information) isn't a great solution. Clients and runtimes can always choose to ignore receipts (I believe that until https://github.com/noflo/noflo-ui/pull/564, noflo-ui ignored all graph
responses), but they should at least have the information available.
Most of the fbp-protocol adheres to a distinction between runtimes and clients. A start
message can logically only come from a client, and a started
message only from a runtime. But graph building is more like a multi-master database scenario, where changes can emanate from either endpoint, and keeping all processes in sync is paramount. Even though only one copy of the graph (the runtime's) will be executed, from a graph integrity perspective, all graphs are equally important because an error in a client graph can easily propagate problems back to the runtime graph (e.g. through a user who is now misinformed about the state of the graph). Thus from a protocol point of view, I think the graph sub-protocol looks more like a peer-to-peer system.
Our arguments for this design are:
- more information is better than less information
- symmetrical peer-to-peer design is more intuitive and easier to describe in words and schema
Regarding the latter, here's our proposal in words:
"Every graph edit starts with a graph
command message, and should be responded to with a receipt
containing the message id of the command, and an optional error message".
Your is something like:
"Graph commands coming from the client should be responded to by the runtime with a receipt
containing the message id of the command, and an optional error message. Graph commands coming from the runtime should not be responded to by the client."
If I've made a mistake there, please correct me. I'm not sure how the notification
message that you mentioned plays into this.
By the way, we greatly appreciate you engaging us in this debate. I know it takes a lot of effort to get everyone's mental model in sync (it's a peer-to-peer system, I think ;) I think the forthcoming changes are really going to benefit from everyone's perspective.
Just my two cents: hot-code reloading is a big feature for developers (esp. front-end). If I understand right, https://github.com/noflo/noflo-ui/pull/564 depends on this PR, and once both are merged, they would enable noflo-ui's graph to update based on the runtime's state. Very much looking forward to this. If there are any low-priority/tedious tasks to do, I'm happy to help.