Proposal: prefix-less WebSocket control messages via a new frame type, negotiated through WebSocket-over-HTTP extensions
Currently, to use WebSocket-over-HTTP with GRIP, you must negotiate the grip WebSocket extension, which needs a way to distinguish normal messages (which should be sent to clients) from "control" messages, which instruct the gateway to take certain actions associated with the connection.
Currently this is done by the proxy and server agreeing on a prefix for each message type (by default m: and c:).
via websocket text frame
m:this is a normal message
c:{"type": "subscribe", "channel": "mychannel"}
via websocket-over-http
HTTP/1.1 200 OK
Content-Type: application/websocket-events
TEXT 1A\r\n
m:this is a normal message\r\n
TEXT 2F\r\n
c:{"type": "subscribe", "channel": "mychannel"}\r\n
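For reference, here is a minimal Python sketch of how events like these could be encoded. It assumes the size field is the byte length of the content written in hex, as in the WebSocket-over-HTTP event format. `encode_event` is a hypothetical helper name used only for illustration, not part of any library.

```python
from typing import Optional


def encode_event(event_type: str, content: Optional[bytes] = None) -> bytes:
    """Encode a single event; events without content are just the type line."""
    if content is None:
        return event_type.encode("ascii") + b"\r\n"
    # The size is the content length in bytes, written as hexadecimal.
    size = format(len(content), "X").encode("ascii")
    return event_type.encode("ascii") + b" " + size + b"\r\n" + content + b"\r\n"


# Build a response body with one normal message and one control message,
# using the default m: and c: prefixes.
body = b"".join([
    encode_event("TEXT", b"m:this is a normal message"),
    encode_event("TEXT", b'c:{"type": "subscribe", "channel": "mychannel"}'),
])
```

Computing the size from the encoded bytes (rather than hard-coding it) avoids mismatches between the header and the content.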
In most cases, if you know you will never send anything but JSON (which will never start with c:), it is safe to negotiate the non-control prefix to an empty string.
However, this implementation still has a few downsides:
- If you do use a prefix, it's generally just kind of confusing and adds complexity
- If you negotiate an empty string, you can footgun yourself if you end up sending some text that happens to start with c:
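To make the second downside concrete, here is a rough sketch of prefix-based classification. This is an assumption about how a gateway might do it (control prefix checked first, unmatched messages ignored), not the actual proxy code; `classify` is a hypothetical name.

```python
def classify(text: str, message_prefix: str = "m:", control_prefix: str = "c:"):
    """Return (kind, payload) for a text message from the origin."""
    # Assumption: the control prefix is checked first, which is what makes
    # an empty message prefix risky.
    if text.startswith(control_prefix):
        return ("control", text[len(control_prefix):])
    if text.startswith(message_prefix):
        return ("message", text[len(message_prefix):])
    # Assumption: anything matching neither prefix is not forwarded.
    return ("ignored", text)


# With an empty message prefix, JSON payloads pass through untouched...
kind_json, _ = classify('{"hello": "world"}', message_prefix="")

# ...but ordinary text that happens to start with c: is taken as control.
kind_text, payload = classify("c:/ is not a Windows path", message_prefix="")
```

Here `kind_json` is "message" while `kind_text` is "control", even though the second message was meant for the client.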
Ideally, we could separate control messages by introducing a new frame type. While this is possible in the WebSocket spec (though the other possible frame values are technically "reserved"), most user-facing WebSocket libraries do not support non-standard frame types. As such, this is not particularly practical for WebSocket proxying.
However, since WebSocket-over-HTTP is the vastly more common use case for WebSockets + GRIP, and its spec is much smaller and easier to extend, it is possible to do it there.
Proposal
WebSocket-Over-Http-Extensions header
During the initial connection handshake between a GRIP proxy and an origin, proxies can send a WebSocket-Over-Http-Extensions header in addition to Sec-WebSocket-Extensions. WebSocket-over-HTTP extensions can make changes to the WebSocket-over-HTTP protocol, such as adding a new frame type.
The grip-events WebSocket-over-HTTP extension
This extension should be negotiated alongside the grip WebSocket extension. If both are present, no message prefix is to be used, and all TEXT events should be considered "non-control" messages.
A new event is introduced, CONTROL, which contains text which should be interpreted exactly as messages with the control prefix were interpreted.
POST /target HTTP/1.1
Connection-Id: b5ea0e11
Content-Type: application/websocket-events
Sec-WebSocket-Extensions: grip
WebSocket-Over-Http-Extensions: grip-events
OPEN\r\n
HTTP/1.1 200 OK
Content-Type: application/websocket-events
Sec-WebSocket-Extensions: grip
WebSocket-Over-Http-Extensions: grip-events
OPEN\r\n
CONTROL 2D\r\n
{"type": "subscribe", "channel": "mychannel"}\r\n
CONTROL 4A\r\n
{"type": "keep-alive", "content": "{}", "timeout": 30, "mode": "interval"}\r\n
TEXT 9\r\n
hi client\r\n
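As a rough illustration of what the receiving side might do with such a response, here is a Python sketch that parses the event stream and routes CONTROL events separately from TEXT events. The function names are hypothetical, the CONTROL event is the one proposed here (not part of the current protocol), and sizes are assumed to be hex byte lengths.

```python
import json


def parse_events(body: bytes):
    """Parse a websocket-events body into (name, content) tuples."""
    events = []
    pos = 0
    while pos < len(body):
        eol = body.index(b"\r\n", pos)  # ValueError if the line is unterminated
        name, _, size_hex = body[pos:eol].decode("ascii").partition(" ")
        pos = eol + 2
        content = None
        if size_hex:
            size = int(size_hex, 16)
            content = body[pos:pos + size]
            if len(content) != size or body[pos + size:pos + size + 2] != b"\r\n":
                raise ValueError("truncated or mis-sized event content")
            pos += size + 2
        events.append((name, content))
    return events


def route(events):
    """Split events into client-bound text and parsed control messages."""
    to_client, controls = [], []
    for name, content in events:
        if name == "CONTROL":  # proposed event type, no prefix needed
            controls.append(json.loads(content))
        elif name == "TEXT":  # always a normal message under this proposal
            to_client.append(content.decode("utf-8"))
    return to_client, controls


body = (
    b"OPEN\r\n"
    b'CONTROL 2D\r\n{"type": "subscribe", "channel": "mychannel"}\r\n'
    b"TEXT 9\r\nhi client\r\n"
)
to_client, controls = route(parse_events(body))
```

With this split, a TEXT payload that happens to begin with c: is still delivered to the client unchanged.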
Curious on people's thoughts!
Something like this could be reasonable to do. Basically it could make the 99% case more ergonomic.
I mostly have nitpicks about the details:
- CONTROL feels a little generic, at least if we're going to maintain the stance that WebSocket-over-HTTP and GRIP are separate protocols.
- WebSocket-Over-Http-Extensions is a super long header name.
- Currently, all headers defined by the WebSocket-over-HTTP protocol don't include common prefixing/suffixing, e.g. Keep-Alive-Interval, Meta-*, Content-Bytes-Accepted, etc. I can't remember why exactly. Maybe for brevity under the assumption everything is scoped to the application/websocket-events content type.
One way to help guide the design could be to imagine how prefix-less control messages might work with regular WebSockets, even if that's never implemented. For example, I can imagine a hypothetical custom WebSocket frame type/opcode GRIP-CONTROL, with its use naturally determined via a WebSocket extensions negotiation. Alternatively, a TEXT frame could be used with one of the RSV bits set. A new opcode feels like a better fit though. I think RSV bits are more for extensions that could apply to multiple opcodes.
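To make the opcode vs. RSV bit distinction concrete, here is a small sketch of the first byte of a WebSocket frame header as laid out in RFC 6455 (FIN, RSV1 to RSV3, then a 4-bit opcode). The opcode value 0x3 for GRIP-CONTROL is purely an assumption for illustration; RFC 6455 only says 0x3 to 0x7 are reserved for future non-control frames.

```python
OPCODE_TEXT = 0x1
OPCODE_GRIP_CONTROL = 0x3  # hypothetical value, not assigned anywhere


def first_header_byte(opcode, fin=True, rsv1=False, rsv2=False, rsv3=False):
    """Pack FIN, RSV1-3 and the opcode into the first frame header byte."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    return (fin << 7) | (rsv1 << 6) | (rsv2 << 5) | (rsv3 << 4) | opcode


plain_text = first_header_byte(OPCODE_TEXT)               # ordinary TEXT frame
text_with_rsv1 = first_header_byte(OPCODE_TEXT, rsv1=True)  # TEXT flagged via RSV1
custom_opcode = first_header_byte(OPCODE_GRIP_CONTROL)    # dedicated opcode
```

Either variant is distinguishable from a plain TEXT frame on the wire; the difference is whether the flag modifies an existing opcode or introduces a new one.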
With that in mind, maybe the use of a GRIP-CONTROL frame could be negotiated like this:
Sec-WebSocket-Extensions: grip; control-messages=true
And then WebSocket-over-HTTP could have a GRIP-CONTROL event defined to match.
However, negotiating WebSocket extensions with custom opcodes doesn't mean a WebSocket-over-HTTP implementation will know what to do with them. This is a general issue with the protocol already though. Extensions other than grip would have the same problem. A good thought experiment could be to imagine another hypothetical extension with a custom opcode to help guide what would be needed to support both. For example, a heartbeat extension for exchanging HEARTBEAT frames. Maybe WebSocket-Over-Http-Extensions: {ws-extension-name}-events is the right direction.
That said, I'm not aware of any WebSocket extensions that use custom opcodes. If there isn't much concern about supporting other potential extensions at the same time as GRIP, then an alternative way to negotiate support could be the content type, such as application/websocket-events+grip. This would address my concerns about header name style by simply avoiding a new header entirely. Not sure if it should be done this way, just thinking out loud.
thanks for the response
GRIP-CONTROL sounds good to me
Currently, all headers defined by the WebSocket-over-HTTP protocol don't include common prefixing/suffixing, e.g. Keep-Alive-Interval, Meta-*, Content-Bytes-Accepted, etc. I can't remember why exactly. Maybe for brevity under the assumption everything is scoped to the application/websocket-events content type.
I was also curious about this because of the risk of conflicting with existing application headers. I don't imagine this would happen too much, but it's possible. (Has this ever happened before with a Fanout customer?) The WebSocket spec, for example, chose a longer but explicit prefix: Sec-WebSocket-*
But anyways that part can't be changed now and isn't too much of a big deal
I kinda like application/websocket-events+grip though it doesn't compose very well if you want to have multiple "extensions" which makes it much less attractive imo
Sec-WebSocket-Extensions: grip; control-messages=true
I don't know if I love the idea that this means different things depending on the protocol
i.e. permessage-deflate isn't handled differently between websockets and websocket-over-http afaik, so it makes more sense here to me
also just thinking out loud
permessage-deflate isn't handled differently between websockets and websocket-over-http afaik
I hadn't thought about it, but this extension wouldn't work either because it requires setting the RSV1 bit on frames which WebSocket-over-HTTP has no way to convey. Not a huge deal as at this layer HTTP compression can be used instead. Of course, an argument could be made that it might be nice to forward compressed message content to avoid recompressing.
Since WebSocket-over-HTTP is already a little bit limited for the sake of convenience, and since there is no documented WebSocket extension using custom opcodes, my sense is it's fine if the solution here is limited to our one extension. As such, I'm leaning towards the +grip content type.
In the unlikely chance we need to support multiple extensions with custom opcodes, perhaps there could be yet another content type:
Sec-WebSocket-Extensions: foo, bar, baz
Content-Type: application/websocket-events+custom
Custom-Events: foo, bar
Or maybe a mapping could be shared, so the WebSocket-over-HTTP impl doesn't need to understand the extensions:
Custom-Event-Mapping: FOO-TYPE-A=3, FOO-TYPE-B=4, BAR-TYPE=5
It could also allow specifying RSV bits:
TEXT 5 RSV1
hello
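If the Custom-Event-Mapping idea above were ever pursued, parsing it could look roughly like this. Everything here is hypothetical: the header does not exist, and the sketch assumes the opcode values are decimal and must fall in the ranges RFC 6455 reserves for future use (0x3 to 0x7 for non-control frames, 0xB to 0xF for control frames).

```python
def parse_custom_event_mapping(value: str) -> dict:
    """Parse 'NAME=opcode, NAME=opcode' into a dict of event name -> opcode."""
    mapping = {}
    for item in value.split(","):
        name, sep, opcode_text = item.strip().partition("=")
        if not sep or not name:
            raise ValueError(f"malformed mapping entry: {item!r}")
        opcode = int(opcode_text)  # assumption: decimal opcode values
        if not (0x3 <= opcode <= 0x7 or 0xB <= opcode <= 0xF):
            raise ValueError(f"opcode {opcode} is not in a reserved range")
        mapping[name] = opcode
    return mapping


mapping = parse_custom_event_mapping("FOO-TYPE-A=3, FOO-TYPE-B=4, BAR-TYPE=5")
```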
I think it would be fine if we never did any of that though.
yea no need to bikeshed here, I can barely even imagine other extensions as is (and if they come up an additional content type seems like a decent solution)
Content-Type: application/websocket-events+grip + GRIP-CONTROL sounds good to me
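A quick sketch of what this could look like from the origin's side. The exact negotiation is not settled in this thread, so this assumes the proxy signals support by sending the +grip content type on the request and the origin falls back to the c: prefix otherwise. `control_event` is a hypothetical helper name.

```python
import json

GRIP_EVENTS_TYPE = "application/websocket-events+grip"


def control_event(control: dict, request_content_type: str) -> bytes:
    """Encode a control message as GRIP-CONTROL if negotiated, else c:-prefixed TEXT."""
    payload = json.dumps(control).encode("utf-8")
    # Ignore any parameters after ';' and compare case-insensitively.
    media_type = request_content_type.split(";")[0].strip().lower()
    if media_type == GRIP_EVENTS_TYPE:
        name, content = b"GRIP-CONTROL", payload
    else:
        # Assumption: fall back to the default control prefix.
        name, content = b"TEXT", b"c:" + payload
    size = format(len(content), "X").encode("ascii")
    return name + b" " + size + b"\r\n" + content + b"\r\n"


subscribe = {"type": "subscribe", "channel": "mychannel"}
new_style = control_event(subscribe, "application/websocket-events+grip")
old_style = control_event(subscribe, "application/websocket-events")
```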
currently if you negotiate grip, is the content of the messages you send to /publish parsed in the same way?
If so, would this new way also need a way to send control commands over EPCP? Maybe a control "format"?
would this new way also need a way to send control commands over EPCP? Maybe a control "format"?
Control commands already can't be published, so nothing to update here. Published messages beginning with "c:" are delivered to the client.