websocket: fix bug where interleaved control frame breaks compression
There is a bug in at least versions 6.3 and 6.4. I assume it goes back further than that, but I have not verified this:
If tornado receives a control frame while in the middle of processing a compressed multi-fragment data message, it corrupts the final message.
Consider tornado receiving the following sequence of websocket frames:
- data frame 0 [opcode 0x2] [final:0] [compressed:1]
- data frame 1 [opcode 0x0] [final:0]
- control frame [compressed:0] [final:1]
- data frame 2 [opcode 0x0] [final:1]
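For reference, here is a minimal sketch of how those flags map onto the first byte of each frame header, using the bit layout from RFC 6455 with RSV1 as the compressed bit (RFC 7692). The helper name first_header_byte is hypothetical, and the control frame's opcode is not stated above, so 0x9 (ping) is an assumption.

```python
FIN = 0x80   # final-fragment bit
RSV1 = 0x40  # "Per-Message Compressed" bit (RFC 7692)


def first_header_byte(opcode: int, final: bool, compressed: bool = False) -> int:
    """Pack FIN, RSV1 and the 4-bit opcode into the first byte of a frame header."""
    return (FIN if final else 0) | (RSV1 if compressed else 0) | (opcode & 0x0F)


# Assumption: the control frame is a ping (opcode 0x9); the description
# above does not say which control opcode was used.
sequence = [
    first_header_byte(0x2, final=False, compressed=True),  # data frame 0
    first_header_byte(0x0, final=False),                   # data frame 1
    first_header_byte(0x9, final=True),                    # control frame
    first_header_byte(0x0, final=True),                    # data frame 2
]
print([hex(b) for b in sequence])
```

Only the first data frame carries the compressed bit; the continuation frames and the control frame leave it clear.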
Background: the bit that indicates whether a message is compressed appears only in the first frame. Tornado caches that flag in self._frame_compressed, and when it receives the final frame it consults that instance variable to determine whether the payload should be decompressed. The issue here is that if a control frame is received in the interim, the value of its compressed bit is used to overwrite self._frame_compressed.
Per RFC7692:
An endpoint MUST NOT set the "Per-Message Compressed" bit of control frames and non-first fragments of a data message. An endpoint receiving such a frame MUST Fail the WebSocket Connection.
So self._frame_compressed is set to 1 when the initial frame arrives, then overwritten with 0 when the control frame arrives. When the final data frame arrives and the message is sent for processing, Tornado believes it does not need to be decompressed when in fact it does.
This change does two things:
- Do not update self._frame_compressed when receiving a control frame
- Do not inspect self._frame_compressed when processing a control frame. It should never need to be decompressed
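To illustrate the effect of these two changes, here is a toy model of the receive path. It is a sketch under stated assumptions, not Tornado's actual code: ToyReceiver, the Frame tuple, and the skip_control_frames switch are all hypothetical names, and the control frame is assumed to be a ping (opcode 0x9). Compression follows the permessage-deflate convention of raw deflate with the trailing 00 00 ff ff stripped by the sender and re-appended by the receiver.

```python
import zlib
from typing import List, Optional, Tuple

# Hypothetical simplified frame: (opcode, final, compressed, payload)
Frame = Tuple[int, bool, bool, bytes]
TAIL = b"\x00\x00\xff\xff"


def compress_message(data: bytes) -> bytes:
    """Raw deflate with a sync flush, minus the 4-byte tail."""
    c = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    return (c.compress(data) + c.flush(zlib.Z_SYNC_FLUSH))[:-4]


def decompress_message(data: bytes) -> bytes:
    d = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    return d.decompress(data + TAIL)


class ToyReceiver:
    def __init__(self, skip_control_frames: bool) -> None:
        # True models the fixed behaviour, False the buggy one.
        self.skip_control_frames = skip_control_frames
        self._frame_compressed = False
        self._fragments: List[bytes] = []

    def receive(self, frame: Frame) -> Optional[bytes]:
        opcode, final, compressed, payload = frame
        is_control = bool(opcode & 0x8)
        # Continuation frames (opcode 0x0) never update the cached flag.
        if opcode != 0x0 and not (is_control and self.skip_control_frames):
            self._frame_compressed = compressed
        if is_control:
            return None  # control payloads are never decompressed
        self._fragments.append(payload)
        if not final:
            return None
        data = b"".join(self._fragments)
        self._fragments = []
        return decompress_message(data) if self._frame_compressed else data


message = b"hello websocket " * 8
body = compress_message(message)
third = len(body) // 3
frames: List[Frame] = [
    (0x2, False, True, body[:third]),             # data frame 0
    (0x0, False, False, body[third:2 * third]),   # data frame 1
    (0x9, True, False, b""),                      # interleaved ping (assumed)
    (0x0, True, False, body[2 * third:]),         # data frame 2
]


def run(skip_control_frames: bool) -> Optional[bytes]:
    receiver = ToyReceiver(skip_control_frames)
    return [receiver.receive(f) for f in frames][-1]


buggy = run(skip_control_frames=False)
fixed = run(skip_control_frames=True)
print(buggy == message)  # False: the still-compressed bytes are delivered
print(fixed == message)  # True
```

With the flag left untouched by the control frame, the reassembled message is decompressed as expected; without that guard, the raw deflate bytes are handed to the application.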
Thanks! The fix looks good. Is it feasible to add a test? I guess we don't have good test infrastructure for manipulating individual frames like this.
I've used Autobahn (https://github.com/crossbario/autobahn-testsuite) for this before; I should revive that and get it in CI.