RESP3 icon indicating copy to clipboard operation
RESP3 copied to clipboard

Optional checksum support

Open gbitzes opened this issue 6 years ago • 3 comments

Hello,

What is the feeling towards adding optional end-to-end checksum support to RESP3? The checksum in TCP is quite poor, and will often fail to detect corrupted network packets. This is not theoretical, and there's published papers that demonstrate it in real-world scenarios.

https://stackoverflow.com/questions/3830206/can-a-tcp-checksum-fail-to-detect-an-error-if-yes-how-is-this-dealt-with https://dl.acm.org/citation.cfm?id=347561&dl=GUIDE&coll=GUIDE

After an analysis we conclude that the checksum will fail to detect errors for roughly 1 in 16 million to 10 billion packets. ... Even so, the highly non-random distribution of errors strongly suggests some applications should employ application-level checksums or equivalents.

If a client sends "SET key foo", but the server receives "SET key f0o" due to network corruption, there's no way to detect anything went wrong. Similarly, when the server sends some data, there's no way to detect client-side if the contents were somehow corrupted.

It could work like this: On HELLO, the client could also request that all top-level requests and replies are prepended with a particular fixed-size checksum:

HELLO 3 CHECKSUM xxhash AUTH default blabla
->   <4 checksum bytes> Reply
<4 checksum bytes> SET abc 123
->  <4 checksum bytes>+ OK
<4 checksum bytes> HGET 123 abc
-> <4 checksum bytes><data in abc field of hash 123>

Clients not supporting checksums would still work - they'd simply not request them. When a checksum error is detected, either on the client side or the server side, we'd simply shut down the connection and treat it as a protocol parsing error, so the client will simply retry, just as if this was a transient network instability.

What do you think?

gbitzes avatar Feb 22 '19 19:02 gbitzes

I think some kind of protocol-level support for error checking is probably desirable, and ideally should allow for different ways of checking (e.g. checksums, MD5, etc.). I don't think that's the same as saying Redis itself should then implement checksums - there will probably be a performance cost if the server had to generate these checks for every response, given, for example, that some keys can be huge

AngusP avatar Jul 09 '19 11:07 AngusP

Does running via SSL have a similar benefit? i.e. better detection of corrupt packets?

ioquatix avatar Jul 10 '19 09:07 ioquatix

One could also argue that checksums can already be supported, using the attribute type, as these can be returned surplus to what the client was expecting back, e.g.

|1<CR><LF>
    +_checksum:md5<CR><LF>
    +bc6e6f16b8a077ef5fbc8d59d0b931b9<CR><LF>
*2<CR><LF>
    +some other stuff...<CR><LF>
    :9543892<CR><LF>

AngusP avatar Jul 10 '19 11:07 AngusP