
Session consistency does not hold across elections

Open achamayou opened this issue 3 years ago • 1 comments

If a user submits a stream of transactions across an election, they may observe an inconsistent session history as one or more transactions are rolled back.

We can mitigate this by closing user TLS sessions when an election is observed.

achamayou avatar Jun 17 '22 16:06 achamayou

Summarising my recent thoughts on this:

Closing all user TLS sessions during an election is a bad experience (dead connection with no explanation), and pessimistic (killing sessions which would not observe any inconsistencies).

I think we can precisely track/detect inconsistencies by recording every TxID that is reported on each session. Then during request execution, we can read the TxID of the previous response, and only proceed if it is still valid. If it has been rolled back, then we know we have lost the ability to present session consistency on this session, and need to close it, but can first return a clear HTTP error for this request to the user.
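A minimal sketch of that per-session tracking, as a toy model rather than CCF code. `ToyLedger`, `Session`, and `handle_request` are hypothetical names, a TxID is modelled as a `(view, seqno)` pair, and the `"error"` status is a placeholder for whatever HTTP error would actually be returned:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

TxID = Tuple[int, int]  # (view, seqno); a simplification for illustration


@dataclass
class ToyLedger:
    # seqno -> view in which the entry at that seqno was produced
    entries: Dict[int, int] = field(default_factory=dict)

    def append(self, view: int) -> TxID:
        seqno = len(self.entries) + 1
        self.entries[seqno] = view
        return (view, seqno)

    def rollback_to(self, seqno: int) -> None:
        # Drop every entry after `seqno`, as an election might.
        for s in [s for s in self.entries if s > seqno]:
            del self.entries[s]

    def is_still_valid(self, txid: TxID) -> bool:
        # Valid only if the same seqno still holds an entry from the same view.
        view, seqno = txid
        return self.entries.get(seqno) == view


@dataclass
class Session:
    last_txid: Optional[TxID] = None  # TxID reported in the previous response
    closed: bool = False


def handle_request(ledger: ToyLedger, session: Session, view: int):
    prev = session.last_txid
    if prev is not None and not ledger.is_still_valid(prev):
        # Session consistency is lost: report it clearly, then close.
        session.closed = True
        return ("error", None)
    txid = ledger.append(view)
    session.last_txid = txid  # record what this session has been told
    return ("ok", txid)
```

In this model, a session whose last reported TxID survives the election keeps working, and only sessions that saw a rolled-back TxID are closed. That is the precision the blanket close-on-election approach lacks.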

We need to work out when it is safe to check the validity of the previous TxID. If we do it too late, where we currently set the new TxID response header, then we've already committed writes from this session that may have relied on (session-implicit) rolled-back state. If we do it too early, it is possible that the previous TxID in question is rolled back between the point we ask and the point this request gets a read version. I think it is safe to do at any point between the transaction getting a read version and being committed, but in practice I think that means after the user-app handler has executed.
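The ordering described here can be sketched as follows. This is a single-threaded toy, so it shows only where the check sits relative to the other steps, not the race itself. `ToyStore`, `execute`, and `election` are hypothetical names, not CCF APIs:

```python
from typing import Callable, Dict, Optional, Tuple

TxID = Tuple[int, int]  # (view, seqno); a simplification for illustration


class ToyStore:
    def __init__(self) -> None:
        self.view = 1
        # seqno -> (view, writes) for every entry currently in the history
        self.entries: Dict[int, Tuple[int, dict]] = {}

    def read_version(self) -> int:
        return len(self.entries)

    def is_valid(self, txid: TxID) -> bool:
        view, seqno = txid
        entry = self.entries.get(seqno)
        return entry is not None and entry[0] == view

    def commit(self, writes: dict) -> TxID:
        seqno = len(self.entries) + 1
        self.entries[seqno] = (self.view, writes)
        return (self.view, seqno)

    def election(self, keep_up_to: int, new_view: int) -> None:
        # Roll back everything after `keep_up_to` and move to a new view.
        for s in [s for s in self.entries if s > keep_up_to]:
            del self.entries[s]
        self.view = new_view


def execute(
    store: ToyStore,
    prev_txid: Optional[TxID],
    handler: Callable[[int], dict],
) -> Optional[TxID]:
    rv = store.read_version()  # 1. transaction gets its read version
    writes = handler(rv)       # 2. user-app handler runs; writes are buffered
    # 3. check the previous TxID: after the read version, before commit
    if prev_txid is not None and not store.is_valid(prev_txid):
        return None            # discard buffered writes; caller reports the error
    return store.commit(writes)  # 4. only now do the writes become visible
```

Because the check runs before step 4, a request on a session whose previous TxID was rolled back never commits its writes, which is the property the "too late" placement would lose.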

eddyashton avatar Jul 14 '22 13:07 eddyashton

Deeper discussion of this in #4401.

eddyashton avatar Oct 27 '22 08:10 eddyashton