Kyle Kingsbury
Could you tell me a bit more about why you suspect jetcd is to blame here, rather than etcd itself? Here's a protocol-level view of the same kind of anomaly....
Begging your pardon, but are you saying that a transaction whose guard clause evaluates to `false` should go on to evaluate the transaction's `success` block? That seems to be at...
OK, so... to repeat, we have multiple cases where a transaction's compare failed, but etcd apparently executed the success block *anyway*--or if it didn't "really" execute, other transactions are able...
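For reference, the semantics being argued about can be written down as a toy in-memory model. This is a sketch, not etcd's code or API: `ToyKV`, `txn`, and the tuple shapes are hypothetical names chosen for illustration. It captures only the invariant in question, namely that the success branch runs if and only if every compare holds, and that otherwise only the failure branch runs.

```python
class ToyKV:
    """Toy in-memory model of a guarded transaction (hypothetical, not etcd's API)."""

    def __init__(self):
        self.data = {}

    def txn(self, compares, success, failure):
        """Run one guarded transaction.

        compares: list of (key, expected_value) equality guards.
        success, failure: lists of (key, value) puts.
        Returns (succeeded, applied_puts).
        """
        # The guard holds only if every compare matches the current state.
        succeeded = all(self.data.get(k) == v for k, v in compares)
        # Exactly one branch is applied, chosen by the guard's result.
        branch = success if succeeded else failure
        for k, v in branch:
            self.data[k] = v
        return succeeded, list(branch)
```

Under this model, a response reporting `succeeded == False` alongside visible effects of the success branch is impossible, which is why such an observation counts as an anomaly.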
> If TxnResponse has the header (cluster_id, member_id, revision and raft_term), then it must contain the field succeeded, because its type is bool and it always has a value no...
Yup! That's what Jepsen is for--fault injection testing. You might recall etcd contracting me to do this same kind of work in 2019. :-)
I mean yeah, sure, hardware is supposed to be perfect! However, non-ECC machines, disks, faulty network controllers, bad VM hypervisors, et al. do occasionally cause bit-flip errors. Given that etcd...
I think that's a great idea! One thing you could do is to modify the generator so that it keeps track of likely balances (use the new `update` function for...
I'd like to second this confusion: the fact that Carmine (permanently?) squirrels away connection pools via memoization makes it difficult to tell when and how you can close a connection...
I'm not sure if this is precisely the same bug or a different one, but I've managed to reproduce a crash with a slightly different error message purely from process...
If it helps narrow things down, the workload I'm using that crashes etcd is kv reads, writes, and transactions over an exponentially-distributed pool of keys. Roughly 500-1000 ops/sec as well,...
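A rough sketch of what an exponentially-distributed key pool can look like, for anyone trying to approximate the workload outside Jepsen. The names `exponential_weights` and `pick_key`, the pool size, and the base of 2 are all assumptions for illustration; the comment above does not state the actual parameters. The idea is only that low-numbered keys are touched far more often than high-numbered ones.

```python
import random


def exponential_weights(n_keys, base=2.0):
    """Weight for key i is base**-i, so each key is 1/base as likely as the one before it."""
    return [base ** -i for i in range(n_keys)]


def pick_key(rng, n_keys, base=2.0):
    """Pick a key index in [0, n_keys) with exponentially decaying probability."""
    weights = exponential_weights(n_keys, base)
    return rng.choices(range(n_keys), weights=weights, k=1)[0]
```

A generator would call `pick_key` once per operation and then choose among a read, a write, or a transaction on that key.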