jesterj
Allow multiple nodes to cooperate
This is a placeholder/parent ticket for the key feature of 2.0 when we get there. This will generally include:
- [ ] Cluster formation so that nodes can access a unified Cassandra cluster.
- [ ] A means to pass documents among nodes (JavaSpace, Cassandra or otherwise)
- [ ] A means for newly started nodes to detect existing nodes and join
- [ ] A means for nodes to leave gracefully
- [ ] Handling of ungraceful node loss
- [ ] Loading and unloading of Plans without stopping the cluster.
Creating this ticket so I can note one difficulty we will face with Cassandra in the first of those items. Here's a conversation from the ASF Cassandra Slack:
Ztyx 8:11 AM Hello! We have an application that executed a CREATE TABLE IF NOT EXIST ... on boot. A couple of months ago we hit a node schema disagreement (and the table already existed) and our suspicion was that it had to do with that query. Anyone else hit this?
Jeff Jirsa 8:22 AM Strictly not safe in current versions of cassandra to have multiple processes execute that command at the same time
8:23 It is, unfortunately, something that’s known, poorly documented, and has horrible horrible side effects, including potential data loss months later when you restart the instance
8:24 @Ztyx if you must have the app make tables, use external locking - like zookeeper or something
gus 8:49 AM @Jeff Jirsa is this only a problem when the table didn't exist and 2 start up or is there a potential problem regardless of whether the table exists?
8:54 Is this it: https://issues.apache.org/jira/browse/CASSANDRA-15844 ?
ASF JIRA Bridge (APP) 8:54 AM CASSANDRA-15844: Create table Asynchronously or creating table contact the same node from many client threads at same time may causing data loss
Jeff Jirsa 9:32 AM The failure modes I know about involve diverging cfid so id expect it to be mostly around create
9:33 Wouldn’t be surprised if alter statements also cause problems, but it’d be like migration task storms and GC pressure not data loss
9:34 15844 describes one shape of what I mentioned is possible yes
9:35 The race can result in like a dozen different states (different permutations of the race). One involves the cfid in schema table not matching the cfid in the table path on disk, that’s the one where If you bounce you end up losing that data because cassandra makes the “right” empty data directory on startup
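For reference, the "external locking" suggestion from the conversation could look roughly like the sketch below. This is only an illustration of the pattern, not a design decision for JesterJ: the class `SchemaGuard`, the method `createIfMissing`, and the `tableExists` / `createTable` callbacks are all hypothetical names. It is written against `java.util.concurrent.locks.Lock` so it runs standalone; a `ReentrantLock` only protects threads inside one JVM, so a real cluster would need a cluster-wide lock behind that interface (for example a small adapter over a ZooKeeper-based lock such as Curator's `InterProcessMutex`, which is an assumption here, not something decided in this ticket).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper: serializes "check then CREATE TABLE" behind a lock so
 * that only one process at a time can issue the DDL.
 */
final class SchemaGuard {
    private final Lock lock;

    SchemaGuard(Lock lock) {
        // In a cluster this must be a cluster-wide lock, not a JVM-local one.
        this.lock = lock;
    }

    /**
     * Runs createTable only if tableExists reports false, all while holding the lock.
     *
     * @return true if this caller ran the DDL, false if the table was already there
     */
    boolean createIfMissing(BooleanSupplier tableExists, Runnable createTable, long timeoutMs) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("Interrupted while waiting for schema lock", e);
        }
        if (!acquired) {
            throw new IllegalStateException("Timed out waiting for schema lock");
        }
        try {
            if (tableExists.getAsBoolean()) {
                return false; // another node created it first; do not issue DDL
            }
            createTable.run(); // placeholder for the actual CREATE TABLE call
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for "does the table exist?" and "execute CREATE TABLE".
        AtomicBoolean exists = new AtomicBoolean(false);
        SchemaGuard guard = new SchemaGuard(new ReentrantLock());

        boolean first = guard.createIfMissing(exists::get, () -> exists.set(true), 1000);
        boolean second = guard.createIfMissing(exists::get, () -> exists.set(true), 1000);
        System.out.println("first caller created table: " + first);
        System.out.println("second caller created table: " + second);
    }
}
```

The existence check happens inside the lock on purpose: checking first and locking afterwards would reintroduce the race between two nodes starting at the same time. Whether a lock alone is sufficient for every failure mode Jeff describes is not established by the conversation above.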