scylla-operator
Automatically replace a Scylla node that loses its storage
Is this a bug report or feature request?
- Feature Request
 
What should the feature do: When a node is decommissioned, its local storage is lost as well. Currently, recovery requires a manual action: annotating the service to trigger a replacement. Otherwise the new pod gets stuck on join because it doesn't have replace-address-first-boot set.
What is the use case behind this feature: Stability. The operator should be able to run Scylla without any user intervention.
Additional Information: One option is to write a file from an init container, check for its presence, and generate replace-address-first-boot accordingly. Maybe there is a more sophisticated way to get the same information directly from Scylla.
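The marker-file option above could be sketched roughly as follows. This is only an illustration, not the operator's implementation: the marker name `.bootstrapped`, the helper names, and the idea that the member's previous address is recorded somewhere that survives disk loss are all assumptions. The flag spelling simply follows the option named in this issue.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker file name; not something Scylla or the operator defines.
MARKER_NAME = ".bootstrapped"


def mark_bootstrapped(data_dir: Path) -> None:
    """Drop the marker on the data volume once the node has finished joining."""
    (data_dir / MARKER_NAME).touch()


def replace_arg(data_dir: Path, previous_address: Optional[str]) -> Optional[str]:
    """Return the replace flag if the data volume looks freshly wiped.

    previous_address is assumed to be stored outside the local disk
    (for example on the member service), so it survives storage loss.
    """
    if (data_dir / MARKER_NAME).exists():
        # Marker is still there: storage survived, boot normally.
        return None
    if previous_address is None:
        # No recorded history: a brand new member, nothing to replace.
        return None
    # Marker is gone but the member existed before: storage was lost.
    return f"--replace-address-first-boot={previous_address}"
```

In this sketch the marker should only be written after the node has actually bootstrapped. Otherwise a node that never joined would later be "replaced" even though it is not in gossip.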
The case of a Kubernetes node being decommissioned is covered by AutomaticOrphanedNodeCleanup, although there is a race: if the Scylla node hasn't bootstrapped yet, the replacement gets stuck because Scylla won't replace a node that isn't in gossip.
There are other cases, though. AWS EC2 instances lose their local disks when the instance is stopped or hibernated, or when a local drive fails. The same goes for GCP. Those events don't result in a Kubernetes node removal, so AutomaticOrphanedNodeCleanup doesn't help here and the pod can get stuck, needing manual intervention to trigger a node replacement.
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 30d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out
 
/lifecycle stale
/remove-lifecycle stale
/triage accepted