CCF
Post-DR: have Retired nodes show up in the node/network/removable_nodes API
At present, after a DR the pre-DR nodes are reported as Retired in node/network/nodes, but they don't show up in node/network/removable_nodes. If these nodes were listed in the removable_nodes API, then from an orchestrator's perspective that would be an indication that the node is marked retired and is considered removable by CCF, so the operator can go ahead and delete the infra node and also invoke DELETE /node/network/nodes/nodeid. At present only nodes that go via the remove_node proposal show up in the removable_nodes API. The logic to detect and clean up stale nodes would then be the same in both cases:
- Wait for node to be reported as retired.
- Wait for node to appear in removable_nodes API.
- Delete node from infra.
- Invoke DELETE /node/network/nodes/nodeid API.
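For illustration, the steps above could be sketched on the orchestrator side roughly as follows. This is only a sketch under assumptions: `cleanup_stale_node`, `wait_until` and the four callables are hypothetical names, and the node entries are assumed to be dicts with `node_id` and `status` keys. The callables stand in for whatever HTTP client and infra tooling the orchestrator already uses.

```python
import time
from typing import Callable, Dict, List

Node = Dict[str, str]  # assumed shape: {"node_id": ..., "status": ...}


def wait_until(predicate: Callable[[], bool], timeout_s: float, interval_s: float) -> None:
    """Poll predicate until it returns True, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while not predicate():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        time.sleep(interval_s)


def cleanup_stale_node(
    node_id: str,
    get_nodes: Callable[[], List[Node]],            # wraps GET node/network/nodes
    get_removable_nodes: Callable[[], List[Node]],  # wraps GET node/network/removable_nodes
    delete_infra_node: Callable[[str], None],       # hypothetical infra teardown hook
    delete_ccf_node: Callable[[str], None],         # wraps DELETE /node/network/nodes/nodeid
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> None:
    # 1. Wait for the node to be reported as retired.
    wait_until(
        lambda: any(
            n["node_id"] == node_id and n["status"] == "Retired" for n in get_nodes()
        ),
        timeout_s,
        interval_s,
    )
    # 2. Wait for the node to appear in removable_nodes.
    wait_until(
        lambda: any(n["node_id"] == node_id for n in get_removable_nodes()),
        timeout_s,
        interval_s,
    )
    # 3. Delete the node from infra.
    delete_infra_node(node_id)
    # 4. Remove the node's entry from CCF.
    delete_ccf_node(node_id)
```

With today's behaviour, step 2 would time out for pre-DR nodes, which is the gap this issue asks to close.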
Further, post-DR the pre-DR nodes are still in the KV, but no longer in the consensus. Invoking the DELETE /node/network/nodes/nodeid API has no effect and does not remove these stale entries. The fix for this issue should provide a means to have these entries removed.
As discussed, the ideal behaviour would be: pre-DR nodes get set as retired_committed on DR, show in removable_nodes, and can be deleted as normal.
An acceptable alternative, if there are difficulties with this, would be to delete them altogether in the recovery transaction.
@achamayou any eta for this one?