piraeus-operator
Automatic restore of evicted nodes
If a node goes offline for 2 hours and then comes back, it stays EVICTED even after linstor-satellite has started successfully.
Shouldn't piraeus-operator recover such nodes automatically?
Yeah, you are right. I think restore got added a little later after the initial evict, so this was just missed.
Automatic restoration still does not work for me. @WanzenBug, did you succeed in testing this?
Yes, I did test that successfully. But let me try again....
@WanzenBug sorry to disturb you, have you had a chance to check this yet? We have some users complaining about this issue.
Sorry, haven't checked it yet. Feel free to ping me again if I don't respond by next week
@WanzenBug
Hello, I set up a test stand, started dropping the network (switching the interface off and on for 30s) and got strange behaviour: the node does not return to Online on its own:
╭──────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════╡
┊ node-drbd-1 ┊ SATELLITE ┊ 192.168.0.151:3367 (SSL) ┊ Online ┊
┊ node-drbd-2 ┊ SATELLITE ┊ 192.168.0.152:3367 (SSL) ┊ OFFLINE ┊
┊ node-drbd-3 ┊ SATELLITE ┊ 192.168.0.153:3367 (SSL) ┊ Online ┊
┊ piraeus-cs-controller-5cf89dd5d4-dlktz ┊ CONTROLLER ┊ 10.42.2.13:3367 (SSL) ┊ OFFLINE ┊
┊ piraeus-cs-controller-5cf89dd5d4-lwzrk ┊ CONTROLLER ┊ 10.42.0.21:3367 (SSL) ┊ Online ┊
╰──────────────────────────────────────────────────────────────────────────────────────────╯
Last message on the controller:
07:51:14.930 [SslConnector] INFO LINSTOR/Controller - SYSTEM - Remote satellite peer /192.168.0.152:3367 has closed the connection.
The node is alive and the satellite on it is running; it can be brought back online by hand if a reconnect is triggered.
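For anyone who wants to script the manual workaround until a fix lands, here is a minimal sketch. It assumes the node list is available as text in the table format pasted above, and that the manual reconnect is done with `linstor node reconnect <node>`. The function name `offline_satellites` and the `SAMPLE` constant are made up for this illustration and are not part of piraeus-operator or LINSTOR. The script only prints the commands, it does not run them.

```python
# Sketch only: find satellites that are not Online in `linstor node list`-style
# table output and print the reconnect command for each one.
# Assumption: the table uses the '┊' column separator shown in this thread.

SAMPLE = """\
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
┊ node-drbd-1 ┊ SATELLITE ┊ 192.168.0.151:3367 (SSL) ┊ Online ┊
┊ node-drbd-2 ┊ SATELLITE ┊ 192.168.0.152:3367 (SSL) ┊ OFFLINE ┊
┊ node-drbd-3 ┊ SATELLITE ┊ 192.168.0.153:3367 (SSL) ┊ Online ┊
┊ piraeus-cs-controller-5cf89dd5d4-dlktz ┊ CONTROLLER ┊ 10.42.2.13:3367 (SSL) ┊ OFFLINE ┊
┊ piraeus-cs-controller-5cf89dd5d4-lwzrk ┊ CONTROLLER ┊ 10.42.0.21:3367 (SSL) ┊ Online ┊
"""


def offline_satellites(table_text):
    """Return names of SATELLITE nodes whose State column is not 'Online'."""
    names = []
    for line in table_text.splitlines():
        # Border lines contain no column separator, so skip them.
        if "┊" not in line:
            continue
        # Drop the empty strings before the first and after the last separator.
        cells = [cell.strip() for cell in line.split("┊")[1:-1]]
        # Skip anything that is not a 4-column data row, and the header row.
        if len(cells) != 4 or cells[0] == "Node":
            continue
        name, node_type, _address, state = cells
        if node_type == "SATELLITE" and state != "Online":
            names.append(name)
    return names


for node in offline_satellites(SAMPLE):
    # Printed for review; run the command manually if it looks right.
    print(f"linstor node reconnect {node}")
```

Controller entries are skipped on purpose, since a stale controller pod showing OFFLINE is a different situation from a satellite that failed to reconnect.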
@D1abloRUS This does sound like an issue that was recently fixed in LINSTOR, where the controller "forgot" to reconnect if the connection was dropped during the initial handshake. See https://github.com/LINBIT/linstor-server/commit/bbc27839ad69025b437696cd28598f1db3b80e77
The fixed version was just released; we just need to release the new images for piraeus.