Pool Upgrade tried to restart, 1 year later
It looks like, while we were in the process of upgrading to 20.04, an old "pool_upgrade" command reactivated and some nodes started writing NODE_UPGRADE txns to the config ledger with "in process" status. See https://indyscan.indiciotech.io/txs/IND_DEMONET/config for the indy_scan output of the events.
Sequence of events that resulted in this "issue" being reported:
Sep 2, 2022 18:07:13 -> pool_upgrade command sent for entire Network
2 hours later, 3 nodes still hadn't upgraded (not sure why), so...
Sep 2, 2022 20:26:36 -> pool_upgrade command sent for the 3 nodes that didn't upgrade
Sep 2, 2022 20:41:03 -> pool_upgrade command completes, with the last of the three nodes reporting "complete" for the upgrade
There was no other indication that anything had gone wrong until the first of these three nodes was started back up as a newly installed 20.04 node. That node registered that an upgrade was needed, based on the commands sent a year earlier. No new "pool_upgrade" txn was written to the ledger, but the node began writing a txn every 15 minutes stating that a node_upgrade was "in process".
The logs show a repeated occurrence of the following sequence:
upgrader.py:
I suggest that we research the proper "fail" command for the controller to return to the node, so that the node writes the "fail" to the ledger properly and cleans things up. Alternatively, honor the timeout by writing a fail to the ledger once the timeout expires, instead of simply trying again.
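To make the second option concrete, here is a rough sketch of what "honor the timeout" could look like. This is not the actual upgrader.py code: `UpgradeAttempt`, `resolve_upgrade`, the `write_status` callback, and the status strings are all hypothetical names made up for illustration, and the 900-second timeout is only an assumption chosen to match the 15-minute interval seen above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class UpgradeAttempt:
    """Hypothetical record of one scheduled upgrade (not an indy-node type)."""
    version: str        # target version of the upgrade
    started_at: float   # time the attempt started, in seconds
    timeout: float      # how long to wait before giving up, in seconds


def resolve_upgrade(
    attempt: UpgradeAttempt,
    now: float,
    upgrade_done: bool,
    write_status: Callable[[str, str], None],
) -> Optional[str]:
    """Decide what, if anything, to write to the ledger for this attempt.

    Returns the status that was written, or None if the attempt is still
    within its timeout and nothing was written.
    """
    if upgrade_done:
        write_status("complete", attempt.version)
        return "complete"
    if now - attempt.started_at >= attempt.timeout:
        # Proposed behaviour: record a terminal "fail" once the timeout
        # expires, so the attempt is closed out instead of retried.
        write_status("fail", attempt.version)
        return "fail"
    return None


# Usage with an in-memory list standing in for the config ledger.
ledger = []
attempt = UpgradeAttempt(version="x.y.z", started_at=0.0, timeout=900.0)
resolve_upgrade(attempt, now=100.0, upgrade_done=False,
                write_status=lambda s, v: ledger.append((s, v)))
resolve_upgrade(attempt, now=900.0, upgrade_done=False,
                write_status=lambda s, v: ledger.append((s, v)))
```

The point of the sketch is only that a timed-out attempt ends in a single terminal entry rather than another "in process" entry on every retry.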