Sponsored: backing up and recovering machine configuration

Open andrewrynhard opened this issue 3 years ago • 3 comments

Description

Another nice thing to have is the ability to reapply the last known good machine configuration to get the node back to a working state. Save the machine-config.yml when applying a new one, and provide a way to use the backup if the boot goes wrong.

andrewrynhard avatar Dec 22 '21 16:12 andrewrynhard

Ideas:

  • API for rolling back to a last known configuration
  • Storing the machine configuration with each boot option (A, or B)
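The second idea, pairing a stored configuration with each boot option, could be modelled roughly as follows. This is an in-memory sketch only: `BootState`, `Stage`, and `Revert` are hypothetical names, not Talos APIs, and persistence and bootloader integration are left out.

```go
package main

import "fmt"

// Slot names a boot option; "A" and "B" follow the wording above.
type Slot string

const (
	SlotA Slot = "A"
	SlotB Slot = "B"
)

// BootState is a hypothetical record that pairs each boot slot with
// the machine config it was set up with.
type BootState struct {
	Active  Slot
	Configs map[Slot][]byte
}

// other returns the slot that s is not.
func other(s Slot) Slot {
	if s == SlotA {
		return SlotB
	}
	return SlotA
}

// Stage stores cfg with the inactive slot and makes that slot active,
// leaving the previous slot and its config untouched.
func (b *BootState) Stage(cfg []byte) {
	next := other(b.Active)
	b.Configs[next] = cfg
	b.Active = next
}

// Revert switches back to the other slot and returns its stored config.
func (b *BootState) Revert() ([]byte, error) {
	prev := other(b.Active)
	cfg, ok := b.Configs[prev]
	if !ok {
		return nil, fmt.Errorf("slot %s has no stored config", prev)
	}
	b.Active = prev
	return cfg, nil
}

func main() {
	state := &BootState{
		Active:  SlotA,
		Configs: map[Slot][]byte{SlotA: []byte("good")},
	}
	state.Stage([]byte("bad")) // now booting from B with the new config
	cfg, err := state.Revert() // boot went wrong: fall back to A
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s\n", state.Active, cfg) // prints "A good"
}
```

Because the previous slot is never modified while staging, reverting needs no network access or API call, which is the property the rollback idea depends on.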

Try mode may handle some of this. Other parts might have to be manual?

See #4591, #4628

andrewrynhard avatar Dec 22 '21 17:12 andrewrynhard

Try mode is not enough for testing every modification to a machine config. Some changes need a node restart, which the volatile nature of try mode prevents. Something simple and robust (manual) would be enough.

sbskas avatar Jan 05 '22 18:01 sbskas

> Try mode is not enough for testing every modification to a machine config. Some changes need a node restart, which the volatile nature of try mode prevents. Something simple and robust (manual) would be enough.

Bit of a necro, but agreed... "Try" is still too much "hit or miss" to rely on... At least when it comes to node-bricking.

Talos has one real weakness: a bad apply can take a node down with no way to bring it back up without manual on-site intervention.

It would be nice to have a KVM key combo that can be pressed to force a revert to the previous known-good config.

PrivatePuffin avatar Jun 06 '24 12:06 PrivatePuffin

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Dec 04 '24 02:12 github-actions[bot]

Didn't we do something with this?

DmitriyMV avatar Dec 06 '24 00:12 DmitriyMV

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 04 '25 02:06 github-actions[bot]

This issue was closed because it has been stalled for 7 days with no activity.

github-actions[bot] avatar Jun 09 '25 02:06 github-actions[bot]