Backup and restore the Terraform state
We can build a command-line tool to back up and restore the Configuration Object and the Terraform state. The tool may have two subcommands: backup and restore.
The usage of the backup subcommand might look like this:
$ tfc backup --help
Backup the Configuration Object, and if the configuration uses the default Terraform kubernetes backend, backup the Terraform state too

Usage:
  tfc backup [OPTIONS] CONFIGURATION_NAME

Examples:
  tfc backup -d ./backup_dir configuration-oss-demo

Flags:
  -d, --dir string   Destination directory for saving the backed up files
  -h, --help         help for backup

Global Flags:
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --cache-dir string               Default cache directory (default "/Users/loheagn/.kube/cache")
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --kubeconfig string              Path to the kubeconfig file to use for CLI requests.
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -s, --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server
The restore subcommand can be used to restore the Configuration Object and it will also restore the Terraform state if the configuration uses the default Terraform kubernetes backend.
When the restore subcommand runs, it accepts a YAML file or a directory. It reads the configuration definitions from that file or directory and checks whether each configuration uses the default kubernetes backend. If it does, the restore subcommand reads the state.json file and restores the backend secret before restoring the configuration objects.
The usage of the restore subcommand might look like this:
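The ordering described above (backend secret first, then the Configuration) can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the `backupEntry` type and `planRestore` function are hypothetical names, and the sketch assumes that a state file exists in the backup only for configurations that use the default kubernetes backend.

```go
package main

import "fmt"

// backupEntry is a hypothetical record describing one backed-up configuration.
type backupEntry struct {
	Name       string // name of the Configuration object
	ConfigFile string // path to the backed-up Configuration YAML
	StateFile  string // path to the backed-up state; empty if no state was backed up
}

// planRestore returns the restore actions in execution order.
// Assumption: a non-empty StateFile means the configuration used the
// default kubernetes backend, so its secret must exist before the
// Configuration object is applied.
func planRestore(entries []backupEntry) []string {
	var steps []string
	for _, e := range entries {
		if e.StateFile != "" {
			steps = append(steps, "restore backend secret for "+e.Name+" from "+e.StateFile)
		}
		steps = append(steps, "apply configuration "+e.Name+" from "+e.ConfigFile)
	}
	return steps
}

func main() {
	entries := []backupEntry{
		{Name: "alibaba-oss-bucket-hcl", ConfigFile: "configuration.back.yaml", StateFile: "state.back.json"},
	}
	for i, step := range planRestore(entries) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Restoring the secret first matters because, once the Configuration object exists, the controller may start a Terraform run; if no state is present at that point, Terraform would treat the resource as new.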
$ tfc restore --help
Restore the Configuration Object, and if the configuration uses the default Terraform kubernetes backend, restore the Terraform state too

Usage:
  tfc restore OPTIONS

Examples:
  tfc restore -f ./configuration.yaml

Flags:
  -f, --from string   A file or a directory from which the command reads the source definitions
  -h, --help          help for restore

Global Flags:
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --cache-dir string               Default cache directory (default "/Users/loheagn/.kube/cache")
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --kubeconfig string              Path to the kubeconfig file to use for CLI requests.
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -s, --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server
Let's discuss a typical restore scenario.
Assume that we have a configuration.yaml like the following, and we use the default Terraform kubernetes backend (the state.json will be stored in the same cluster as the Configuration Object).
apiVersion: terraform.core.oam.dev/v1beta1
kind: Configuration
metadata:
  name: alibaba-oss-bucket-hcl
spec:
  hcl: |
    resource "alicloud_oss_bucket" "bucket-acl" {
      bucket = var.bucket
      acl = var.acl
    }

    output "BUCKET_NAME" {
      value = "${alicloud_oss_bucket.bucket-acl.bucket}.${alicloud_oss_bucket.bucket-acl.extranet_endpoint}"
    }

    variable "bucket" {
      description = "OSS bucket name"
      default = "vela-website"
      type = string
    }

    variable "acl" {
      description = "OSS bucket ACL, supported 'private', 'public-read', 'public-read-write'"
      default = "private"
      type = string
    }

  backend:
    secretSuffix: oss
    inClusterConfig: true

  variable:
    bucket: "vela-website-20211130-1900-51"
    acl: "private"

  writeConnectionSecretToRef:
    name: oss-conn
    namespace: default
After we apply the configuration.yaml, the cloud resource (the OSS bucket named vela-website-20211130-1900-51) will be created, and we can check the status of the configuration:
$ kubectl get configuration.terraform.core.oam.dev
NAME                     STATE       AGE
alibaba-oss-bucket-hcl   Available   13s
Now we can back up the configuration alibaba-oss-bucket-hcl and the Terraform state to the local file system, as configuration.back.yaml and state.back.json.
Next, assume that the kubernetes cluster suffers a disaster and is no longer available. We need to restore the configuration to a new kubernetes cluster without recreating the cloud resource (the OSS bucket in this scenario).
First, we should rebuild the Terraform backend. We can read the Terraform state from state.back.json and write the data to a secret (whose name and namespace should be derived from the configuration) in the new kubernetes cluster.
Second, we should restore the configuration object. We can just apply the configuration.back.yaml.
@loheagn What are the exact procedures for restoring the state?
- Read the configuration.back.yaml and get the metadata (namespace and name) of the backend secret that should be created.
- Read the state.back.json and use its content (the Terraform state) to build the backend secret. The data field should have a key tfstate whose value is the encoded Terraform state string.
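The two steps above could be sketched in Go roughly as below. This is a minimal sketch under stated assumptions, not the tool's actual implementation: it assumes the Terraform kubernetes backend names its secret tfstate-<workspace>-<secretSuffix> (with the workspace being "default") and that "encoded" means gzip-compressed and then base64-encoded. The names `secretManifest`, `encodeTFState` and `buildBackendSecret` are hypothetical, the namespace passed in `main` is a placeholder, and any labels or annotations the real backend sets are omitted, so the output should be compared with a secret created by a real run.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// secretManifest mirrors only the Secret fields this sketch needs.
type secretManifest struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Metadata   secretMeta        `json:"metadata"`
	Data       map[string]string `json:"data"`
}

type secretMeta struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// encodeTFState gzip-compresses the state and base64-encodes the result,
// which is the form a Secret manifest expects under "data".
// Assumption: the backend stores the state gzip-compressed.
func encodeTFState(state []byte) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(state); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// buildBackendSecret assembles the Secret described in step 2.
// namespace and secretSuffix are the values obtained in step 1;
// "default" is the Terraform workspace name assumed here.
func buildBackendSecret(namespace, secretSuffix string, state []byte) (secretManifest, error) {
	encoded, err := encodeTFState(state)
	if err != nil {
		return secretManifest{}, fmt.Errorf("encode state: %w", err)
	}
	return secretManifest{
		APIVersion: "v1",
		Kind:       "Secret",
		Metadata:   secretMeta{Name: "tfstate-default-" + secretSuffix, Namespace: namespace},
		Data:       map[string]string{"tfstate": encoded},
	}, nil
}

func main() {
	// Stand-in for the bytes read from state.back.json.
	state := []byte(`{"version":4,"serial":1,"lineage":"example"}`)

	// "default" is a placeholder namespace; "oss" is the secretSuffix
	// from the example configuration.
	secret, err := buildBackendSecret("default", "oss", state)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(secret, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // a manifest that could be passed to kubectl apply
}
```

The sketch prints a manifest instead of calling the API server so that it stays runnable without a cluster; a real tool would more likely create the secret through a Kubernetes client.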
Are there any executable commands for steps 1 and 2? And how do you verify that your restore is successful? Please append any evidence for it.
Hi @zzxwill, I created a command-line tool to show how to restore the state. You can review the code here.
You can just run go run main.go restore -h for help. I will add examples and the README later.
@loheagn Please also take a look at this requirement.
And how do you verify your restore is successful? Append any evidence for it please.
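One possible piece of evidence, sketched here under assumptions rather than taken from the tool, is to read the state back from the restored backend secret and compare it with state.back.json. The sketch compares the top-level lineage and serial fields of the Terraform state and then the full JSON content; `stateHeader` and `sameState` are hypothetical names. It only shows that the state was restored intact; whether the controller then leaves the OSS bucket untouched still has to be observed in the new cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// stateHeader holds the top-level fields that identify a Terraform state.
type stateHeader struct {
	Lineage string `json:"lineage"`
	Serial  int64  `json:"serial"`
}

// sameState reports whether two state documents carry the same content.
// When they differ, the second return value gives a short reason.
func sameState(backup, restored []byte) (bool, string, error) {
	var b, r stateHeader
	if err := json.Unmarshal(backup, &b); err != nil {
		return false, "", fmt.Errorf("parse backup state: %w", err)
	}
	if err := json.Unmarshal(restored, &r); err != nil {
		return false, "", fmt.Errorf("parse restored state: %w", err)
	}
	if b.Lineage != r.Lineage {
		return false, "lineage differs", nil
	}
	if b.Serial != r.Serial {
		return false, "serial differs", nil
	}

	// Compare the decoded documents so that key order and whitespace
	// do not cause false mismatches.
	var bv, rv interface{}
	if err := json.Unmarshal(backup, &bv); err != nil {
		return false, "", err
	}
	if err := json.Unmarshal(restored, &rv); err != nil {
		return false, "", err
	}
	if !reflect.DeepEqual(bv, rv) {
		return false, "content differs", nil
	}
	return true, "", nil
}

func main() {
	// Stand-ins for state.back.json and the state decoded from the restored secret.
	backup := []byte(`{"version":4,"serial":3,"lineage":"example"}`)
	restored := []byte(`{"lineage":"example","serial":3,"version":4}`)

	ok, reason, err := sameState(backup, restored)
	if err != nil {
		panic(err)
	}
	fmt.Println("states match:", ok, reason)
}
```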