terraform-provider-talos

Improve (documentation?) of `talos_machine_configuration_apply`

Open micheljung opened this issue 1 year ago • 7 comments

Say you have unconfigured Talos nodes and want to configure them using talos_machine_configuration_apply.

You read the Talos documentation and understand that endpoint is the "entrypoint" (probably your control plane) and node is the machine you want to perform the operation on.

You might try something like this. Note that talos_machine_configuration_apply.worker.endpoint is set to the public IP of the control plane.

resource "talos_machine_secrets" "this" {}

data "talos_machine_configuration" "controlplane" {
  cluster_name     = "example-cluster"
  machine_type     = "controlplane"
  cluster_endpoint = "https://${hcloud_server.controlplane.ipv4_address}:6443"
  machine_secrets  = talos_machine_secrets.this.machine_secrets
}

data "talos_machine_configuration" "worker" {
  cluster_name     = "example-cluster"
  machine_type     = "worker"
  cluster_endpoint = "https://${hcloud_server.controlplane.ipv4_address}:6443"
  machine_secrets  = talos_machine_secrets.this.machine_secrets
}

data "talos_client_configuration" "this" {
  cluster_name         = "example-cluster"
  client_configuration = talos_machine_secrets.this.client_configuration
  nodes                = ["10.0.0.101", "10.0.0.102"]
  endpoints            = [hcloud_server.controlplane.ipv4_address]
}

resource "talos_machine_configuration_apply" "controlplane" {
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.controlplane.machine_configuration
  node                        = "10.0.0.101"
  endpoint                    = hcloud_server.controlplane.ipv4_address
}

resource "talos_machine_configuration_apply" "worker" {
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.worker.machine_configuration
  node                        = "10.0.0.102"
  # This is the mistake, because it's the public IP of the control plane, not the node
  endpoint                    = hcloud_server.controlplane.ipv4_address
}

resource "talos_machine_bootstrap" "this" {
  endpoint             = hcloud_server.controlplane.ipv4_address
  node                 = "10.0.0.101"
  client_configuration = talos_machine_secrets.this.client_configuration
}

This will successfully configure the control plane, but not the worker. When you think about it, it makes sense (we'll get to this). But before you realize this, you read the documentation, which says:

node (String) The name of the node to bootstrap

You might ask: "what do they mean by 'name'?" You also read:

endpoint (String) The endpoint of the machine to bootstrap

You might ask: "Why 'bootstrap'? Sure, I want to bootstrap, but isn't this the 'apply' resource that can also be used after bootstrapping? Was this copy-pasted?"

You might even end up in a situation (depending on the execution order of your talos_machine_configuration_apply resources) where your control plane is initialized as type "worker", and you end up reading #23. Now you don't know whether to doubt yourself or this provider 🙂

Finally, you realize that the target (worker) is not yet part of the cluster, so the control plane cannot communicate with it. Therefore, when using talos_machine_configuration_apply on an uninitialized node, endpoint needs to be the public IP of your target.
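A minimal sketch of the corrected worker resource from the example above, changing only endpoint. The resource name hcloud_server.worker is an assumption: the original example does not show how the worker server is declared.

```hcl
resource "talos_machine_configuration_apply" "worker" {
  # The worker has not joined the cluster yet, so the control plane cannot
  # forward the request to it. endpoint therefore targets the worker itself.
  # hcloud_server.worker is a hypothetical resource name.
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.worker.machine_configuration
  node                        = "10.0.0.102"
  endpoint                    = hcloud_server.worker.ipv4_address
}
```

This applies the fix described above literally. Whether node can remain a private address depends on the network setup, which is an untested assumption here; setting node to the same public address as endpoint sidesteps the question.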

It would be great if the documentation were clearer about this (and maybe about other pitfalls, like how it interacts with the endpoints and nodes of talos_client_configuration). Or maybe something can be changed in the way talos_machine_configuration_apply works.

micheljung avatar Sep 11 '24 14:09 micheljung

Also, to me, it's confusing that talos_machine_bootstrap also has endpoint and node. In which scenario is node not the same as endpoint?

micheljung avatar Sep 11 '24 14:09 micheljung

Wow, so that's why my changes were not being picked up by the workers. At this point it is disappointing to see how little thought was put into this provider and how barebones it feels.

GeorgeGedox avatar Sep 13 '24 05:09 GeorgeGedox

Yeah, the docs for v0.6 need some more details. Other examples:

machine_configuration_apply:

  • apply_mode - string: What are the options here and what is the default?
  • timeouts - struct: Most of the fields in that struct just say "A string that can be parsed as a duration" instead of what the timeout is for and what the default is.

cluster_kubeconfig

  • timeouts - struct: Same as above

hegerdes avatar Sep 21 '24 20:09 hegerdes

* apply_mode - string: What are the options here and what is the default?

I haven't had a use case for the apply_mode attribute yet, but I would assume it's the same as the --mode flag in talosctl apply-config: https://www.talos.dev/v1.8/reference/cli/#options

TomyLobo avatar Dec 09 '24 21:12 TomyLobo

Reading through this, I am still a little confused. I am using the external address for both. Still, it's the apply on the worker that does not complete for me.

talos_machine_configuration_apply.worker[0]: Still creating...

What am I missing here?

resource "talos_machine_configuration_apply" "controlplane" {
  count                       = length(hcloud_server.control_plane)
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.controlplane.machine_configuration
  node                        = hcloud_server.control_plane[count.index].ipv4_address
  endpoint                    = hcloud_server.control_plane[0].ipv4_address
}

resource "talos_machine_configuration_apply" "worker" {
  count                       = length(hcloud_server.workers)
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.worker.machine_configuration
  node                        = hcloud_server.workers[count.index].ipv4_address
  endpoint                    = hcloud_server.control_plane[0].ipv4_address
}

resource "talos_machine_bootstrap" "this" {
  depends_on = [
    hcloud_server.control_plane
  ]
  client_configuration = talos_machine_secrets.this.client_configuration
  endpoint             = hcloud_server.control_plane[0].ipv4_address
  node                 = hcloud_server.control_plane[0].ipv4_address
}


resource "talos_cluster_kubeconfig" "this" {
  depends_on = [
    talos_machine_bootstrap.this
  ]
  client_configuration = talos_machine_secrets.this.client_configuration
  node                 = hcloud_server.control_plane[0].ipv4_address
}

resource "local_file" "talosconfig" {
  depends_on = [
    talos_machine_bootstrap.this
  ]
  content  = data.talos_client_configuration.this.talos_config
  filename = "${path.module}/../../.configs/${var.cluster_name}/talosconfig"
}

resource "local_file" "kubeconfig" {
  depends_on = [
    talos_cluster_kubeconfig.this
  ]
  content  = talos_cluster_kubeconfig.this.kubeconfig_raw
  filename = "${path.module}/../../.configs/${var.cluster_name}/kubeconfig"
}

tcurdt avatar Feb 06 '25 00:02 tcurdt

@tcurdt It looks like you're making the exact same mistake I did. I updated my example to make it a bit clearer.

resource "talos_machine_configuration_apply" "worker" {
  ...
  endpoint                    = hcloud_server.control_plane[0].ipv4_address
}

should be

resource "talos_machine_configuration_apply" "worker" {
  ...
  endpoint                    = hcloud_server.workers[count.index].ipv4_address
}

Because (the way I understand it): if endpoint is the IP of the control plane, you're sending the command to the control plane. But the control plane can't forward it to your worker, because the worker has not yet joined the cluster, so they can't communicate.

Therefore, you need to command your worker directly, which is what you achieve by setting endpoint to the worker's public IP.
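Put together with the configuration posted above, the complete worker resource would look roughly like this. It is a sketch that reuses only the names from that configuration (hcloud_server.workers, data.talos_machine_configuration.worker) and has not been run against a real cluster.

```hcl
resource "talos_machine_configuration_apply" "worker" {
  # node and endpoint both point at the worker being configured,
  # so the request never has to pass through the control plane.
  count                       = length(hcloud_server.workers)
  client_configuration        = talos_machine_secrets.this.client_configuration
  machine_configuration_input = data.talos_machine_configuration.worker.machine_configuration
  node                        = hcloud_server.workers[count.index].ipv4_address
  endpoint                    = hcloud_server.workers[count.index].ipv4_address
}
```

The controlplane resource in that configuration has the same shape, so if it uses count for more than one control plane node, the same reasoning would suggest pointing its endpoint at hcloud_server.control_plane[count.index] as well.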

I hope this helps.

micheljung avatar Feb 06 '25 12:02 micheljung

@micheljung indeed it did ... thanks for pointing me at it

this is really a little weird

tcurdt avatar Feb 10 '25 23:02 tcurdt