terraform-provider-tailscale

FR: add ability to delete devices

Open rsyring opened this issue 3 years ago • 15 comments

Is your feature request related to a problem? Please describe.

When destroying resources with Terraform, I'd like the ability to remove the device from Tailscale. For example, when an AWS instance is deleted, I want its associated Tailscale device to be deleted as well.

Describe the solution you'd like

Keep track of devices in Terraform state and, when the device is removed, use the DELETE device API to remove the device from the tailnet.

rsyring avatar Feb 17 '22 04:02 rsyring

Hey @rsyring, so deleting a device is a little bit complicated when it comes to the terraform provider, as devices cannot be created using it.

A terraform provider typically describes the lifecycle of a resource, but a device cannot be a resource because it can't be created purely via API calls.

One suggestion might be that when the device_authorization resource is deleted, the device is deleted also. How would you feel about this solution?

davidsbond avatar Feb 21 '22 09:02 davidsbond

I managed to get a POC for this working on my branch, but I must say I don't love the solution either. As @davidsbond said, devices cannot be created using the provider, so the provider has to track changes that happen elsewhere.

I made the provider's create step tolerate a missing device, effectively giving the resource an "unavailable" marker. Whenever Terraform runs and finds that the device now exists, it updates its state with the correct ID. When the resource is deleted, the device is cleaned up using the device delete method.

provider "tailscale" {
  #  api_key = "my-api-key"
  #  tailnet = "my-tailnet"
}

resource "tailscale_device" "example" {
  name = "example.my-tailnet"
}

If you want, I can submit a PR for it.

pellegrino avatar Feb 22 '22 10:02 pellegrino

@pellegrino personally, I don't think this is a good way to go. If we can't actually create the device via API calls, we shouldn't provide a resource for it.

I'd be willing to consider deletion occurring from deleting a device_authorization but I don't think we should add faux resources that do nothing.

For now, my recommendation would be to use an ephemeral key, which will remove the device once the workload dies or restarts.
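For reference, a minimal sketch of the ephemeral-key approach, assuming a provider version whose tailscale_tailnet_key resource supports the ephemeral argument. The resource label and the expiry value are illustrative only.

```hcl
# Sketch only; not tested against a specific provider version.
resource "tailscale_tailnet_key" "ephemeral" {
  reusable      = true
  ephemeral     = true # devices that join with this key are removed by Tailscale after they go offline
  preauthorized = true
  expiry        = 3600 # key lifetime in seconds (illustrative value)
}
```

The generated key should then be available as tailscale_tailnet_key.ephemeral.key and can be passed to `tailscale up --authkey=...` on the host.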

davidsbond avatar Feb 22 '22 12:02 davidsbond

Just making the implicit explicit: we have a chicken and egg problem here between Tailscale's operational model and Terraform's. Since the models are currently incompatible, some kind of workaround is going to be needed.

FWIW, I like the idea of a provider even though it can't actually create the record. Could take the "wait_for" logic from #72 and apply it here as well. Put BIG WARNINGS in the docs that the device isn't actually created through the API and point to an example of how to get Tailscale installed and running on a new host at creation time. Could also add the warning and explanation to the error message, and a link to the docs, if the provider times out waiting for the Tailscale device to come online.
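As a rough illustration of the "wait_for" behaviour, here is how I understand it to work on the existing tailscale_device data source, combined with device_authorization. The device name is a placeholder, and the 60s timeout is an arbitrary choice.

```hcl
# Look up a device that was registered outside Terraform, retrying for up to
# 60 seconds until it shows up in the tailnet.
data "tailscale_device" "example" {
  name     = "example.my-tailnet" # placeholder device name
  wait_for = "60s"
}

# Authorize the device once it has been found.
resource "tailscale_device_authorization" "example" {
  device_id  = data.tailscale_device.example.id
  authorized = true
}
```

If the device never comes online within the wait_for window, the lookup fails, which is where a clearer error message and docs link would help.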

While I agree that this violates the spirit of what a provider is in this case, the implementation feels closer to what it "should be", and if Tailscale ever gives the ability to actually create the resource through the API, only the provider implementation would change. The Terraform scripts would continue to work as-is, with potentially a deprecation notice in case they set the wait_for argument to a non-default value.

Having said all that, I wouldn't mind deleting through device_authorization. The delete could be optional, so that de-authorizing without deleting is still possible.

rsyring avatar Mar 02 '22 22:03 rsyring

I have a similar use case to @rsyring's. Deleting the device through device_authorization would also work for me. We would execute the de-auth step before re-deploying the device with the same name.

Using an ephemeral key does not really fit, since we install VMs (also Terraform-managed) that tend to reboot sometimes.

defo89 avatar Oct 10 '22 13:10 defo89

This workaround seems OK: we delete any existing devices in a remote-exec provisioner while creating a VM (OpenStack + Ubuntu 22.04 in my case), with appropriately defined vars for the keys.

provisioner "remote-exec" {
    inline = [
      "curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/jammy.noarmor.gpg | sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg >/dev/null",
      "curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/jammy.tailscale-keyring.list | sudo tee /etc/apt/sources.list.d/tailscale.list",
      "sudo apt-get update",
      "sudo apt-get -y install tailscale jq",
      # We want to make sure any Tailscale devices with this name have been deleted.
      <<EOT
      curl 'https://api.tailscale.com/api/v2/tailnet/[email protected]/devices' -u "${var.tailscale_api_key}:" |  \
         jq -r '.devices[] | select(.hostname == "${self.name}") | .nodeId' |  while read -r nodeid
         do
           #echo curl -X DELETE "https://api.tailscale.com/api/v2/device/$nodeid" -u "${var.tailscale_api_key}:"
           curl -X DELETE "https://api.tailscale.com/api/v2/device/$nodeid" -u "${var.tailscale_api_key}:" -v
         done
       EOT
      ,
      "sudo tailscale up --authkey=${var.tailscale_key}  --ssh"
    ]
  }
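
A destroy-time variant of the same idea might look like the sketch below. This is an assumption on my part, not something tested in this thread: it uses a null_resource from the hashicorp/null provider, a hypothetical tailscale_device_id variable, and expects the API key in a TAILSCALE_API_KEY environment variable on the machine running Terraform.

```hcl
# Hypothetical input: the Tailscale device ID to clean up on destroy.
variable "tailscale_device_id" {
  type = string
}

resource "null_resource" "tailscale_device_cleanup" {
  # Destroy-time provisioners may only reference self, so the device ID
  # is stored in triggers.
  triggers = {
    device_id = var.tailscale_device_id
  }

  provisioner "local-exec" {
    when = destroy
    # $TAILSCALE_API_KEY is expanded by the local shell, not by Terraform.
    command = "curl -fsS -X DELETE \"https://api.tailscale.com/api/v2/device/${self.triggers.device_id}\" -u \"$TAILSCALE_API_KEY:\""
  }
}
```

Reading the key from the environment keeps it out of the triggers map, and therefore out of the state file.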

eloop avatar Nov 15 '22 08:11 eloop

A behaviour that maps onto existing Terraform workflows and resource lifecycles would be to delete all machines that used a key created through tailscale_tailnet_key when that Terraform resource is deleted.

ghost avatar Sep 19 '23 12:09 ghost

I've been trying to think of a way to handle this and I just can't think of a clean solution as it stands right now. I really think it needs an architectural change from Tailscale. As @mlangenberg mentioned in #232, Tailscale needs the ability to create a one-time-use key that is associated with a nodeId.

Once that's possible, you can have a resource "tailscale_device" that creates a one-time-use key to be provided to something like cloud-init, with an associated nodeId that will be assigned to the node that uses the key. The key would be saved as part of the state, and from Terraform's perspective it would be unchanging, so cloud-init configs won't be rebuilt, but it wouldn't be as sensitive because once it's used it is no longer active. The nodeId can then also be used elsewhere, such as in device authorization. And since the tailscale_device would be a dependency of, say, a VM, when that VM is deleted its dependencies will be deleted too, including the tailscale_device.

evilhamsterman avatar Nov 29 '23 18:11 evilhamsterman