terraform-provider-digitalocean

Feature Request: Don't run remote-exec provisioners on droplets until cloud-init has finished

Open syntaqx opened this issue 6 years ago • 4 comments

When using remote-exec provisioners, I often find that cloud-init has not finished running on the droplet by the time the provisioner's commands execute, so I have to preface every droplet's provisioners with a wait step like the following:

resource "digitalocean_droplet" "leader" {
  name   = "example"
  region = "nyc1"
  image  = "docker-18-04"
  size   = "s-1vcpu-1gb"

  connection {
    host = self.ipv4_address
    user = "root"
  }

  provisioner "remote-exec" {
    inline = ["while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Waiting for cloud-init...'; sleep 1; done"]
  }

  // ...actual remote-execs can now go here...
}
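
As a variation on the loop above, cloud-init itself ships a `cloud-init status --wait` subcommand that blocks until cloud-init has finished, which avoids hand-rolling the polling. A minimal sketch, assuming the image's cloud-init is recent enough to include that subcommand (the resource name `waiter` is only illustrative):

```hcl
resource "digitalocean_droplet" "waiter" {
  name   = "example"
  region = "nyc1"
  image  = "docker-18-04"
  size   = "s-1vcpu-1gb"

  connection {
    type = "ssh"
    host = self.ipv4_address
    user = "root"
  }

  provisioner "remote-exec" {
    # Blocks until cloud-init reports that it is done.
    # Assumption: the image ships a cloud-init that supports `status --wait`.
    inline = ["cloud-init status --wait"]
  }

  // ...actual remote-execs can now go here...
}
```

Unlike the file-polling loop, this exits non-zero if cloud-init reports an error, so a failed cloud-init run should fail the provisioner instead of passing silently.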

Note: This is actually wrapped into a module for ease of use, but the inability to pass provisioners to modules only amplifies the pain I experience from this.

Additionally, chained droplets that require packages to be installed (a simple example being a local NTP server master/worker relationship via user-data) cannot talk to each other without a bunch of wait scripts built into every command executed.
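
For the one-directional case, the ordering can be sketched as follows. Terraform does not treat a resource as created until its creation-time provisioners have succeeded, so a droplet that references the `leader` droplet above is only created after the leader's wait step has passed. The `worker` resource and its cloud-config are illustrative assumptions, not something I have verified against this image:

```hcl
resource "digitalocean_droplet" "worker" {
  name   = "worker"
  region = "nyc1"
  image  = "docker-18-04"
  size   = "s-1vcpu-1gb"

  # Referencing the leader's address creates an implicit dependency, so this
  # droplet is only created once the leader's provisioners (including the
  # cloud-init wait) have completed successfully.
  user_data = <<-EOT
    #cloud-config
    ntp:
      servers:
        - ${digitalocean_droplet.leader.ipv4_address}
  EOT
}
```

This only orders creation; it does nothing for droplets that must wait on each other in both directions, and it still relies on the wait provisioner living on the leader.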

It would be awesome not to need this, and instead be able to either pass an argument to the resource, such as:

resource "digitalocean_droplet" "example" {
  // ...

  wait_for_cloud_init = true

  provisioner "remote-exec" {
    inline = ["echo hello"]
  }
}

Or just have that be the default behavior before a droplet resource is considered "completed".

Terraform Version

$ terraform -v
Terraform v0.12.6

Affected Resource(s)

  • digitalocean_droplet

Expected Behavior

Cloud-init should have completed by the time a remote-exec provisioner executes.

Actual Behavior

The remote-exec provisioner runs as soon as it can connect, which is often before cloud-init has finished.

syntaqx avatar Aug 12 '19 18:08 syntaqx

I'd love to be able to support something like this, but it's not clear whether that is possible from the side of the DigitalOcean provider. The provider code doesn't handle any of the provisioner bits. In pre-0.12, the only thing that the provider side was responsible for was setting the default host for the connection:

https://github.com/terraform-providers/terraform-provider-digitalocean/blob/master/digitalocean/resource_digitalocean_droplet.go#L351

In the post-0.12 world, this needs to be done explicitly:

https://www.terraform.io/upgrade-guides/0-12.html#default-settings-in-connection-blocks

I'll investigate this some more, but I think it likely requires a more general solution. Related issue in Terraform core: https://github.com/hashicorp/terraform/issues/4668

andrewsomething avatar Aug 12 '19 18:08 andrewsomething

My thought is that it would be possible by simply making it a built-in default (i.e., basically moving the wait script into the provider rather than a module or every droplet). I'm not sure whether that's good Terraform practice, but if it's not possible through the API, that would be a workaround. Would love to see what you come up with though.

syntaqx avatar Aug 12 '19 19:08 syntaqx

I've personally run into this a few times and use a very similar "wait" approach.

I think the big issue with handling this at the provider level is that providers have no concept of "connections" to the resources, i.e. SSH in this case. So to build this in we'd have to cross multiple lines.

I wonder how other cloud providers handle lifecycle events around cloud-init.

I think this is a good one for us to raise with the core team next month at HashiConf.

cc @joatmon08

eddiezane avatar Aug 14 '19 01:08 eddiezane

> I'd love to be able to support something like this, but it's not clear if that is possible from the side of the DigitalOcean provider.

The provider could support it if the DO API supported a way for the droplet's initialization code to provide upward feedback through the API. Specifically, the Droplet Metadata API could support a "status update" endpoint (e.g., POST to http://169.254.169.254/status) with a string that is then returned from the Droplet API in the DO API (e.g., as init_status). Then the provider can just wait to run provisioners until after the droplet is running and the init_status is updated to some configurable value.

For security, the init_status should only be writable for the first N minutes after droplet boot, or perhaps be writable only once and "latch" to that final value.
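
To make the droplet side of this proposal concrete, here is a rough sketch of what the user-data could look like. Everything about the endpoint is hypothetical: the `/status` path on the metadata service and the `init_status` field do not exist today, and the value `ready` is just a placeholder.

```hcl
resource "digitalocean_droplet" "example" {
  // ...

  # HYPOTHETICAL: the metadata "status update" endpoint below is only the
  # proposal from this comment; it is not part of the current Metadata API.
  user_data = <<-EOT
    #cloud-config
    runcmd:
      - curl -X POST -d 'ready' http://169.254.169.254/status
  EOT
}
```

The provider would then poll the Droplet API until the (equally hypothetical) `init_status` matches the configured value before letting provisioners run.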

tdyas avatar Mar 16 '20 02:03 tdyas