
Timeout during triton_machine resource creation with v0.5.0 and tags

Open nimajalali opened this issue 7 years ago • 27 comments

Terraform Version

Terraform v0.11.7
+ provider.triton v0.5.0

Affected Resource(s)

  • triton_machine

Terraform Configuration Files

provider "triton" {
  version = "> 0.4.2"

  account      = "njalali"
  key_id       = "68:9f:9a:c4:76:3a:f4:62:77:47:3e:47:d4:34:4a:b7"
  url          = "https://us-east-1.api.joyent.com"
  key_material = "/Users/nimajalali/.ssh/id_rsa"
}

data "triton_image" "test" {
  name    = "ubuntu-certified-16.04"
  version = "20180222"
}

resource "triton_machine" "test" {
  name    = "test"
  package = "k4-highcpu-kvm-1.75G"
  image   = "${data.triton_image.test.id}"

  tags = {
    role = "test"
  }
}

Debug Output

https://gist.github.com/nimajalali/88affb1b005bd5afd30ca83038a4ab4b

Panic Output

N/A

Expected Behavior

The triton_machine resource should be created, and Terraform should exit cleanly.

Actual Behavior

The triton_machine resource is created, but Terraform times out:

* triton_machine.test: timeout while waiting for state to become '12299993088738989455' (last state: '13796914315545672180', timeout: 10m0s)

Steps to Reproduce

  1. terraform apply

Important Factoids

  • The same configuration without triton_machine tags doesn't time out on 0.5.0.
provider "triton" {
  version = "> 0.4.2"

  account      = "njalali"
  key_id       = "68:9f:9a:c4:76:3a:f4:62:77:47:3e:47:d4:34:4a:b7"
  url          = "https://us-east-1.api.joyent.com"
  key_material = "/Users/nimajalali/.ssh/id_rsa"
}

data "triton_image" "test" {
  name    = "ubuntu-certified-16.04"
  version = "20180222"
}

resource "triton_machine" "test" {
  name    = "test"
  package = "k4-highcpu-kvm-1.75G"
  image   = "${data.triton_image.test.id}"
}

  • The same configuration with version <= 0.4.2 doesn't time out.
provider "triton" {
  version = "<= 0.4.2"

  account      = "njalali"
  key_id       = "68:9f:9a:c4:76:3a:f4:62:77:47:3e:47:d4:34:4a:b7"
  url          = "https://us-east-1.api.joyent.com"
  key_material = "/Users/nimajalali/.ssh/id_rsa"
}

data "triton_image" "test" {
  name    = "ubuntu-certified-16.04"
  version = "20180222"
}

resource "triton_machine" "test" {
  name    = "test"
  package = "k4-highcpu-kvm-1.75G"
  image   = "${data.triton_image.test.id}"

  tags = {
    role = "test"
  }
}

References

N/A

nimajalali avatar Apr 23 '18 06:04 nimajalali

Hi @nimajalali, thank you for such a wonderfully detailed issue report!

I am sorry that you are having troubles. We are going to have a look and see what could be causing this.

kwilczynski avatar Apr 23 '18 09:04 kwilczynski

I had a look, and per the debug output @nimajalali kindly provided, I can conclude that we keep trying to tag a running instance, and that does not work for some reason.

There is a single tag called "role" set to "test" which is due to be applied.

The following line in the debug log represents the entry where we wait for the instance to come up as running:

2018-04-22T22:39:52.299-0700 [DEBUG] plugin.terraform-provider-triton_v0.5.0_x4: 2018/04/22 22:39:52 [DEBUG] Waiting for state to become: [running]

Then, we try to apply a tag, which is shown as the following entry:

2018-04-22T22:40:23.441-0700 [DEBUG] plugin.terraform-provider-triton_v0.5.0_x4: 2018/04/22 22:40:23 [DEBUG] Waiting for state to become: [12299993088738989455]

The state shown above as 12299993088738989455 comes from the following:

https://github.com/terraform-providers/terraform-provider-triton/blob/9b674227c55c9bf2624be9a288755b8e874511b9/triton/resource_machine.go#L631-L650

Especially this line:

Target: []string{strconv.FormatUint(expectedTags, 10)},

I need to look into this more, as the logic in the code hasn't changed recently (in fact, it was added quite a while ago); therefore, this is either something to do with the different version of the Triton-Go SDK we use, or the underlying CloudAPI having some issues.
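
For anyone unfamiliar with the pattern, here is a minimal sketch of what the waiter does (the FNV hash and all helper names below are illustrative stand-ins, not the provider's actual code): the expected tags map is reduced to a single number, and the generic "wait for state" helper then compares that number, rendered as a string, against the hash of whatever tags the machine currently reports - which is why the "state" in the error message looks like a huge integer.

package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
	"time"
)

// hashTags is an illustrative stand-in for however the provider derives
// expectedTags: it reduces a tags map to a single uint64 so that the
// generic "wait for state" helper can compare states as opaque strings.
func hashTags(tags map[string]string) uint64 {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for a stable hash
	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k + "=" + tags[k] + ";"))
	}
	return h.Sum64()
}

// waitForTags mimics the waiter: it keeps re-reading the instance's tags
// and succeeds only once their hash matches the expected one.
func waitForTags(expected uint64, readTags func() map[string]string, timeout time.Duration) error {
	target := strconv.FormatUint(expected, 10)
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if strconv.FormatUint(hashTags(readTags()), 10) == target {
			return nil
		}
		time.Sleep(3 * time.Second)
	}
	return fmt.Errorf("timeout while waiting for state to become '%s'", target)
}

func main() {
	want := map[string]string{"role": "test"}
	// If CloudAPI never reports the tag (or reports extra ones), the hashes
	// never match and the waiter runs until the timeout - which is exactly
	// the 10-minute failure shown in this issue.
	err := waitForTags(hashTags(want), func() map[string]string {
		return map[string]string{"role": "test"}
	}, 30*time.Second)
	fmt.Println(err) // <nil> when the tags converge
}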

kwilczynski avatar Apr 23 '18 10:04 kwilczynski

Hi @nimajalali,

I am having a hard time reproducing this issue. I took your sample template, filled in my account details, and it does seem to succeed. Therefore, can I ask you to do a few things:

  • Remove the tags block from the template, and try again (same data centre).
  • Without removing the tags block from the template, try a different data centre (e.g., us-sw-1).

If you still get the issue, then we are going to devise a different troubleshooting approach - perhaps a bespoke provider binary with more debug options added.

kwilczynski avatar Apr 23 '18 10:04 kwilczynski

@kwilczynski Answered questions below:

  • Removing the tags block from the template does succeed in us-east-1, as mentioned under the Important Factoids section in the original issue.
  • Without removing the tags block from the template, a different data center us-sw-1 does NOT work. Timeout still occurs.

I checked on another system and I'm able to reproduce this issue consistently using v0.5.0 with the tags block present.

I'll create a custom provider binary to add more logging and see what we can find.

nimajalali avatar Apr 23 '18 18:04 nimajalali

Hi @nimajalali, thank you for the update!

Together with @stack72, we were able to pinpoint the problem to a change that went into 0.5.x (see: https://github.com/terraform-providers/terraform-provider-triton/pull/97) and altered a piece of internal logic to accommodate new functionality.

We are going to work on a fix and issue a Pull Request very soon. Sorry for any trouble!

kwilczynski avatar Apr 23 '18 19:04 kwilczynski

Hi @nimajalali

Please could you help us out by recording a short recreation of the issue you are seeing? I suggest a tool like https://asciinema.org/

I am currently trying to recreate this issue and cannot actually see what the problem is.

Even when I go back to a build older than 0.4.2, no code changes have been made in the tagging mechanism, so anything you can do to help us see the issue being recreated would be amazing here.

Thanks

Paul

stack72 avatar Apr 26 '18 13:04 stack72

@nimajalali as an addendum, could you try running the same Terraform configuration (including the tags block) in us-east-1, as we know that works without the tags?

Paul

stack72 avatar Apr 26 '18 13:04 stack72

Sorry to jump into the middle of this issue - we are experiencing the same problem on our private Triton with the 0.5.1 provider.

Example of our terraform module:

data "triton_image" "image" {
  name        = "${var.image_name}"
  version     = "${var.image_version}"
  most_recent = true
}

resource "triton_machine" "nodejs-service" {
  count             = "${var.servers}"
  package           = "${var.package}"
  image             = "${data.triton_image.image.id}"
  firewall_enabled  = "${var.firewall}"
  user_data         = "edited out"
  user_script       = "${file("${var.user_script}")}"
  networks          = ["${var.nic0}","${var.nic1}"]
  affinity          = ["firewall_tag!=~${var.env}-${var.app}-${data.triton_image.image.version}"]

  tags {
    firewall_tag = "${var.env}-${var.app}-${data.triton_image.image.version}"
  }

  cns {
    services = ["${var.env}-${var.app}-${data.triton_image.image.version}", "${var.env}-${var.app}${count.index}"]
  }

  lifecycle {
     create_before_destroy = true
  }
}

resource "triton_firewall_rule" "clients-to-nodejs-service" {
  rule    = "FROM any TO tag \"firewall_tag\" = \"${var.env}-${var.app}-${data.triton_image.image.version}\" ALLOW tcp (PORT 80 and PORT 8080 and PORT 8081 and PORT 443)"
  enabled = true
}

pannon avatar Apr 30 '18 05:04 pannon

Hi @pannon, I am sorry you are also affected!

Are you also seeing the same issue without tags added?

Since you have an on-premises Triton deployment, the evidence would somewhat point towards an issue with the provider code, albeit we can't seem to reproduce this issue reliably (in other words, there is a problem of some sort, but we are not yet sure what might be causing it, as it seems to be tricky to reproduce reliably in our case), which is why I am asking about the with- and without-tags cases.

Would you be willing to try a custom-built provider binary which has more (debug) logging added?

kwilczynski avatar Apr 30 '18 10:04 kwilczynski

Hi @kwilczynski, yes indeed, after removing the tags it worked just fine with no timeouts.

I could try a binary, but it has to be for OpenBSD, as the jumphosts we run Terraform from are OpenBSD.

pannon avatar Apr 30 '18 14:04 pannon

Hi @pannon,

I have built a custom binary for you (using the master branch), which should be almost the same as the currently released one (0.5.1), as there haven't been any significant changes since the release.

The ~binary~ (see message below) prints detailed information about requests and responses sent to either the Joyent Public Cloud or any other private CloudAPI end-point. It might be a bit excessive, so we need to make sure that no sensitive information appears in the output (in other words, you might need to glance over it and sanitise as needed, or you could encrypt the results with my public PGP/GPG key and send them to me for analysis - everything within your company's policies and frameworks, of course).

I am going to add @nimajalali here, as he wished to have a custom binary too.

Let me know how I can help otherwise, and I sincerely apologise for any issues you've been having!

kwilczynski avatar Apr 30 '18 16:04 kwilczynski

No worries, the 0.4.2 provider version is working for us.

I get the following error with the new binary:

Error: Error asking for user input: 1 error(s) occurred:

* provider.triton: fork/exec terraform/affinity-test/.terraform/plugins/openbsd_amd64/terraform-provider-triton: exec format error

pannon avatar Apr 30 '18 17:04 pannon

Hi @pannon,

Ah! Doh! That's my bad - I picked the wrong binary by accident (got used to Mac OS/Darwin too much).

Here are the binaries (all are x86_64):

Hope it works this time!

kwilczynski avatar Apr 30 '18 17:04 kwilczynski

Hi @pannon,

I forgot to mention that, for the best results, you need to run Terraform with: $ TF_LOG=debug terraform apply -auto-approve

kwilczynski avatar Apr 30 '18 17:04 kwilczynski

Hi @kwilczynski, I have the log, although it is heavily edited. I would like to send it through some private channel.

pannon avatar May 01 '18 10:05 pannon

Hi @pannon,

Feel free to send it to [email protected] and encrypt using my PGP/GPG (see: 0x3DE334E7).

I will have a look and hopefully we can see what is brewing there!

Thank you a bunch for your help with this!

kwilczynski avatar May 01 '18 10:05 kwilczynski

To anyone following this issue (but primarily @nimajalali and @pannon) :-

I have worked together with @pannon, who was kind enough to lend us a hand (we worked outside of the public view so that no potentially confidential data was exposed during troubleshooting, etc.), so that we could make a proper diagnosis and build a small test case allowing us to reproduce the issue.

And we were successful! The following is the minimal test case which allows us to reproduce the problem reliably:

(using Joyent Public Cloud - JPC):

data "triton_image" "test" {
  name        = "ubuntu-certified-16.04"
  most_recent = true
}

resource "triton_machine" "test" {
  package = "k4-highcpu-kvm-1.75G"
  image   = "${data.triton_image.test.id}"

  cns {
    services = [
      "1.2.3"
    ]
  }
}

I will explain the nature of the problem in the next message.

kwilczynski avatar May 07 '18 13:05 kwilczynski

At the moment, the triton_machine resource includes functionality that aims to mitigate the delayed/eventually-consistent nature of the CNS service when it returns the newly computed list of all the supported domain names (for use with DNS).

We calculate the difference between the old/new and seen domain names, and then determine whether the node has "converged" (meaning CNS has finished setting up all the domain names, and there were no new arrivals or removals from the list); if that is the case, we declare success and complete the creation of the compute instance.

The methods which do that are called hasValidDomainNames and hasInitDomainNames.

The issue here manifests itself when the CNS service name includes one or more dots (.) (e.g., example-1.2.3 or simply a.b.c). In the following block of code we made the assumption that splitting a CNS-generated domain name on a single dot would be enough to derive the prefix of such a domain name (as CNS domain names follow a specific pattern), so that the prefix can be used later for tracking/comparison when we wait for CNS to finish adding entries.

(an excerpt showing where the split takes place):

https://github.com/terraform-providers/terraform-provider-triton/blob/d85074cf3f6d170df9060ef082cf3d3a31d680c7/triton/resource_machine.go#L999-L1005

Given the following CNS domain name: a.b.c.svc.c7f788cc-6784-652a-ec98-ac2fab347081.region.example.com; the current implementation would attempt to compare a against the list of known/seen domain names, and in turn it would never return a value of true marking a successful match. The correct prefix here would be a.b.c.
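
To make the failure concrete, here is a small sketch of the faulty assumption; the suffix-stripping variant shown as a correction is just one possible approach, not necessarily the exact fix that went into Pull Request #112:

package main

import (
	"fmt"
	"strings"
)

func main() {
	domain := "a.b.c.svc.c7f788cc-6784-652a-ec98-ac2fab347081.region.example.com"

	// Faulty assumption: everything before the first dot is the service name.
	buggy := strings.SplitN(domain, ".", 2)[0]
	fmt.Println(buggy) // "a" - never matches the configured service "a.b.c"

	// One possible correction: strip the known machine-generated suffix
	// instead of splitting on the first dot, keeping the whole prefix.
	suffix := ".svc.c7f788cc-6784-652a-ec98-ac2fab347081.region.example.com"
	fixed := strings.TrimSuffix(domain, suffix)
	fmt.Println(fixed) // "a.b.c" - matches the configured service name
}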

We have since created a fix, and it will land once the work on Pull Request #112 is completed.

kwilczynski avatar May 07 '18 14:05 kwilczynski

Update:

After talking with the team internally about how to correctly implement this feature and how to devise a fix for it, it has become more and more apparent that we should actually not allow dots in the service name at all.

This is because a service name maps directly to the CNS domain name that is going to be created from it, and as a result we are also limited by what the DNS standard allows (especially when it comes to how labels/segments in a fully-qualified domain name are handled).

The restrictions we need to take into consideration going forward (a validation sketch follows the list):

  • There is no IDN support
  • Only lower-case letters, a to z
  • Digits from 0 to 9 are allowed
  • A dash - (hyphen) is allowed
  • A service name cannot start with a dash (hyphen)
  • A service name cannot be longer than 63 characters
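
Here is the validation sketch mentioned above; validateCNSServiceName is a hypothetical helper name that simply encodes the listed rules:

package main

import (
	"fmt"
	"regexp"
)

// validName encodes the restrictions listed above: lower-case letters,
// digits, and hyphens only, and the name cannot start with a hyphen.
var validName = regexp.MustCompile(`^[a-z0-9][a-z0-9-]*$`)

// validateCNSServiceName is a hypothetical helper; it rejects any service
// name that could not become a legal DNS label.
func validateCNSServiceName(name string) error {
	if len(name) == 0 || len(name) > 63 {
		return fmt.Errorf("service name %q must be 1-63 characters long", name)
	}
	if !validName.MatchString(name) {
		return fmt.Errorf("service name %q may only contain a-z, 0-9 and '-', and cannot start with '-'", name)
	}
	return nil
}

func main() {
	for _, n := range []string{"frontend", "a.b.c", "-bad", "ok-123"} {
		fmt.Printf("%-8s -> %v\n", n, validateCNSServiceName(n))
	}
}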

Related documentation:

kwilczynski avatar May 10 '18 17:05 kwilczynski

Apologies if I'm dragging in a separate issue, but I'm experiencing the triton_machine create timeout behaviour the original issue describes, and I thought I'd add a commentary on my investigations.

We're running a private Triton environment and I'm creating a simple triton_machine resource along these lines:

resource "triton_machine" "machine_name" {
  name = "${var.name}_value"
  package = "package_name"
  image = "${data.triton_image.image.id}"

  user_script = "${data.local_file.script.content}"

  affinity = ["role!=~the-role"]

  networks = [
    "${data.triton_network.net_a.id}",
    "${data.triton_network.net_b.id}"
  ]

  tags {
    role = "the-role"
    some-tag = "true"
  }

  metadata {
  }
}

Using v0.5.0 of the provider onwards, the VM would successfully create and transition to the running state (after ~40 seconds), but the Terraform run would sit there and eventually time out. Looking at a TRACE of the Terraform run, I saw this happen (I had added an extra line of output to trace the state returned from CloudAPI):

2019-05-30T10:01:35.447+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:35 [DEBUG] Waiting for state to become: [running]
2019-05-30T10:01:35.521+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:35 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:01:35.521+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:35 [TRACE] Waiting 3s before next try
2019-05-30T10:01:38.596+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:38 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:01:38.596+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:38 [TRACE] Waiting 6s before next try
module.mod_name.triton_machine.machine_name: Still creating... (10s elapsed)
2019-05-30T10:01:44.669+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:44 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:01:44.669+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:44 [TRACE] Waiting 10s before next try
module.mod_name.triton_machine.machine_name: Still creating... (20s elapsed)
2019-05-30T10:01:54.752+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:54 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:01:54.752+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:01:54 [TRACE] Waiting 10s before next try
module.mod_name.triton_machine.machine_name: Still creating... (30s elapsed)
2019-05-30T10:02:04.850+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:04 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:02:04.850+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:04 [TRACE] Waiting 10s before next try
module.mod_name.triton_machine.machine_name: Still creating... (40s elapsed)
2019-05-30T10:02:14.923+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:14 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = provisioning
2019-05-30T10:02:14.923+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:14 [TRACE] Waiting 10s before next try
module.mod_name.triton_machine.machine_name: Still creating... (50s elapsed)
2019-05-30T10:02:25.005+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:25 refresh - id = 68f3dcfe-d044-e306-a333-e293e083b856 state = running
2019-05-30T10:02:25.827+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:25 [DEBUG] Waiting for state to become: [12533489120426878286]
2019-05-30T10:02:26.050+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:26 [TRACE] Waiting 3s before next try
2019-05-30T10:02:29.148+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:29 [TRACE] Waiting 6s before next try
module.mod_name.triton_machine.machine_name: Still creating... (1m0s elapsed)
2019-05-30T10:02:35.314+0100 [DEBUG] plugin.terraform-provider-triton: 2019/05/30 10:02:35 [TRACE] Waiting 10s before next try

Which is interesting - the state-change loop sees the state transition to running and then appears to enter a new state-change loop, waiting for the state to become 12533489120426878286. I'm not sure where this comes from, but at this point it is stuck.

A quick git-bisect indicated that the issue was in commit 31281527a5b74234318cd2bfb03443cea12bf8a3 which is pretty big - but a glaring change in there was:

@@ -415,8 +421,7 @@ func resourceMachineCreate(d *schema.ResourceData, meta interface{}) error {
                return err
        }
 
-       // refresh state after it provisions
-       return resourceMachineRead(d, meta)
+       return resourceMachineUpdate(d, meta)
 }

which seems to try to perform an update operation after a create operation. When I revert this change, the issue goes away and the create returns successfully once the machine enters the running state.

The commit was adding support for the deletion_protection_enabled flag - is this something that can only be set after the create with an update operation?

I was curious as to which state-change in resourceMachineUpdate was getting stuck, so I instrumented them all - it was indeed getting stuck on the CNS check https://github.com/terraform-providers/terraform-provider-triton/blob/fbb6649aca726387ee61c3dc00509442ab90b585/triton/resource_machine.go#L627-L651

hasValidDomainNames is returning false - I've narrowed it down to this case:

https://github.com/terraform-providers/terraform-provider-triton/blob/d85074cf3f6d170df9060ef082cf3d3a31d680c7/triton/resource_machine.go#L987

For the time being, I'm working around this by actively disabling cns in the machine specification, adding the following:

  ...
  cns {
    disable = true
  }
  ...

I hope that helps!

joshado avatar May 30 '19 11:05 joshado

Hi @joshado! Thank you for such a wonderfully detailed and insightful update. I can really tell you applied yourself to tracking the issue down - and you, of course, found the root cause too.

The problem is known, and the workaround/solution you are suggesting is one way to solve it; generally, not using dots in the service name would do it too.

Now, I sadly can't promise you any fixes or improvements at this point, as I believe most of the people who worked on this project are no longer at Joyent, and therefore this project is no longer actively maintained. Nonetheless, I am sure that if someone sent a Pull Request with a fix, it might be accepted (I can't guarantee if and when a new release of the provider would happen, though).

kwilczynski avatar Jun 01 '19 02:06 kwilczynski

This issue can also happen if CNS was not properly updated with metadata at setup time, as detailed here: https://docs.joyent.com/private-cloud/install/cns#tasks

Add necessary metadata for proper operation of CNS; note that this will trigger a brief outage in CloudAPI, so plan accordingly

headnode# sdcadm experimental update-other

If that step was missed, even normal tags will cause a failure. This might have been @joshado's case.

sreboot avatar Oct 01 '19 15:10 sreboot

I've taken a look over this issue, but I'm not finding any problems when deploying these examples in Joyent's "us-west-1" or in COAL (laptop). I do have CNS set up and running.

I'm running:

$ terraform version
Terraform v0.12.10
+ provider.triton v0.6.0

My example:

data "triton_image" "lx_ubuntu" {
  name        = "ubuntu-16.04"
  most_recent = true
}

resource "triton_machine" "test" {
  name    = "test"
  package = "${local.package}"
  image   = "${data.triton_image.lx_ubuntu.id}"

  #tags = {
  #  role = "test"
  #}

  cns {
    services = [
      "1.2.3"
    ]
  }
}

I am using the latest Terraform (0.12), and the apply output is:

terraform apply
data.triton_image.lx_ubuntu: Refreshing state...
triton_machine.test: Refreshing state... [id=31a59503-5f46-c1a6-c3ad-d9d0ff94c4ae]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # triton_machine.test will be created
  + resource "triton_machine" "test" {
      + compute_node                = (known after apply)
      + created                     = (known after apply)
      + dataset                     = (known after apply)
      + deletion_protection_enabled = false
      + disk                        = (known after apply)
      + domain_names                = (known after apply)
      + firewall_enabled            = false
      + id                          = (known after apply)
      + image                       = "7b5981c4-1889-11e7-b4c5-3f3bdfc9b88b"
      + ips                         = (known after apply)
      + memory                      = (known after apply)
      + name                        = "test"
      + package                     = "sample-1G"
      + primaryip                   = (known after apply)
      + root_authorized_keys        = (known after apply)
      + type                        = (known after apply)
      + updated                     = (known after apply)

      + cns {
          + services = [
              + "1.2.3",
            ]
        }

      + nic {
          + gateway = (known after apply)
          + ip      = (known after apply)
          + mac     = (known after apply)
          + netmask = (known after apply)
          + network = (known after apply)
          + primary = (known after apply)
          + state   = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

triton_machine.test: Creating...
triton_machine.test: Still creating... [10s elapsed]
triton_machine.test: Still creating... [20s elapsed]
triton_machine.test: Creation complete after 25s [id=7d0c5f88-5dbc-e9eb-c5eb-e560d923f922]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Can those that are hitting this issue confirm if they have CNS setup in their Triton installation?

twhiteman avatar Dec 03 '19 23:12 twhiteman

Ah, I actually got to the bottom of this one and I, personally, think there is still a bug. We have a working CNS setup, but the user I was using had CNS disabled on their account for some reason. In this case, the provider will wedge up as described. Enabling CNS on the user account resolved it. That said, I believe the provider should be tolerant of this situation, especially if no CNS configuration was actually specified.
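
As a rough sketch of what that tolerance could look like (both inputs are hypothetical - a real implementation would derive them from the resource configuration and from the account's settings in CloudAPI):

package main

import "fmt"

// shouldWaitForCNSDomainNames sketches the suggested tolerance: skip the
// CNS convergence wait entirely whenever it cannot possibly succeed.
func shouldWaitForCNSDomainNames(cnsConfigured, accountCNSEnabled bool) bool {
	// No cns block in the configuration: there is nothing to converge on.
	if !cnsConfigured {
		return false
	}
	// CNS disabled on the account: domain names will never appear, so
	// waiting would only run into the 10-minute timeout described above.
	return accountCNSEnabled
}

func main() {
	fmt.Println(shouldWaitForCNSDomainNames(true, false)) // false - skip the wait
}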

joshado avatar Dec 06 '19 10:12 joshado

Hi @twhiteman and @joshado, thank you both for taking the time to troubleshoot this!

@joshado the issue you are seeing is different to the one originally reported by @nimajalali, correct? I just want to confirm, as we might want to move the conversation over to a dedicated issue to track this - especially since there appears to be a bigger problem at hand.

I also wanted to reiterate that, as much as I am keen to help someone create a fix, I alone have neither the time nor the resources to deploy Triton to try to debug and fix this problem, as I said before. That is, unless someone is willing to provide access to their deployment, etc.

kwilczynski avatar Dec 06 '19 10:12 kwilczynski

@joshado thanks very much for the extra info - I think I can now reproduce this with the following:

Disabling the CNS feature on the account:

$ triton account update triton_cns_enabled=false

Then trying to enable CNS in the terraform config (with disable = false):

data "triton_image" "lx_ubuntu" {
  name        = "ubuntu-16.04"
  most_recent = true
}

resource "triton_machine" "test" {
  name    = "test"
  package = "${local.package}"
  image   = "${data.triton_image.lx_ubuntu.id}"

  cns {
    disable = false
    services = [
      "1.2.3"
    ]
  }
}

Then I do see that the terraform apply fails - times out after 10 minutes.

This certainly should not occur and I'll need to dig into why this is happening.

twhiteman avatar Dec 10 '19 18:12 twhiteman

This is still an issue. The provisioning times out after 10 minutes if the CNS service has a port number specified, e.g. foo:1234 for SRV record creation.

The actual instance provisioning succeeds and the SRV record is created - it just never reports back as successful/failed.

sreboot avatar Aug 17 '21 07:08 sreboot