
Disk provisioning settings are not correctly applied based on provided arguments

Open melck opened this issue 5 years ago • 18 comments

Hello,

We create virtual machines with thick provisioning, lazy-zeroed (LZT), on an NFS datastore (NetApp). At the end of creation, vSphere, via the datastore plugin (NetApp), automatically changes eagerly_scrub from false to true.

On the next plan of the project, we get an error because of that. We have tried to ignore the resulting changes.

Terraform Version

Terraform v0.12.24
+ provider.template v2.1.2
+ provider.vsphere v1.17.1

Affected Resource(s)

  • vsphere_virtual_machine

Expected Behavior

Ignore changes to the eagerly_scrub parameter, or provide a way to ignore this parameter on all disks.

Actual Behavior

On the next plan, we get an error on a disk that changed from LZT to EZT.

Important Factoids

We are forced to use thick provisioning ...

The lifecycle ignore_changes variants we tried:

lifecycle {
    ignore_changes = [
      disk["eagerly_scrub"],
    ]
}

We also tried (seems not implemented):

lifecycle {
    ignore_changes = [
      disk["all.eagerly_scrub"],
    ]
}
lifecycle {
    ignore_changes = [
      disk["*.eagerly_scrub"],
    ]
}

With an index:

lifecycle {
    ignore_changes = [
      disk["0.eagerly_scrub"],
      disk["1.eagerly_scrub"],
    ]
}

With a label:

lifecycle {
    ignore_changes = [
      disk["disk0.eagerly_scrub"],
      disk["disk1.eagerly_scrub"],
    ]
}

The only one that works:

lifecycle {
    ignore_changes = [
      disk,
    ]
}
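
For reference, newer Terraform releases write ignore_changes entries as unquoted traversals rather than quoted strings; whether indexing into the disk block is accepted may depend on the Terraform version, so this form is an assumption, untested here:

lifecycle {
    ignore_changes = [
      disk[0].eagerly_scrub,
    ]
}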

melck avatar Apr 09 '20 11:04 melck

@melck are you getting an error like:

ERROR Error: Provider produced inconsistent final plan 
ERROR                                              
ERROR When expanding the plan for module.master.vsphere_virtual_machine.vm[1] to 
ERROR include new values learned so far during apply, provider  
ERROR "registry.terraform.io/-/vsphere" produced an invalid new value for 
ERROR .disk[0].eagerly_scrub: was cty.False, but now cty.True.  

jcpowermac avatar Apr 21 '20 20:04 jcpowermac

@jcpowermac I no longer have the error message from the plan, but I kept the message from the apply phase:

Error: error reconfiguring virtual machine: error processing disk changes post-clone: disk.0: cannot change the value of "eagerly_scrub" - (old: true new: false)

It seems similar and logical.

melck avatar Apr 22 '20 08:04 melck

@jcpowermac I forgot to mention that we use a dynamic block on disk, with a map containing the VM's disk parameters, roughly as in the sketch below.
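
A minimal sketch of that pattern (the disks variable name and its shape are illustrative):

variable "disks" {
  type = map(object({
    size             = number
    eagerly_scrub    = bool
    thin_provisioned = bool
  }))
}

# inside the vsphere_virtual_machine resource
dynamic "disk" {
  for_each = var.disks
  content {
    label            = disk.key
    size             = disk.value.size
    eagerly_scrub    = disk.value.eagerly_scrub
    thin_provisioned = disk.value.thin_provisioned
    # unit_number would also be needed when there is more than one disk
  }
}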

melck avatar Apr 22 '20 08:04 melck

@jcpowermac I found the exact message from the plan:

Error: disk.0: virtual disk "disk0": cannot change the value of "eagerly_scrub" - (old: true new: false)

melck avatar Apr 22 '20 09:04 melck

Any news?

melck avatar May 05 '20 13:05 melck

@melck we are debugging. Our issue looks to be related to the planning stage: the variables for eager and thin are set to false, but when apply runs, thin is true.

jcpowermac avatar May 05 '20 13:05 jcpowermac

@melck We are experiencing similar issues, but with different storage providers. In our environment we are using vSAN, with a storage policy that is set to thin. It seems that if the disk type changes underneath the provider, the provider does not tolerate that change.

terraform plan -out=tfplan
terraform show -json tfplan > plan.json
                        "disk": [
                            {
                                "attach": false,
                                "datastore_id": "<computed>",
                                "disk_mode": "persistent",
                                "disk_sharing": "sharingNone",
                                "eagerly_scrub": false,
                                "io_limit": -1,
                                "io_reservation": 0,
                                "io_share_count": 0,
                                "io_share_level": "normal",
                                "keep_on_remove": false,
                                "key": 0,
                                "label": "disk0",
                                "name": null,
                                "size": 16,
                                "storage_policy_id": null,
                                "thin_provisioned": false,
                                "unit_number": 0,
                                "write_through": false
                            }
                        ],
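
One way to make the policy explicit in the configuration, instead of letting vSAN apply it underneath the provider, might be the vsphere_storage_policy data source (the policy name here is illustrative):

data "vsphere_storage_policy" "thin" {
  name = "vSAN Default Storage Policy"
}

# then, on each disk block:
disk {
  label             = "disk0"
  size              = 16
  storage_policy_id = data.vsphere_storage_policy.thin.id
}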

jcpowermac avatar May 05 '20 16:05 jcpowermac

To add to what @jcpowermac reported: Terraform is producing a plan inconsistent with the values found during apply. The plan shows the VM will be thick-provisioned, but the vSAN storage policy enforces thin provisioning.

These docs provide background over the general problem: https://www.terraform.io/docs/extend/terraform-0.12-compatibility.html#inaccurate-plans

The document points out that the solution is: "If you see either of these errors, the remedy is the same: implement CustomizeDiff for the resource type that is causing the problem, and write logic to more accurately predict the outcome of any changes to Computed attributes."

Interestingly, the vSphere provider recently merged a pull request which implements OVA import functionality. In that pull request they skip CustomizeDiff in the case where they import the OVA: https://github.com/terraform-providers/terraform-provider-vsphere/blob/master/vsphere/resource_vsphere_virtual_machine.go#L904-L907

If we skip this CustomizeDiff, our VMs are created correctly.

patrickdillon avatar May 08 '20 15:05 patrickdillon

@aareet @bill-rich When we comment out DiskDiffOperation, our cloning succeeds without issue.

I am not quite sure of the backstory on why we would need to create a new subresource map and then normalize that data, but it seems to be causing the problems above. Do you have a suggestion for what you would like to see as a resolution?

jcpowermac avatar May 08 '20 18:05 jcpowermac

I'm running into what I think may be a related issue. I'm trying to clone a template and copy the disk information over, similar to the examples on the website:

.tf file:

data "vsphere_virtual_machine" "server_template" {
    name            = var.server_template
    datacenter_id   = data.vsphere_datacenter.dc.id
}

resource "vsphere_virtual_machine" "domaincontroller_1" {
    num_cpus    = 2
    memory      = 4096
    guest_id    = data.vsphere_virtual_machine.server_template.guest_id
    scsi_type   = data.vsphere_virtual_machine.server_template.scsi_type
    firmware    = data.vsphere_virtual_machine.server_template.firmware

    disk {
        label               = "disk0"
        size                = var.server_disk_size
        eagerly_scrub       = data.vsphere_virtual_machine.server_template.disks.0.eagerly_scrub
        thin_provisioned    = data.vsphere_virtual_machine.server_template.disks.0.thin_provisioned
    }
    # (remaining arguments omitted)
}

Note that I am specifying var.server_disk_size explicitly, because using data.vsphere_virtual_machine.server_template.disks.0.size caused the "size not specified" error. I believe these are all related issues with information not being carried over from the data source.

Relevant output from terraform apply:

  # data.vsphere_virtual_machine.server_template will be read during apply
  # (config refers to values not yet known)
 <= data "vsphere_virtual_machine" "server_template"  {
      + alternate_guest_name    = (known after apply)
      + datacenter_id           = (known after apply)
      + disks                   = (known after apply)
      + firmware                = (known after apply)
      + guest_id                = (known after apply)
      + guest_ip_addresses      = (known after apply)
      + id                      = (known after apply)
      + name                    = "winserver_packer_image"
      + network_interface_types = (known after apply)
      + scsi_bus_sharing        = (known after apply)
      + scsi_type               = (known after apply)
    }
...
      + disk {
          + attach           = false
          + datastore_id     = "<computed>"
          + device_address   = (known after apply)
          + disk_mode        = "persistent"
          + disk_sharing     = "sharingNone"
          + eagerly_scrub    = false
          + io_limit         = -1
          + io_reservation   = 0
          + io_share_count   = 0
          + io_share_level   = "normal"
          + keep_on_remove   = false
          + key              = 0
          + label            = "disk0"
          + path             = (known after apply)
          + size             = 32
          + thin_provisioned = false
          + unit_number      = 0
          + uuid             = (known after apply)
          + write_through    = false
        }

So it seems that the source information isn't known until after apply, yet when specifying information for the new disk, Terraform sets explicit values instead of carrying them over from the template.
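
One workaround consistent with this, assuming the template's provisioning type is known ahead of time, is to hardcode the disk attributes rather than reference the data source:

disk {
    label            = "disk0"
    size             = var.server_disk_size
    eagerly_scrub    = false   # known template value, hardcoded instead of read from the data source
    thin_provisioned = false
}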

docandrew avatar May 11 '20 16:05 docandrew

I don't quite understand how you want to fix that. If you fix the inconsistent plan by removing the customize diff, how will the next plan behave?

From my point of view, we don't know in advance which storage policies will be applied, and in my case it's faster to create with LZT and let the storage SDK (or vSAN) change the disk type once provisioning finishes.

Ideally, we would want to keep the disk-handling features (creation of a new resource when the size changes) without having to manage the disk type after the first provisioning.

It seems nested ignore_changes on collections is not implemented in Terraform 0.12. What do you propose to handle this?

melck avatar Jun 02 '20 08:06 melck

Any news?

melck avatar Jun 23 '20 07:06 melck

I was seeing cloning failures in the DiskPostCloneOperation, so I opened #1161. This resolved the issue, but I'm still vetting the side effects of doing this. My tests were limited to the thin_provisioned item, but I was able to successfully clone a VM when the provisioning type silently changed under the hood.

My tests also included #1075.

mtnbikenc avatar Aug 07 '20 17:08 mtnbikenc

This is still an issue with current versions.

github.com/hashicorp/terraform-provider-vsphere v1.24.3

msg=Error: Provider produced inconsistent final plan

msg=When expanding the plan for module.bootstrap.vsphere_virtual_machine.vm to
msg=include new values learned so far during apply, provider
msg="registry.terraform.io/-/vsphere" produced an invalid new value for
msg=.disk[0].eagerly_scrub: was cty.False, but now cty.True.

msg=This is a bug in the provider, which should be reported in the provider's own
msg=issue tracker.

mtnbikenc avatar Feb 12 '21 20:02 mtnbikenc

Hi, I also have problems here, even when I set an ignore_changes block in the lifecycle.

zjheyvc avatar Jul 12 '21 16:07 zjheyvc

Just updated to the latest release of the provider and this seems to be no longer an issue, can anyone else confirm?

jcpowermac avatar Mar 10 '22 19:03 jcpowermac

The context is duplicated in #1303. Planning to close #1303 in favor of this issue.

TL;DR: The disk provisioning settings are not applied based on the eagerly_scrub and thin_provisioned arguments provided for a clone.

When cloning a virtual machine, the virtual machine only retains the settings with which it was originally provisioned (see the sketch after this list):

  • thin >> thin
  • lazy zero >> lazy zero
  • eager zero >> eager zero
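
For example, a hypothetical clone configuration like the following ends up with the template's original provisioning type, regardless of these arguments:

resource "vsphere_virtual_machine" "clone" {
  # ... other required arguments omitted ...

  disk {
    label            = "disk0"
    size             = 40
    eagerly_scrub    = true   # requested, but the clone keeps the template's provisioning
    thin_provisioned = false
  }

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id
  }
}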

Ryan Johnson, Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Mar 21 '22 21:03 tenthirtyam

Updating the description to "Disk provisioning settings are not correctly applied based on provided arguments."

Ryan Johnson, Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Mar 21 '22 21:03 tenthirtyam

Hi @jcpowermac and @tenthirtyam, I also tried to repro this issue and found everything works well... not sure if the steps below are correct, please help review:

  1. Manually create a VM with thick provisioning from the UI, without eagerly_scrub settings.
  2. Run terraform apply with the following content in the .tf file:
data "vsphere_virtual_machine" "template" {
  name          = "thick-repro"
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

resource "vsphere_virtual_machine" "main" {
  name             = "main"
  resource_pool_id = data.vsphere_compute_cluster.cluster.resource_pool_id
  guest_id         = data.vsphere_virtual_machine.template.guest_id
  network_interface {
    network_id   = data.vsphere_network.network.id
    adapter_type = data.vsphere_virtual_machine.template.network_interface_types[0]
  }
  datastore_id     = data.vsphere_datastore.datastore.id

  num_cpus = 1
  memory   = 1024

  scsi_type = data.vsphere_virtual_machine.template.scsi_type

  disk {
    label            = "disk0"
    size             = data.vsphere_virtual_machine.template.disks.0.size
    eagerly_scrub    = true
    thin_provisioned = false
  }

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id
  }
}

Behavior: The VM was created successfully with eagerly_scrub = true and thin_provisioned = false. Checking the tfstate file, both the VM template "thick-repro" and the new VM "main" have eagerly_scrub = true and thin_provisioned = false. I think that's expected, right? I mainly referred to the test steps in https://github.com/hashicorp/terraform-provider-vsphere/issues/1303, but it looks like the behavior is different from back then.

> Just updated to the latest release of the provider and this seems to be no longer an issue, can anyone else confirm?

zxinyu08 avatar Dec 21 '22 10:12 zxinyu08

@zxinyu08 we started using this repo instead of our fork early this year. The only issue seems to be when deleting a machine (https://github.com/openshift/installer/issues/5830); the resolution was here: https://github.com/openshift/installer/pull/5848/files

So while it was resolved for us, I am assuming the problem still exists.

jcpowermac avatar Dec 21 '22 13:12 jcpowermac

@zxinyu08 - I can retest in January '23, but the last tests showed that the issue was still present, depending on the source template.

May I ask if you tested on VMFS, too?

tenthirtyam avatar Dec 21 '22 16:12 tenthirtyam

Thank you both! I tested on a vSAN datastore. @tenthirtyam

zxinyu08 avatar Dec 22 '22 04:12 zxinyu08

Good to know! My tests were on VMFS with different results.

tenthirtyam avatar Dec 22 '22 05:12 tenthirtyam

@tenthirtyam Hi, do you have any idea how to solve the problem? How can we help?

melck avatar Mar 15 '23 13:03 melck

This functionality has been released in v2.5.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

github-actions[bot] avatar Oct 09 '23 20:10 github-actions[bot]

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Nov 09 '23 02:11 github-actions[bot]