
Adding disk causes VM to be re-created

Open otopetrik opened this issue 3 years ago • 6 comments

Describe the bug
Adding an additional disk block (in addition to the existing virtio0):

disk {
  interface = "virtio1"
  size = 10
}

causes VM to be re-created. That also means losing data on existing disks...

To Reproduce
Steps to reproduce the behavior:

  1. add disk block to existing (testing) VM
  2. see VM get destroyed and created again

Expected behavior
Adding a disk should not destroy the existing VM or cause data loss. Ideally, adding a disk should not even stop or reboot the VM.

Screenshots

...
      }

      ~ cpu {
          - flags        = [] -> null
            # (6 unchanged attributes hidden)
        }

      ~ disk { # forces replacement
            # (4 unchanged attributes hidden)
        }
      + disk { # forces replacement
          + datastore_id = "local-zfs"
          + file_format  = "qcow2" # forces replacement
          + interface    = "virtio1"
          + size         = 10
        }
      ~ initialization {

...

Additional context
Looks like this is wrong: https://github.com/bpg/terraform-provider-proxmox/blob/b26e737257d76e29f57a4bd8d51e955cd1dce9e9/proxmoxtf/resource_virtual_environment_vm.go#L474

The Proxmox web interface supports adding disks to running VMs when disk hotplug is enabled.

Disk hotplug detection seems simple: does the list in veClient.GetVM(ctx, nodeName, vmID).Hotplug contain the string "disk"?
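A rough sketch of that check (illustration only: it assumes the Hotplug field on the GetVM response behaves like a slice of strings, while the actual type in the provider may be a plain string or a custom comma-separated-list type that needs splitting first; the helper name diskHotplugEnabled is made up here):

// diskHotplugEnabled reports whether "disk" is listed in the VM's hotplug
// setting, i.e. whether disks can be added/removed without a reboot.
// Assumption: vmConfig.Hotplug is usable as a []string.
func diskHotplugEnabled(ctx context.Context, veClient *VirtualEnvironmentClient, nodeName string, vmID int) (bool, error) {
        vmConfig, err := veClient.GetVM(ctx, nodeName, vmID)
        if err != nil {
                return false, err
        }

        for _, feature := range vmConfig.Hotplug {
                if feature == "disk" {
                        return true, nil
                }
        }

        return false, nil
}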

The following command:

curl -v --insecure \
--header 'Authorization: PVEAPIToken=root@pam!mycurltoken=...uuid...' \
--header "Content-Type: application/json" \
--request POST --data '{"virtio1": "local-zfs:20"}' \
https://test01.example.com:8006/api2/json/nodes/test01/qemu/1234/config

successfully adds a disk to a running VM.

Looks like the current implementation of the POST call to '/config' cannot be reused because of the request body types: https://github.com/bpg/terraform-provider-proxmox/blob/b26e737257d76e29f57a4bd8d51e955cd1dce9e9/proxmox/virtual_environment_vm.go#L370-L383

But something like the following might work:

type NewStorageDevice struct {
        PoolID *string
        Size   *int
}

type VirtualEnvironmentVMCreateVMDiskRequestData struct {
        SATADevice0       *NewStorageDevice `json:"sata0,omitempty"`
...
        VirtualIODevice0  *NewStorageDevice `json:"virtio0,omitempty"`
...
        VirtualIODevice15 *NewStorageDevice `json:"virtio15,omitempty"`
}

// CreateVMDisk adds a new disk to the VM by POSTing to .../config.
func (c *VirtualEnvironmentClient) CreateVMDisk(ctx context.Context, nodeName string, vmID int, d *VirtualEnvironmentVMCreateVMDiskRequestData) (*string, error) {
        resBody := &VirtualEnvironmentVMUpdateAsyncResponseBody{}
        err := c.DoRequest(ctx, hmPOST, fmt.Sprintf("nodes/%s/qemu/%d/config", url.PathEscape(nodeName), vmID), d, resBody)

        if err != nil {
                return nil, err
        }

        if resBody.Data == nil {
                return nil, errors.New("the server did not include a data object in the response")
        }

        return resBody.Data, nil
}

// EncodeValues renders the device as "<datastore>:<size>", the format the
// Proxmox API expects when allocating a new disk.
func (r NewStorageDevice) EncodeValues(key string, v *url.Values) error {
        v.Add(key, fmt.Sprintf("%s:%d", *r.PoolID, *r.Size))
        return nil
}

(This is just a guess; I'm not very familiar with Go.)

veClient.CreateVMDisk could then be called from resourceVirtualEnvironmentVMUpdateDiskLocationAndSize, like the existing disk resize and disk move operations.
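For illustration, a call site there might look roughly like the following (this reuses the hypothetical NewStorageDevice and VirtualEnvironmentVMCreateVMDiskRequestData types sketched above, and assumes ctx, veClient, nodeName, vmID, datastoreID and diskSize are already in scope, as they are around the existing resize/move calls):

// Attach a new virtio1 disk of diskSize GB on the given datastore by
// POSTing to .../config. The hypothetical VirtualIODevice1 field follows
// the virtio0..virtio15 pattern of the request struct sketched above.
diskCreateBody := &VirtualEnvironmentVMCreateVMDiskRequestData{
        VirtualIODevice1: &NewStorageDevice{
                PoolID: &datastoreID,
                Size:   &diskSize,
        },
}

taskID, err := veClient.CreateVMDisk(ctx, nodeName, vmID, diskCreateBody)
if err != nil {
        return err
}

// The returned data should be the UPID of the asynchronous config task;
// the caller would then wait for it to complete, the same way the existing
// disk move/resize code waits for its tasks.
_ = taskID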

otopetrik avatar Jul 27 '22 18:07 otopetrik

Awesomely detailed bug report, thanks!

bpg avatar Jul 29 '22 01:07 bpg

Marking this issue as stale due to inactivity in the past 180 days. This helps us focus on the active issues. If this issue is reproducible with the latest version of the provider, please comment. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

github-actions[bot] avatar Dec 01 '23 00:12 github-actions[bot]

still valid

bpg avatar Dec 01 '23 02:12 bpg

Hello,

I've encountered a similar issue with the provider when removing disks from a template.

Steps to Reproduce:

  1. Initial Configuration: Add two disks to the template as follows:
    # Disk 1: Boot
    disk {
      datastore_id = "fast"
      discard = "on"
      file_format = "raw"
      file_id = "shared-iso:iso/ubuntu-22.04-cloudimg-amd64-raw.img"
      interface = "scsi0"
      iothread = false
      size = 20
      ssd = true
    }
    # Disk 2: Data
    disk {
      cache = "none"
      datastore_id = "fast"
      discard = "on"
      file_format = "raw"
      interface = "scsi1"
      iothread = false
      size = 15
      ssd = true
    }

  2. Removing a Disk: Attempting to remove a disk (e.g., the second disk) results in the template trying to recreate itself instead of just removing the specified disk (see attached screenshot).

[screenshot]

Expected Behavior: The template should allow adding or removing a specified disk without attempting to recreate the entire VM.

I'm unsure whether to open a new issue for this or if it fits within the scope of the current one. Additionally, I'm willing to assist in resolving this issue; any guidance on contributing would be greatly appreciated.

Thank you

zarinbal avatar Dec 31 '23 13:12 zarinbal

Thanks for confirming, @zarinbal, and for offering help!

The disk management part of the provider is quite messy, especially when cloning templates. There are a few open tickets for disks, and they are related to each other. On top of that, there are dozens of use cases the provider needs to support, so we need to be careful making changes there. I really need to set some time aside and think about reorganizing this part...

But please vote with 👍🏼 on the ticket to raise its priority.

bpg avatar Jan 03 '24 03:01 bpg

Same problem.

I tried to add or remove a disk:

resource "proxmox_virtual_environment_vm" "k8s_master_1" {
  vm_id               = 120
  name                = "k8s-master-1"
  node_name           = "prxmx"
  acpi                = true
  bios                = "seabios"
  boot_order          = null
  description         = null
  migrate             = false
  on_boot             = true
  reboot              = false
  scsi_hardware       = "virtio-scsi-pci"
  started             = true
  stop_on_destroy     = null
  tablet_device       = true
  tags                = []
  template            = false
  lifecycle {
   # prevent_destroy = true
  }
  cpu {
    architecture = "x86_64"
    cores        = 4
    flags        = []
    hotplugged   = 0
    limit        = 0
    numa         = true
    sockets      = 1
    type         = "host"
    units        = 1024
  }
  clone {
    vm_id = 1001
  }
  agent {
    enabled = true
    timeout = "15m"
    trim    = false
    type    = "virtio"
  }
  disk {
    cache             = "none"
    datastore_id      = "local-lvm"
    discard           = "ignore"
    file_format       = "raw"
    interface         = "scsi0"
    iothread          = false
    size              = 25
    ssd               = false
  }
  disk {
    cache             = "none"
    datastore_id      = "local-lvm"
    discard           = "ignore"
    file_format       = "raw"
    interface         = "scsi1"
    iothread          = false
    size              = 20
    ssd               = false
  }
  initialization {
    datastore_id         = "local-lvm"
    interface            = "ide2"
    ip_config {
      ipv4 {
        address = "10.0.0.120/24"
        gateway = "10.0.0.1"
      }
    }
  }
  memory {
    dedicated = 2048
    floating  = 0
    shared    = 0
  }
  network_device {
    bridge      = "vmbr0"
    enabled     = true
    firewall    = false
    model       = "virtio"
    mtu         = 0
    queues      = 0
    rate_limit  = 0
    vlan_id     = 0
  }
  serial_device {
    device = "socket"
  }
}

and then Terraform tries to delete and recreate the VM.

[screenshot]

I use the latest version:

version = "0.44.0"

Gizex avatar Jan 21 '24 20:01 Gizex

I concur this is a high-priority issue, especially if you use Proxmox to deploy k8s clusters and need an additional dedicated storage drive. Losing all cluster-join and k8s configuration is very inconvenient.

With a hotplug-enabled drive, Proxmox does not require a VM reboot to add/remove a drive. But rebooting is not an issue per se and might also be an interim solution if it helps (although I presume it wouldn't make the resolution easier).

A workaround to add a disk without destroying the VM, until this issue is fixed:

  • Manually add the disk via the Proxmox GUI / CLI - it should be hot-plug-added to each VM
  • Add the disk to your .tf configuration
  • Manually add the disk to your .tfstate file for each VM's state - if you are unsure about the properties/format, you can always use terraform import to import the machine into a test configuration and check the correct syntax.

At this point running terraform plan should result in 'No changes'.

Note: do not replace the entire VM config via import, as the imported VM config doesn't seem to map 1:1 onto the complete .tfstate - this might warrant a separate issue.

I understand this is not an elegant approach, but it seems to be consistent and it saved me from having to recreate a working cluster.

I hope this issue gets resolved asap. Is there any update about this?

alex-ioma avatar May 28 '24 01:05 alex-ioma

I might be able to make some improvements for non-clone use cases for the next release.

bpg avatar May 29 '24 05:05 bpg