terraform-provider-proxmox

Removing cloud-init from VM or update env file causes VM replacement

Open perryflynn opened this issue 1 year ago • 2 comments

Describe the bug

After the VM is bootstrapped, I would like to remove the cloud-init drive from it. Another scenario is updating the user_data_file, which also causes a forced replacement of the VM.

Both happen because user_data_file_id is touched.

To Reproduce

Steps to reproduce the behavior:

  1. Create a resource proxmox_virtual_environment_file and proxmox_virtual_environment_vm which references the env file in initialization.user_data_file_id
  2. Remove initialization {} or update proxmox_virtual_environment_file content
  3. See VM force replacement

Please also provide a minimal Terraform configuration that reproduces the issue.

terraform {
  required_version = ">= 1.0"
  required_providers {
    proxmox = {
      source = "bpg/proxmox"
      version = "0.66.3"
    }
  }
}

provider "proxmox" {
  endpoint = "https://benny.mm.example.com:8006"
  username = "root@pam"
  #password = "" # use env PROXMOX_VE_PASSWORD
}

resource "proxmox_virtual_environment_file" "cloud_config" {
  content_type = "snippets"
  datastore_id = var.pve_snippetstore
  node_name    = var.pve_node

  source_raw {
    data = <<-EOF
    #cloud-config
    packages: []

    keyboard:
      layout: de

    locale: en_US.UTF-8

    timezone: Europe/Berlin
    users: []
    groups: []
    EOF

    file_name = "cloud-config.yaml"
  }
}

resource "proxmox_virtual_environment_vm" "debidesk" {
  vm_id       = 104
  name        = "debidesk.example.com"
  description = "Remote Desktop based on Debian"
  tags        = ["desktop"]
  node_name   = var.pve_node

  on_boot    = false
  boot_order = ["scsi0", "ide2"]

  initialization {
    interface    = "ide0"
    datastore_id = var.pve_blockstore

    ip_config {
      ipv4 {
        address = "dhcp"
      }
    }

    user_data_file_id = proxmox_virtual_environment_file.cloud_config.id
  }

  agent {
    enabled = true
    timeout = "10s"
  }

  startup {
    order = 200
  }

  operating_system {
    type = "l26"
  }

  cpu {
    type    = "x86-64-v3"
    sockets = 1
    cores   = 4
  }

  memory {
    dedicated = 2048
  }

  vga {
    type   = "qxl"
    memory = 32
  }

  network_device {
    model    = "virtio"
    bridge   = "vmbr42"
    firewall = true
    enabled  = true
  }

  scsi_hardware = "virtio-scsi-single"

  disk {
    interface    = "scsi0"
    aio          = "io_uring"
    backup       = true
    cache        = "none"
    discard      = "on"
    ssd          = true
    iothread     = true
    size         = 64
    datastore_id = var.pve_blockstore
    file_format  = "raw"
  }
}

and the output of terraform|tofu apply.

proxmox_virtual_environment_file.cloud_config: Refreshing state... [id=pve-manual-beta:snippets/cloud-config.yaml]
proxmox_virtual_environment_vm.debidesk: Refreshing state... [id=104]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # proxmox_virtual_environment_file.cloud_config must be replaced
-/+ resource "proxmox_virtual_environment_file" "cloud_config" {
      + file_modification_date = (known after apply)
      ~ file_name              = "cloud-config.yaml" -> (known after apply)
      + file_size              = (known after apply)
      + file_tag               = (known after apply)
      ~ id                     = "pve-manual-beta:snippets/cloud-config.yaml" -> (known after apply)
        # (5 unchanged attributes hidden)

      ~ source_raw {
          ~ data      = <<-EOT # forces replacement
                #cloud-config
                packages: []
                
              + # foo
            EOT
            # (2 unchanged attributes hidden)
        }
    }

  # proxmox_virtual_environment_vm.debidesk must be replaced
-/+ resource "proxmox_virtual_environment_vm" "debidesk" {
      ~ id                      = "104" -> (known after apply)
      ~ ipv4_addresses          = [
          - [
              - "127.0.0.1",
            ],
          - [
              - "192.168.42.65",
            ],
        ] -> (known after apply)
      ~ ipv6_addresses          = [
          - [
              - "::1",
            ],
          - [
              - "fe80::be24:11ff:fe14:e3c8",
            ],
        ] -> (known after apply)
      ~ mac_addresses           = [
          - "00:00:00:00:00:00",
          - "BC:24:11:14:E3:C8",
        ] -> (known after apply)
        name                    = "debidesk"
      ~ network_interface_names = [
          - "lo",
          - "ens18",
        ] -> (known after apply)
        tags                    = [
            "desktop",
        ]
        # (26 unchanged attributes hidden)

      - cdrom {
          - enabled   = true -> null
          - file_id   = "pve-manual-beta:iso/perrys-bootstrapper-2024.11.13-x86_64.iso" -> null
          - interface = "ide2" -> null
        }

      ~ cpu {
          - flags        = [] -> null
            # (9 unchanged attributes hidden)
        }

      ~ disk {
          ~ path_in_datastore = "vm-104-disk-0" -> (known after apply)
            # (13 unchanged attributes hidden)
        }
      ~ disk {
          ~ path_in_datastore = "vm-104-disk-1" -> (known after apply)
            # (13 unchanged attributes hidden)
        }

      ~ initialization {
          ~ upgrade              = false -> (known after apply)
          ~ user_data_file_id    = "pve-manual-beta:snippets/cloud-config.yaml" -> (known after apply) # forces replacement
            # (6 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }

      ~ network_device {
          - disconnected = false -> null
          ~ mac_address  = "BC:24:11:14:E3:C8" -> (known after apply)
            # (9 unchanged attributes hidden)
        }

        # (5 unchanged blocks hidden)
    }

Plan: 2 to add, 0 to change, 2 to destroy.

Expected behavior

The cloud-init device is removed without replacing the VM, or the cloud-init config is updated without replacing the VM.

Additional context

  • Single or clustered Proxmox: clustered, 3 nodes
  • Proxmox version: 8.2.7
  • Provider version (ideally it should be the latest version): 0.66.3
  • Terraform/OpenTofu version: terraform v1.9.8
  • OS (where you run Terraform/OpenTofu from): Ubuntu Jammy

perryflynn avatar Nov 13 '24 21:11 perryflynn

+1

CRASH-Tech avatar Jan 21 '25 12:01 CRASH-Tech

It's standard among Terraform providers to recreate a VM or instance when user data changes. To work around this, use ignore_changes in a lifecycle block to tell Terraform not to update your infrastructure when it sees that the cloud init has changed.
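
A minimal sketch of that workaround, assuming the VM resource from the original report. Only node_name is shown and the remaining arguments are elided for brevity. Ignoring the whole initialization block is the broad option; the commented-out entry is a narrower alternative that assumes the block is addressable as a single-element list.

```hcl
resource "proxmox_virtual_environment_vm" "debidesk" {
  node_name = var.pve_node

  # ... remaining arguments as in the original configuration ...

  lifecycle {
    ignore_changes = [
      # Broad: ignore every change inside the initialization block.
      initialization,

      # Narrower alternative (assumption: the block is indexed as a list):
      # initialization[0].user_data_file_id,
    ]
  }
}
```

With this in place, an updated snippet file should still be replaced on the Proxmox side, but the plan should no longer propose replacing the VM because of it.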

As for removing the cloud init drive completely, I'm less clear on the right approach here. On the one hand it's a user data change which should recreate the VM, but on the other hand, it's a drive and lots of drive operations can happen without recreating the VM. But given that the initialization block is more than just a drive, I'm of the opinion that removing that block completely signals a change to the initial state of that VM and recreating the VM is appropriate and consistent with the behavior of other Terraform providers.

If wrapping the whole initialization block in an ignore_changes block isn't sufficient and you really want to remove the cloud init drive, you could try using Terraform's removed block to remove it from state, and then remove the drive from Proxmox via some other means.
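
For reference, a sketch of what such a removed block could look like. It assumes Terraform 1.7 or later. Note that removed operates on whole resources, so this would drop the entire VM from state (not only the cloud-init drive) while leaving the real VM untouched.

```hcl
# Assumption: the resource "proxmox_virtual_environment_vm" "debidesk" block
# has been deleted from the configuration before this is applied.
removed {
  from = proxmox_virtual_environment_vm.debidesk

  lifecycle {
    # Forget the VM in state only; do not destroy it in Proxmox.
    destroy = false
  }
}
```

After that, the VM would have to be imported again or managed outside Terraform.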

elias314 avatar Jan 21 '25 18:01 elias314

In my opinion, changes to the user_account block should not recreate the VM or container. In my case this is an operation that often needs to be done to update access to virtual machines and LXC containers running in the infrastructure.

Currently what I do is: manually update the public keys in the UI, then run apply from Terraform for sanity and consistency. If I instead tried to apply the change from Terraform directly, it would recreate the VM/container in question.

I understand the perspective from which you come from, saying that this is an "initialization" block, but in my opinion all the changes that are non-destructive through the proxmox ui/cli should also be non destructive in terraform with your provider, therefore i would not expect the changes made to the user_info to destroy the container/vm.

bartei avatar Apr 02 '25 23:04 bartei

I understand the perspective from which you come from, saying that this is an "initialization" block, but in my opinion all the changes that are non-destructive through the proxmox ui/cli should also be non destructive in terraform with your provider, therefore i would not expect the changes made to the user_info to destroy the container/vm.

That's a valid point, and I guess it's the main source of confusion.

Perhaps allowing the changes but showing a warning message explaining how they affect a VM/Container, with optional attribute(s) to manage the restart behaviour, would be a good middle ground here.

I'll give it a good thought when I get to this ticket in my never-ending-about-to-explode TODO list 😅

bpg avatar Apr 03 '25 15:04 bpg

It is also always possible to control the recreation behaviour the other way around, with lifecycle.replace_triggered_by. So if not forcing a replacement were the default, replacement could still be enabled with built-in Terraform features.
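
A sketch of that opt-in, assuming Terraform 1.2 or later and the resource names from the original report, with the remaining VM arguments elided:

```hcl
resource "proxmox_virtual_environment_vm" "debidesk" {
  node_name = var.pve_node

  # ... remaining arguments as in the original configuration ...

  lifecycle {
    # Replace the VM whenever the cloud-config snippet is changed or replaced.
    replace_triggered_by = [
      proxmox_virtual_environment_file.cloud_config,
    ]
  }
}
```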

Being able to force a restart when cloud-init config is changed would be nice.

(Sorry for not answering, I missed the reply in January)

perryflynn avatar Apr 03 '25 17:04 perryflynn

I was poking around this issue in #1885 and settled on the following approach (for VMs only, containers are out of scope for now):

  • Any changes in initialization.user_account can be applied without VM re-creation (or even a restart, if using the reboot_after_update flag). See #1885.
  • Changes to any of the *_data_file attributes will cause VM recreation. The main reason is that it's hard to apply them otherwise, and conditionally forcing resource re-creation is really tricky.
  • Removal of the initialization block will cause VM recreation. See this comment, and also keep in mind that TF/Tofu is a declarative IaC system. The described use case just doesn't fit here: you can't forget and ignore the intermediate state (i.e., configuring a user via cloud-init) but still rely on the effects of that state afterward (i.e., using the created user). Use lifecycle settings if you want to tweak this behaviour.
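
To illustrate the first point, a sketch of a VM whose access keys are managed through initialization.user_account. The variable admin_public_keys and the username are hypothetical, the remaining VM arguments are elided, and setting reboot_after_update to false is an assumption about how the flag is meant to be used to avoid a restart.

```hcl
# Hypothetical variable holding the SSH public keys to authorize.
variable "admin_public_keys" {
  type    = list(string)
  default = []
}

resource "proxmox_virtual_environment_vm" "debidesk" {
  node_name = var.pve_node

  # Assumption: false means "apply the change without rebooting the VM".
  reboot_after_update = false

  # ... remaining arguments as in the original configuration ...

  initialization {
    user_account {
      username = "admin" # hypothetical user name
      keys     = var.admin_public_keys
    }
  }
}
```

Under the approach described above, editing the key list here should produce an in-place update in the plan, whereas touching user_data_file_id would still force a replacement.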

Based on all the above, I'm going to close this ticket as "won't fix". But if you feel strongly about it, just comment here or open a new discussion.

bpg avatar Apr 07 '25 20:04 bpg

The approach you describe is perfect. If the user_data file is changed, I agree that the VM should be recreated, on the basis that it's part of the provisioning and changing it after the fact would leave the VM in an unknown/uncontrolled state.

Having the ability to reconfigure the user is the most important aspect for my use case and your described approach would work perfectly.

I'll test your changes today regarding the user accounts configuration and report back if I notice any issues.

Regarding containers, I don't think there is anything that can be done, as the provisioning of the default user account is done in Proxmox only at creation time and it's not configurable after the fact. This implies the inability of making changes from Terraform as well.

Lastly given the broad scope of the initialization block I think you're on the right track by triggering a resource recreation in case of removal.

bartei avatar Apr 07 '25 21:04 bartei