
"nsxt_policy_vm_tags" for port in segment not working in big environments

Open xander-sh opened this issue 2 years ago • 8 comments

Describe the bug

We have two instances of NSX-T, version 3.2.0.1.0.19232396. In one of them (more than 1,000 VMs) the Terraform module does not tag the virtual machine's port in segments. In the other (200-300 VMs) the module works fine.

terraform {
  required_providers {
    nsxt = {
      source  = "vmware/nsxt"
      version = "= 3.2.8"
    }
  }
  required_version = ">= 1.0.5"
}
provider "nsxt" {
  host                  = "nsx-host.inside"
  username              = "USER"
  password              = "PASSWORD"
  allow_unverified_ssl  = true
  max_retries           = 5
  retry_min_delay       = 500
  retry_max_delay       = 5000
  retry_on_status_codes = [429]
}
resource "nsxt_policy_vm_tags" "vm1_tags" {
  instance_id = "5007a902-7820-61fb-2611-ec354b516999"
  port {
    segment_path = "/infra/segments/090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a"
    tag {
      scope = "ncp/cluster"
      tag   = "k8s-cluster-01"
    }
    tag {
      scope = "ncp/node_name"
      tag   = "k8s-cluster-01-wrk06"
    }
  }
}

The Terraform plan executes without errors, but the tags do not show up in terraform.tfstate:

❯ terraform apply

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # nsxt_policy_vm_tags.vm1_tags will be created
  + resource "nsxt_policy_vm_tags" "vm1_tags" {
      + id          = (known after apply)
      + instance_id = "5007a902-7820-61fb-2611-ec354b516999"

      + port {
          + segment_path = "/infra/segments/090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a"

          + tag {
              + scope = "ncp/cluster"
              + tag   = "k8s-cluster-01"
            }
          + tag {
              + scope = "ncp/node_name"
              + tag   = "k8s-cluster-01-wrk06"
            }
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

nsxt_policy_vm_tags.vm1_tags: Creating...
nsxt_policy_vm_tags.vm1_tags: Still creating... [10s elapsed]
nsxt_policy_vm_tags.vm1_tags: Still creating... [20s elapsed]
nsxt_policy_vm_tags.vm1_tags: Still creating... [30s elapsed]
nsxt_policy_vm_tags.vm1_tags: Still creating... [40s elapsed]
nsxt_policy_vm_tags.vm1_tags: Creation complete after 50s [id=5007a902-7820-61fb-2611-ec354b516999]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
❯ cat terraform.tfstate
{
  "version": 4,
  "terraform_version": "1.0.5",
  "serial": 1,
  "lineage": "85020c4b-ebc0-8f0b-e9f4-f1daab83d903",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "nsxt_policy_vm_tags",
      "name": "vm1_tags",
      "provider": "provider[\"registry.terraform.io/vmware/nsxt\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "id": "5007a902-7820-61fb-2611-ec354b516999",
            "instance_id": "5007a902-7820-61fb-2611-ec354b516999",
            "port": [],
            "tag": []
          },
          "sensitive_attributes": [],
          "private": "bnVsbA=="
        }
      ]
    }
  ]
}
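The symptom above can be checked programmatically. A minimal sketch (mine, not part of the provider) that parses a Terraform v4-format state file and flags `nsxt_policy_vm_tags` resources whose `port` attribute came back empty, using the state shown above as toy input:

```python
import json

def find_empty_port_tags(state: dict) -> list:
    """Return addresses of nsxt_policy_vm_tags resources with an empty 'port' list."""
    hits = []
    for res in state.get("resources", []):
        if res.get("type") != "nsxt_policy_vm_tags":
            continue
        for inst in res.get("instances", []):
            if inst.get("attributes", {}).get("port") == []:
                hits.append(f'{res["type"]}.{res["name"]}')
    return hits

# Trimmed copy of the state file from this issue
state = json.loads("""
{
  "version": 4,
  "resources": [
    {
      "mode": "managed",
      "type": "nsxt_policy_vm_tags",
      "name": "vm1_tags",
      "instances": [
        {"attributes": {"id": "x", "instance_id": "x", "port": [], "tag": []}}
      ]
    }
  ]
}
""")
print(find_empty_port_tags(state))  # -> ['nsxt_policy_vm_tags.vm1_tags']
```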

Terraform debug file: terraform.log.gz

I have successfully tagged the VM port manually, as the same user, in the NSX-T GUI.

Reproduction steps

1. Use a big environment with more than 1,000 VMs
2. Try to tag some VMs' ports in segments

Expected behavior

The tag is added to the virtual machine's port in the segment.

Additional context

Adding a tag to the VM itself (VM only, without the port block) works fine in both NSX-T environments:

resource "nsxt_policy_vm_tags" "vm1_tags" {
  instance_id = "5007a902-7820-61fb-2611-ec354b516999"
  tag {
    scope = "ncp/cluster"
    tag   = "k8s-cluster-01"
  }
  tag {
    scope = "ncp/node_name"
    tag   = "k8s-cluster-01-wrk06"
  }
}

xander-sh avatar Sep 12 '22 12:09 xander-sh

Hi @xander-sh, thank you for the detailed report! I looked at the logs and I see the following: "Found 12 ports for segment 090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a"

After this printout I'm expecting to see: "Updating port with tags"

However, it's not present in the log, and the usual cause is that none of those ports belongs to the listed VM. Is it possible that the VM ID is wrong? While we don't expect the provider to error out in such a case, I'm planning to add a log message indicating that no ports were found to update.
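For context, the port-matching step being described works conceptually like this. This is a simplified Python sketch of the idea, not the provider's actual Go code; the field names `owner_vm_id` and `lport_attachment_id` are taken from the NSX VirtualNetworkInterface API, and the sample records are made up from the IDs in this issue:

```python
def ports_to_update(segment_ports, vifs, vm_id):
    """Match segment ports to a VM via VIF attachment IDs.

    A port belongs to the VM when its attachment ID equals the
    lport_attachment_id of a VIF owned by that VM.
    """
    vm_attachment_ids = {
        vif["lport_attachment_id"]
        for vif in vifs
        if vif.get("owner_vm_id") == vm_id
    }
    return [p for p in segment_ports
            if p.get("attachment", {}).get("id") in vm_attachment_ids]

# Toy records mirroring the IDs from this issue
vifs = [{"owner_vm_id": "5007a902-7820-61fb-2611-ec354b516999",
         "lport_attachment_id": "e071c0d7-7366-41e9-a75d-3f125fa2e37f"}]
ports = [{"id": "default:b8ae1a00-7e3c-4519-be48-26c16dc96767",
          "attachment": {"id": "e071c0d7-7366-41e9-a75d-3f125fa2e37f"}},
         {"id": "some-other-port", "attachment": {"id": "deadbeef"}}]

matched = ports_to_update(ports, vifs, "5007a902-7820-61fb-2611-ec354b516999")
print(len(matched))  # -> 1
```

If the VIF list the provider sees is incomplete, this intersection comes back empty and no "Updating port with tags" line is logged.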

annakhm avatar Sep 21 '22 23:09 annakhm

Hi, no, the VM is correct. VM ID: 5007a902-7820-61fb-2611-ec354b516999

xander-sh avatar Sep 22 '22 06:09 xander-sh

> Hi @xander-sh, thank you for the detailed report! I looked at the logs and I see the following: "Found 12 ports for segment 090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a"
>
> After this printout I'm expecting to see: "Updating port with tags"
>
> However, it's not present in the log, and the usual cause is that none of those ports belongs to the listed VM. Is it possible that the VM ID is wrong? While we don't expect the provider to error out in such a case, I'm planning to add a log message indicating that no ports were found to update.

Adding some inventory requests to NSX-T: the VM ID is correct, and the VM's port is connected to the desired segment.

curl -s -ku 'user:passwd' 'https://x.x.x.x/policy/api/v1/infra/realized-state/virtual-machines?sort_ascending=false&enforcement_point_path=%2Finfra%2Fsites%2Fdefault%2Fenforcement-points%2Fdefault&include_mark_for_delete_objects=false'
{
    "host_id" : "a6b6d6c5-65f4-45fc-b6bb-4df432582c13",
    "source" : {
      "target_id" : "a6b6d6c5-65f4-45fc-b6bb-4df432582c13",
      "target_display_name" : "nodeesx0001",
      "target_type" : "HostNode",
      "is_valid" : true
    },
    "external_id" : "5007a902-7820-61fb-2611-ec354b516999",
    "power_state" : "VM_RUNNING",
    "local_id_on_host" : "360",
    "compute_ids" : [ "moIdOnHost:360", "hostLocalId:360", "locationId:564da866-3b4d-6465-3e47-2dd433806fe0", "instanceUuid:5007a902-7820-61fb-2611-ec354b516999", "externalId:5007a902-7820-61fb-2611-ec354b516999", "biosUuid:4207cfd3-da89-2ad6-c545-575a19c46581" ],
    "type" : "REGULAR",
    "guest_info" : {
      "os_name" : "Ubuntu Linux (64-bit)",
      "computer_name" : "k8s-cluster-01-wrk06"
    },
    "resource_type" : "VirtualMachine",
    "display_name" : "k8s-cluster-01-wrk06",
    "_last_sync_time" : 1663776977371
  }
curl -ku 'user:passwd'  https://x.x.x.x/policy/api/v1/infra/segments/090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a/ports/default:b8ae1a00-7e3c-4519-be48-26c16dc96767
{
  "attachment" : {
    "id" : "e071c0d7-7366-41e9-a75d-3f125fa2e37f",
    "traffic_tag" : 0,
    "hyperbus_mode" : "DISABLE"
  },
  "admin_state" : "UP",
  "resource_type" : "SegmentPort",
  "id" : "default:b8ae1a00-7e3c-4519-be48-26c16dc96767",
  "display_name" : "k8s-cluster-01-wrk06.vmx@e071c0d7-7366-41e9-a75d-3f125fa2e37f",
  "tags" : [ ],
  "path" : "/infra/segments/090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a/ports/default:b8ae1a00-7e3c-4519-be48-26c16dc96767",
  "relative_path" : "default:b8ae1a00-7e3c-4519-be48-26c16dc96767",
  "parent_path" : "/infra/segments/090d3269-9acc-4eb1-8f4b-f1a04fbd0f6a",
  "unique_id" : "b8ae1a00-7e3c-4519-be48-26c16dc96767",
  "realization_id" : "b8ae1a00-7e3c-4519-be48-26c16dc96767",
  "marked_for_delete" : false,
  "overridden" : false,
  "_create_time" : 1662648266505,
  "_create_user" : "system",
  "_last_modified_time" : 1662987179931,
  "_last_modified_user" : "[email protected]",
  "_system_owned" : false,
  "_protection" : "NOT_PROTECTED",
  "_revision" : 9
}

xander-sh avatar Sep 22 '22 06:09 xander-sh

Looks like the attachment on the provided port has a different ID?

    "attachment" : {
      "id" : "e071c0d7-7366-41e9-a75d-3f125fa2e37f",
      "traffic_tag" : 0,
      "hyperbus_mode" : "DISABLE"
    },

annakhm avatar Sep 22 '22 17:09 annakhm

> Looks like the attachment on the provided port has a different ID?
>
>     "attachment" : {
>       "id" : "e071c0d7-7366-41e9-a75d-3f125fa2e37f",
>       "traffic_tag" : 0,
>       "hyperbus_mode" : "DISABLE"
>     },

No, this is the ID of the "attachment", not of the VM. The VM name appears in display_name:

"display_name" : "k8s-cluster-01-wrk06.vmx@e071c0d7-7366-41e9-a75d-3f125fa2e37f",

xander-sh avatar Sep 22 '22 19:09 xander-sh

Hi @xander-sh, sorry for the delay in responding. Could you please provide the relevant data from the /infra/realized-state/enforcement-points/default/vifs API in this environment?

annakhm avatar Oct 21 '22 18:10 annakhm

Some background: based on the description of the issue, I suspected pagination might not be working properly in NSX on the vifs API, similar to the problem described here: https://github.com/vmware/terraform-provider-nsxt/blob/master/nsxt/resource_nsxt_policy_vm_tags.go#L67

However, I did some tests, and pagination looks to be working as expected with smaller page sizes (unfortunately I don't have access to an environment that can accommodate 1K+ VMs). @xander-sh, if you could also run the vifs API mentioned above on the big environment and note the result_count and cursor attributes in the response, that would help troubleshoot this issue further. Thank you!
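To illustrate why those two attributes matter: NSX policy APIs use cursor-style pagination, where the client keeps requesting pages until the response no longer carries a cursor. A minimal sketch with a mocked fetch function (the real provider does the equivalent in Go against the vifs endpoint; the data and function names here are invented for illustration):

```python
def fetch_page(cursor=None, page_size=2):
    """Mocked paged API: returns a slice of DATA plus a cursor for the next page."""
    DATA = list(range(7))  # stands in for 7 VIF records
    start = int(cursor) if cursor else 0
    page = DATA[start:start + page_size]
    next_start = start + page_size
    resp = {"results": page, "result_count": len(DATA)}
    # NSX omits the cursor on the last page
    if next_start < len(DATA):
        resp["cursor"] = str(next_start)
    return resp

def list_all():
    """Follow cursors until the backend stops returning one."""
    results, cursor = [], None
    while True:
        resp = fetch_page(cursor)
        results.extend(resp["results"])
        cursor = resp.get("cursor")
        if not cursor:
            break
    return results

all_items = list_all()
print(len(all_items))  # -> 7
```

If the backend mishandles the cursor (as in KB 89437), a loop like this silently stops early: later pages, and with them the VM's VIF, never arrive, which matches the "no ports found to update" symptom.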

annakhm avatar Oct 21 '22 20:10 annakhm

> Some background: based on the description of the issue, I suspected pagination might not be working properly in NSX on the vifs API, similar to the problem described here: https://github.com/vmware/terraform-provider-nsxt/blob/master/nsxt/resource_nsxt_policy_vm_tags.go#L67
>
> However, I did some tests, and pagination looks to be working as expected with smaller page sizes (unfortunately I don't have access to an environment that can accommodate 1K+ VMs).
>
> @xander-sh, if you could also run the vifs API mentioned above on the big environment and note the result_count and cursor attributes in the response, that would help troubleshoot this issue further. Thank you!

Hello, could this be the issue we are hitting?

https://kb.vmware.com/s/article/89437

xander-sh avatar Oct 22 '22 21:10 xander-sh

Thank you for the reference. That would certainly explain the behavior you're observing. Unlike the vms API issue, I don't think we can address this effectively in the provider. Upgrading your NSX should solve the problem.

annakhm avatar Oct 25 '22 00:10 annakhm

This issue is now documented: https://github.com/vmware/terraform-provider-nsxt/blob/master/website/docs/guides/faq.html.markdown#vm-tagging-and-port-tagging-is-not-working-on-big-environments. Closing this; please reopen if there are further concerns.

annakhm avatar Nov 02 '22 21:11 annakhm

As described in https://kb.vmware.com/s/article/89437, the problem was solved after upgrading NSX-T to 3.2.2.1.

xander-sh avatar Apr 19 '23 10:04 xander-sh