New module to create/update/delete/attach multiple disks

Open abikouo opened this issue 1 year ago • 7 comments

While testing how to speed up disk creation/attachment in order to tackle #847, I created a new module, azure_rm_managedmultipledisk, which will either be merged into azure_rm_multiple_manageddisks or exist as a separate module.
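
For orientation, here is a minimal sketch of the intended usage (parameter names follow the examples later in this thread; treat it as illustrative rather than final module documentation):

- name: Create two disks and attach them to a VM in a single task
  azure_rm_managedmultipledisk:
    managed_disks:
      - name: data-disk-01
        resource_group: my-rg
        disk_size_gb: 10
      - name: data-disk-02
        resource_group: my-rg
        disk_size_gb: 10
    managed_by_extended:
      - name: my-vm
        resource_group: my-rg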

ISSUE TYPE
  • New module Pull Request

abikouo avatar Aug 05 '22 14:08 abikouo

Hi @abikouo,

I've been testing the azure_rm_managedmultipledisk module and I'm running into an issue where the disks are created but aren't attached to the VMs. I'm not sure whether it's how I'm iterating through the module, but I'm hoping you may have a recommendation to get past this. I've included a couple of snippets from the Ansible task outputs and the module usage.

Module usage within role:

  - name: Create, tag, and attach additional volumes on Azure
    when: not mount_exists[index] | bool
    azure_rm_managedmultipledisk:
      managed_disks:
        - client_id: "{{ lookup('env','AZURE_CLIENT_ID') | default(omit, true) }}"
          secret: "{{ lookup('env','AZURE_SECRET') | default(omit, true) }}"
          subscription_id: "{{ lookup('env','AZURE_SUBSCRIPTION_ID') | default(omit, true) }}"
          tenant: "{{ lookup('env','AZURE_TENANT') | default(omit, true) }}"
          name: "{{ azure_vm_name }}_{{ item.key }}"
          location: "{{ azure_location }}"
          resource_group: "{{ azure_resource_group }}"
          disk_size_gb: "{{ item.value.size }}"
          storage_account_type: "{{ item.value.azure_disk_type | default(azure_disk_type) }}"
          attach_caching: "{{ 'read_write' if (item.value.size | int < 4095) else '' }}"
          zone: "{{ azure_disk_zone | default(omit, true) }}"
      managed_by_extended:
        - name: "{{ azure_vm_name }}"
          resource_group: "{{azure_resource_group}}"
      tags:
        Name:  "{{ azure_vm_name }}_{{ item.key }}"
        Hostname: "{{ ansible_hostname }}"
        MountPoint: "{{ item.value.mount_point }}"
        Instance: "{{ azure_vm_name }}"
    register: azure_volume
    delegate_to: "{{ task_delegation | default(omit, true) }}"
    async: 1000
    poll: 0
    loop: "{{ disk_preset_list }}"
    loop_control:
      index_var: index

  - name: Wait for Azure disk creation task finished
    when: item.skipped is not defined and item.finished == false
    async_status:
      jid: "{{ item.ansible_job_id }}"
    delegate_to: "{{ task_delegation | default(omit, true) }}"
    register: volume_create_status
    until: volume_create_status.finished
    retries: 100
    delay: 10
    loop: "{{ azure_volume.results }}"

Task outputs:

TASK [disk-management : Create, tag, and attach additional volumes on Azure] ****
changed: [172.16.0.5 -> localhost] => (item={'key': 'test_mnt1', 'value': {'mount_point': '/testmnt1', 'size': 50}})

TASK [disk-management : Wait for Azure disk creation task finished] ************
changed: [172.16.0.5 -> localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '765407197011.1156559', 'results_file': '/home/cloud-user/.ansible_async/765407197011.1156559', 'changed': True, 'item': {'key': 'test_mnt1', 'value': {'mount_point': '/testmnt1', 'size': 50}}, 'ansible_loop_var': 'item', 'index': 0, 'ansible_index_var': 'index'})

Corresponding results file:

{"changed": true, "state": [{"id": "/subscriptions/xxxxxxxxx/resourceGroups/test-deonte-azure-gov/providers/Microsoft.Compute/disks/test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91_test_mnt1", "name": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91_test_mnt1", "location": "usgovvirginia", "tags": {"Name": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91_test_mnt1", "Hostname": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91", "MountPoint": "/testmnt1", "Instance": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91"}, "create_option": "empty", "source_uri": null, "disk_size_gb": 50, "os_type": null, "storage_account_type": "Standard_LRS", "managed_by": null, "max_shares": null, "managed_by_extended": null, "zone": ""}], "invocation": {"module_args": {"managed_disks": [{"name": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91_test_mnt1", "location": "usgovvirginia", "resource_group": "test-deonte-azure-gov", "disk_size_gb": 50, "storage_account_type": "Standard_LRS", "attach_caching": "read_write", "create_option": null, "storage_account_id": null, "source_uri": null, "os_type": null, "zone": null, "lun": null, "max_shares": null}], "managed_by_extended": [{"name": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91", "resource_group": "test-deonte-azure-gov"}], "tags": {"Name": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91_test_mnt1", "Hostname": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91", "MountPoint": "/testmnt1", "Instance": "test-deonte-azure-gov-testmnt1-cebdf5dfee4d6c91"}, "auth_source": "auto", "cloud_environment": "AzureCloud", "api_profile": "latest", "append_tags": true, "state": "present", "profile": null, "subscription_id": null, "client_id": null, "secret": null, "tenant": null, "ad_user": null, "password": null, "cert_validation_mode": null, "adfs_authority_url": null, "log_mode": null, "log_path": null}}}

If you have any advice on my usage of the module I'd appreciate it.

Thanks, Deonte

dleens2 avatar Aug 29 '22 18:08 dleens2

Hi @dleens2

Thanks for the feedback! I don't know why it is not attaching the disk to the VM. I have done something similar that is working; feel free to review it and see where it differs. Here is the playbook:

- hosts: localhost
  gather_facts: no

  vars:
    azure_resource_group: "my-rg-01"

  tasks:
    - name: Manage multiple disks
      azure.azcollection.azure_rm_managedmultipledisk:
        managed_disks: "{{ item.disks }}"
        managed_by_extended:
          - "{{ item.virtual_machine }}"
      register: azure_disks
      async: 1000
      poll: 0
      with_items:
        - virtual_machine:
            name: "testvm-1"
            resource_group: "{{ azure_resource_group }}"
          disks:
            - disk_size_gb: 1
              name: "disk1-xa"
              resource_group: "{{ azure_resource_group }}"
            - disk_size_gb: 1
              name: "disk1-xb"
              resource_group: "{{ azure_resource_group }}"

        - virtual_machine:
            name: "testvm-2"
            resource_group: "{{ azure_resource_group }}"
          disks:
            - disk_size_gb: 1
              name: "disk2-xa"
              resource_group: "{{ azure_resource_group }}"
            - disk_size_gb: 1
              name: "disk2-xb"
              resource_group: "{{ azure_resource_group }}"

        - virtual_machine:
            name: "testvm-3"
            resource_group: "{{ azure_resource_group }}"
          disks:
            - disk_size_gb: 1
              name: "disk3-xa"
              resource_group: "{{ azure_resource_group }}"
            - disk_size_gb: 1
              name: "disk3-xb"
              resource_group: "{{ azure_resource_group }}"

    - name: Wait for disks to be created and attached
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: attach_disk
      until: attach_disk.finished
      retries: 100
      delay: 5
      loop: "{{ azure_disks.results }}"

    - name: Get disk info
      azure_rm_manageddisk_info:
        name: "{{ item }}"
        resource_group: "{{ azure_resource_group }}"
      register: disk_info
      with_items:
        - "disk1-xa"
        - "disk1-xb"
        - "disk2-xa"
        - "disk2-xb"
        - "disk3-xa"
        - "disk3-xb"

    - debug: var=disk_info

abikouo avatar Sep 01 '22 10:09 abikouo

Hi @abikouo,

Thanks for sending over the example code. I found that the issue was a disk/VM zone mismatch. After fixing that, I was able to get the disks to attach using both your example and our disk-management role. I'm still having an issue with disk attachments when more than 2 disks are attached per VM.

When attempting to attach 3 disks per VM with your example, none of the disks attach. With our disk-management role, two of the disks attach to each VM and the third does not. This behavior happens regardless of scale: only 3 VMs with your example versus 100 VMs with our role. Are you seeing the same behavior on your side?
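
For reference, pinning the disk zone to the VM's zone looks roughly like this (a sketch; azure_vm_zone is a placeholder for however the VM's availability zone is tracked in our inventory):

  - name: Create and attach a zoned disk to a zoned VM
    azure_rm_managedmultipledisk:
      managed_disks:
        - name: "{{ azure_vm_name }}_data01"           # illustrative disk name
          resource_group: "{{ azure_resource_group }}"
          disk_size_gb: 50
          zone: "{{ azure_vm_zone }}"                  # must match the target VM's zone
      managed_by_extended:
        - name: "{{ azure_vm_name }}"
          resource_group: "{{ azure_resource_group }}"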

Thanks, Deonte

dleens2 avatar Sep 07 '22 15:09 dleens2

@dleens2 I fixed the issue where some disks were created but not attached: the process was failing quietly, and the module will now raise an error instead. I also reproduced the case of disks being created but not attached; it happens when the VM's maximum data-disk capacity is reached (the maximum number of data disks allowed to be attached to a VM of size Standard_D4s_v3 is 8). You can rerun the playbook below and we will see the root cause for your use case.
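
As a quick guard against that limit, something like the following could be added before the attach task (a sketch; max_data_disks_per_vm is an illustrative variable, and managed_disk_config is the structure generated by the template in the playbook below):

    - name: Fail early if a VM would exceed its data-disk limit
      ansible.builtin.assert:
        that:
          - item.disks | length <= max_data_disks_per_vm | int
        fail_msg: >-
          {{ item.virtual_machine.name }} would get {{ item.disks | length }} data disks,
          but its size only allows {{ max_data_disks_per_vm }}
        quiet: true
      loop: "{{ managed_disk_config }}"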

I have also updated the azure_rm_manageddisk_info module for debugging purposes; it will allow validating, right after the async calls, whether the disks are attached to the virtual machines.

Playbook

- hosts: localhost
  gather_facts: no

  vars:
    azure_resource_group: "my-rg-01"
    number_disk: 10
    number_vm: 3

  tasks:
    - set_fact:
        managed_disk_config: "{{ lookup('template', 'disk_vm_config.j2') | from_yaml }}"
        managed_disks: []

    - set_fact:
        managed_disks: "{{ managed_disks + [{'name': item.name, 'resource_group': item.resource_group}] }}"
      with_items: "{{ managed_disk_config | map(attribute='disks') | flatten | list }}"
      no_log: true

    - name: Create non attached disk
      azure_rm_manageddisk:
        resource_group: "{{ azure_resource_group }}"
        name: lrs-unmanaged-disk-01
        storage_account_type: Standard_LRS
        os_type: linux
        disk_size_gb: 1

    - name: Manage multiple disks
      azure.azcollection.azure_rm_managedmultipledisk:
        managed_disks: "{{ item.disks }}"
        managed_by_extended:
          - "{{ item.virtual_machine }}"
        state: absent
      register: azure_disks
      async: 1000
      poll: 0
      with_items: "{{ managed_disk_config }}"

    - name: Wait for disks to be created and attached
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: attach_disk
      until: attach_disk.finished
      retries: 100
      delay: 5
      loop: "{{ azure_disks.results }}"

    - name: Get disk info
      azure_rm_manageddisk_info:
        managed_disks: "{{ managed_disks }}"
      register: disk_info

    - debug:
        msg: >-
          total_disks [{{ managed_disks | default([]) | length }}]
          unmanaged_disk [{{ disk_info.ansible_info.azure_managed_disk | selectattr('managed_by', 'equalto', none) | list | length }}]
          managed_disk [{{ disk_info.ansible_info.azure_managed_disk | rejectattr('managed_by', 'equalto', none) | list | length }}]

disk_vm_config.j2

{% for i in range(number_vm) %}
- disks:
{% for d in range(number_disk) %}
    - disk_size_gb: 1
      name: "stddisk-{{ i + 1 }}-{{ d }}"
      resource_group: "{{ azure_resource_group }}"
{% endfor %}
  virtual_machine:
    name: "testvm-{{ i + 1 }}"
    resource_group: "{{ azure_resource_group }}"
{% endfor %}

Please update the module version, try to attach the disks again using your role or the playbook above, and keep me informed. Thanks

abikouo avatar Sep 09 '22 09:09 abikouo

Hi @abikouo,

Thank you for adding this debugging feature to the module. I updated the modules and ran them again with our disk-management role, and found that the issue I was running into was indeed the maximum number of disks per VM size, as you suggested. After changing the sizes of my test VMs, the azure_rm_managedmultipledisk module worked well: disks were created and attached quickly.

Thanks

dleens2 avatar Sep 12 '22 10:09 dleens2

I think this is not a complete solution to #847 . To my reading, this PR offers:

image

But #847 is asking for:

image

I posted in the Issue thread for clarification.

ohthehugemanatee avatar Sep 29 '22 13:09 ohthehugemanatee

@ohthehugemanatee your understanding of this PR is not correct. The disk is not attached as soon as it is created; we create all the disks first and then attach them to the VM in a single operation. Each time you attach a disk to a VM you have to update the VM properties, so to avoid concurrency issues and optimize the process, we attach them all at once.

abikouo avatar Oct 04 '22 10:10 abikouo

@abikouo creation is async but attach still works in serial? image

xuzhang3 avatar Oct 21 '22 03:10 xuzhang3

@xuzhang3 the disk attachment is performed once per VM. I just added an update so that, when it is done for several VMs, it is done in parallel.
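
In playbook terms, attaching to several VMs in a single task would look roughly like the sketch below (illustrative names; a disk shared between VMs needs max_shares > 1 and, generally, premium storage):

- name: Create one shareable disk and attach it to several VMs in one task
  azure.azcollection.azure_rm_managedmultipledisk:
    managed_disks:
      - name: shared-disk-01
        resource_group: test-rg
        disk_size_gb: 256
        storage_account_type: Premium_LRS   # shared disks generally require premium storage
        max_shares: 3                        # allows the disk to attach to more than one VM
    managed_by_extended:                     # each listed VM is updated in parallel
      - name: vm-01
        resource_group: test-rg
      - name: vm-02
        resource_group: test-rg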

abikouo avatar Oct 21 '22 15:10 abikouo

@abikouo It would be nice to change the name to "azure_rm_multiplemanageddisks" and to add "azure_rm_multiplemanageddisks" to pr-pipelines.yml. Thank you very much!

https://github.com/abikouo/azure/blob/525b930535bb6e5d282274d0eeb7c99d118965e3/pr-pipelines.yml#L72

Fred-sun avatar Oct 26 '22 01:10 Fred-sun

@Fred-sun ok I will do it

abikouo avatar Oct 26 '22 06:10 abikouo

Hi @abikouo,

I tested the new changes to the module and had some errors when running it. I ran it against 50 VMs, attempting to attach 3 disks per VM. On about 1/3 of the VMs included in the playbook, I received the following error during the disk attachment phase: Error updating virtual machine (attaching/detaching disks) <vm-id> - (PropertyChangeNotAllowed) Changing property 'dataDisk.managedDisk.id' is not allowed.\nCode: PropertyChangeNotAllowed\nMessage: Changing property 'dataDisk.managedDisk.id' is not allowed.\nTarget: dataDisk.managedDisk.id

If I rerun the playbook then the disks that failed to attach in the previous run attach successfully. I think we may be running into another API collision issue? Let me know what you think.

Thanks

dleens2 avatar Oct 27 '22 15:10 dleens2

Hi @dleens2

Thanks for the feedback. The issue you are facing is indeed an API collision. According to https://github.com/Azure/azure-sdk-for-java/issues/2925#issuecomment-469432572, the operation "imageVM.update().withExistingDataDisk(disk)..." is a PATCH operation and cannot be called concurrently on a VM; this is basically the same operation we perform when attaching a disk to a VM.

I managed to reproduce the issue using the following playbook

- name: Create and Attach disks to virtual machine
  azure.azcollection.azure_rm_multiplemanageddisks:
    managed_disks:
      - "{{ item }}"
    managed_by_extended:
      - name: vm-01
        resource_group: test-rg
  register: azure_disks
  async: 1000
  poll: 0
  with_items:
  - disk_size_gb: 1
    name: disk1
    resource_group: test-rg
  - disk_size_gb: 1
    name: disk2
    resource_group: test-rg
  - disk_size_gb: 1
    name: disk3
    resource_group: test-rg

I don't know exactly how you are running your attachment, but you should have one task per virtual machine to avoid API collisions. With the playbook above, the managed disks are created in parallel, but when trying to attach them to the virtual machine we run into collisions. The new version of the module already creates the managed disks in parallel, so the playbook above should instead be written as:

- name: Create and Attach disks to virtual machine
  azure.azcollection.azure_rm_multiplemanageddisks:
    managed_disks:
      - disk_size_gb: 1
        name: disk1
        resource_group: test-rg
      - disk_size_gb: 1
        name: disk2
        resource_group: test-rg
      - disk_size_gb: 1
        name: disk3
        resource_group: test-rg
    managed_by_extended:
      - name: vm-01
        resource_group: test-rg
  register: azure_disks
  async: 1000
  poll: 0

abikouo avatar Oct 28 '22 13:10 abikouo

@Fred-sun the module has been renamed and pr-pipelines.yml and meta/runtime.yml files updated

abikouo avatar Oct 28 '22 13:10 abikouo

Hi @abikouo,

I retried the disk creation and attachment without using a loop as you suggested and it worked without causing any collisions. I also tested for idempotence in subsequent runs and it worked as expected.

Thanks

dleens2 avatar Oct 31 '22 18:10 dleens2

@dleens2 thanks for the update.

abikouo avatar Nov 07 '22 11:11 abikouo

@Fred-sun the Pull request is ready for review

abikouo avatar Nov 07 '22 11:11 abikouo

@abikouo Thank you for your update; we will complete the test and move forward with the merge as soon as possible! Thanks!

Fred-sun avatar Nov 09 '22 06:11 Fred-sun

@abikouo test failed:

fatal: [testhost]: FAILED! => {
    "changed": false,
    "errors": [
        "managed disk ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29/disk-copy-without-source-uri has create_option set to copy but not all required parameters (source_uri) are set.",
        "managed disk ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29/disk-import-without-storage-account has create_option set to import but not all required parameters (source_uri,storage_account_id) are set.",
        "managed disk ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29/disk-empty-without-disk-size has create_option set to empty but not all required parameters (disk_size_gb) are set."
    ],
    "invocation": {
        "module_args": {
            "ad_user": null,
            "adfs_authority_url": null,
            "api_profile": "latest",
            "append_tags": true,
            "auth_source": "auto",
            "cert_validation_mode": null,
            "client_id": null,
            "cloud_environment": "AzureCloud",
            "log_mode": null,
            "log_path": null,
            "managed_by_extended": null,
            "managed_disks": [
                {
                    "attach_caching": null,
                    "create_option": "copy",
                    "disk_size_gb": null,
                    "location": null,
                    "lun": null,
                    "max_shares": 3,
                    "name": "disk-copy-without-source-uri",
                    "os_type": null,
                    "resource_group": "ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29",
                    "source_uri": null,
                    "storage_account_id": null,
                    "storage_account_type": null,
                    "zone": null
                },
                {
                    "attach_caching": null,
                    "create_option": "import",
                    "disk_size_gb": null,
                    "location": null,
                    "lun": null,
                    "max_shares": 3,
                    "name": "disk-import-without-storage-account",
                    "os_type": null,
                    "resource_group": "ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29",
                    "source_uri": null,
                    "storage_account_id": null,
                    "storage_account_type": null,
                    "zone": null
                },
                {
                    "attach_caching": null,
                    "create_option": "empty",
                    "disk_size_gb": null,
                    "location": null,
                    "lun": null,
                    "max_shares": 3,
                    "name": "disk-empty-without-disk-size",
                    "os_type": null,
                    "resource_group": "ansibletest-3bd0fb87-fcf6-4fcf-97e3-867e4f8ecd29",
                    "source_uri": null,
                    "storage_account_id": null,
                    "storage_account_type": null,
                    "zone": null
                }
            ],
            "password": null,
            "profile": null,
            "secret": null,
            "state": "present",
            "subscription_id": null,
            "tags": null,
            "tenant": null,
            "thumbprint": null,
            "x509_certificate_path": null
        }
    },
    "msg": "Some required options are missing from managed disks configuration."
}



xuzhang3 avatar Nov 25 '22 09:11 xuzhang3

disk-copy-without-source-uri

@xuzhang3 how are you running the tests? This task is failing as expected; there is an ignore_errors: true option to catch this error and validate it afterwards. Please provide me with the full test command. Thanks
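
For context, a negative test of that shape typically looks like the sketch below (task names and variables here are illustrative, not the collection's actual test code):

- name: Try to create a disk with a missing required parameter (expected to fail)
  azure.azcollection.azure_rm_multiplemanageddisks:
    managed_disks:
      - name: disk-copy-without-source-uri
        resource_group: "{{ resource_group }}"
        create_option: copy
  register: invalid_disks
  ignore_errors: true

- name: Validate that the module reported the missing parameter
  ansible.builtin.assert:
    that:
      - invalid_disks is failed
      - invalid_disks.errors | length > 0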

abikouo avatar Nov 29 '22 13:11 abikouo

@abikouo Rerun the tests, all passed. LGTM

xuzhang3 avatar Nov 30 '22 03:11 xuzhang3