google.cloud
[Bug] Compute instance always reports as changed
SUMMARY
Creating a compute instance reports back as changed even if it was already created in a previous run.
This results in non-idempotent behaviour, which is usually not expected of Ansible modules unless the documentation says otherwise.
This was tested against master and version 1.0.1 of this collection.
The issue is related to #257 but that one was closed by the author without solving the root issue. Pinging @Rylon who was also involved in the previous issue.
ISSUE TYPE
- Bug Report
COMPONENT NAME
gcp_compute_instance.py
ANSIBLE VERSION
ansible 2.10.2
config file = /home/user/git-repo/policy/ansible.cfg
configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/user/git-repo/policy/venv/lib/python3.7/site-packages/ansible
executable location = /home/user/git-repo/policy/venv/bin/ansible
python version = 3.7.8 (default, Jun 29 2020, 05:44:46) [GCC 7.5.0]
CONFIGURATION
DEFAULT_HOST_LIST(/home/user/git-repo/policy/ansible.cfg) = ['/home/user/git-repo/policy/inventories']
DEFAULT_REMOTE_USER(/home/user/git-repo/policy/ansible.cfg) = ans
DEFAULT_ROLES_PATH(/home/user/git-repo/policy/ansible.cfg) = ['/home/user/.ansible/roles', '/usr/share/ansible/roles', '/etc/ansible/roles']
DEFAULT_VAULT_IDENTITY_LIST(/home/user/git-repo/policy/ansible.cfg) = ['[email protected]_pass.production', '[email protected]_pass.testing', '[email protected]_pass.development']
INTERPRETER_PYTHON(/home/user/git-repo/policy/ansible.cfg) = auto
INVENTORY_ENABLED(/home/user/git-repo/policy/ansible.cfg) = ['host_list', 'script', 'auto', 'yaml', 'ini', 'toml', 'gcp_compute']
OS / ENVIRONMENT
Running on Ubuntu 18.04.
STEPS TO REPRODUCE
Taken from the Ansible documentation with minor modifications.
#!ansible-playbook
---
- name: Create an instance
  hosts: localhost
  gather_facts: no
  vars:
    gcp_project: your-project
    gcp_cred_kind: application
    zone: "us-central1-a"
    region: "us-central1"
  tasks:
    - name: create a disk
      gcp_compute_disk:
        name: 'disk-instance'
        size_gb: 20
        source_image: 'projects/ubuntu-os-cloud/global/images/family/ubuntu-1604-lts'
        zone: "{{ zone }}"
        project: "{{ gcp_project }}"
        auth_kind: "{{ gcp_cred_kind }}"
        scopes:
          - https://www.googleapis.com/auth/compute
        state: present
      register: disk
    - name: create a address
      gcp_compute_address:
        name: 'address-instance'
        region: "{{ region }}"
        project: "{{ gcp_project }}"
        auth_kind: "{{ gcp_cred_kind }}"
        scopes:
          - https://www.googleapis.com/auth/compute
        state: present
      register: address
    - name: create a instance
      gcp_compute_instance:
        state: present
        name: test-vm
        machine_type: n1-standard-1
        disks:
          - auto_delete: true
            boot: true
            source: "{{ disk }}"
        network_interfaces:
          - network: null # use default
            access_configs:
              - name: 'External NAT'
                nat_ip: "{{ address }}"
                type: 'ONE_TO_ONE_NAT'
        zone: "{{ zone }}"
        project: "{{ gcp_project }}"
        auth_kind: "{{ gcp_cred_kind }}"
        scopes:
          - https://www.googleapis.com/auth/compute
      register: instance
    - name: Wait for SSH to come up
      wait_for: host={{ address.address }} port=22 delay=10 timeout=60
EXPECTED RESULTS
In the first run all create tasks should be listed as changed in the log, but in a subsequent run these tasks should all report ok and not changed.
This is the case for the modules gcp_compute_address and gcp_compute_disk but not for gcp_compute_instance.
Below are two hypothetical runs showing the expected behaviour.
First run:
./gcp-issue.yml
PLAY [Create an instance] *****************************************************************************************
TASK [create a disk] **********************************************************************************************
changed: [localhost]
TASK [create a address] *******************************************************************************************
changed: [localhost]
TASK [create a instance] ******************************************************************************************
changed: [localhost]
TASK [Wait for SSH to come up] ************************************************************************************
ok: [localhost]
PLAY RECAP ********************************************************************************************************
localhost : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Second run (right after without changing anything):
./gcp-issue.yml
PLAY [Create an instance] *****************************************************************************************
TASK [create a disk] **********************************************************************************************
ok: [localhost]
TASK [create a address] *******************************************************************************************
ok: [localhost]
TASK [create a instance] ******************************************************************************************
ok: [localhost]
TASK [Wait for SSH to come up] ************************************************************************************
ok: [localhost]
PLAY RECAP ********************************************************************************************************
localhost : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
NOTE: The difference between the two runs should be that the task create instance is not marked as changed.
ACTUAL RESULTS
What happens instead is that the task create instance is always marked as changed, even if the VM was just created in the previous run and no modifications were performed.
See the log output with increased verbosity as a gist here.
Further digging into gcp_compute_instance.py reveals that the is_different() method does not work properly. More precisely, it correctly reports that the request and the response differ syntactically, even though they describe the same state.
For example, the module uses the address prefix https://compute.googleapis.com/ for the machine type, whereas the response uses https://www.googleapis.com/. The requested and response values contained in their respective dictionaries (request_vals, response_vals) are listed below.
Requested values:
{
    'disks': [{
        'autoDelete': True,
        'boot': True,
        'source': 'https://www.googleapis.com/compute/v1/projects/your-project/zones/us-central1-a/disks/disk-instance'
    }],
    'machineType': 'https://compute.googleapis.com/compute/v1/projects/your-project/zones/us-central1-a/machineTypes/n1-standard-1',
    'name': 'test-vm',
    'networkInterfaces': [{
        'accessConfigs': [{
            'name': 'External NAT',
            'natIP': '34.123.210.249',
            'type': 'ONE_TO_ONE_NAT'
        }]
    }]
}
Response values:
{
    'disks': [{
        'autoDelete': True,
        'boot': True,
        'source': 'https://www.googleapis.com/compute/v1/projects/your-project/zones/us-central1-a/disks/disk-instance'
    }],
    'machineType': 'https://www.googleapis.com/compute/v1/projects/your-project/zones/us-central1-a/machineTypes/n1-standard-1',
    'name': 'test-vm',
    'networkInterfaces': [{
        'accessConfigs': [{
            'name': 'External NAT',
            'natIP': '34.123.210.249',
            'type': 'ONE_TO_ONE_NAT',
            'networkTier': 'PREMIUM'
        }],
        'network': 'https://www.googleapis.com/compute/v1/projects/your-project/global/networks/default',
        'networkIP': '10.128.0.34',
        'subnetwork': 'https://www.googleapis.com/compute/v1/projects/your-project/regions/us-central1/subnetworks/default'
    }]
}
From here onwards I am not sure how to proceed: one could update the module to ignore certain values, or introduce equivalence mappings between certain values (e.g. for the address prefixes). I would appreciate some pointers in the right direction, or a statement on whether this behaviour is expected, as you are probably seeing similar behaviour internally as well.
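To make the two options above concrete, here is a minimal Python sketch of such a comparison. It is not the module's actual is_different() implementation; the names EQUIVALENT_HOSTS, normalize and is_subset are hypothetical, and treating the two hosts as interchangeable is an assumption based only on the values listed above. It ignores keys that appear only in the response and compares Compute API URLs by their path.

```python
from urllib.parse import urlparse

# Assumption: both hosts address the same Compute Engine resources.
EQUIVALENT_HOSTS = {"www.googleapis.com", "compute.googleapis.com"}


def normalize(value):
    """Reduce a Compute API URL to its path; return other values unchanged."""
    if isinstance(value, str):
        parsed = urlparse(value)
        if parsed.scheme == "https" and parsed.netloc in EQUIVALENT_HOSTS:
            return parsed.path
    return value


def is_subset(request, response):
    """Return True if everything set in the request is matched by the response.

    Keys that exist only in the response (server-side defaults such as
    'networkTier' or 'networkIP') are ignored.
    """
    if isinstance(request, dict):
        return isinstance(response, dict) and all(
            key in response and is_subset(val, response[key])
            for key, val in request.items()
        )
    if isinstance(request, list):
        return (
            isinstance(response, list)
            and len(request) == len(response)
            and all(is_subset(req, res) for req, res in zip(request, response))
        )
    return normalize(request) == normalize(response)
```

With the request_vals and response_vals listed above, is_subset(request_vals, response_vals) would return True, so the task would report ok. One limitation of this sketch is that lists are compared position by position, so a reordered list would still count as a difference.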
Thanks!
Every elephant as he grows, Learns to keep on his toes In his element as he goes, Bump Bump Bumpety Bump... BUMP
Stolen from @neilmartin83
CC @rambleraptor
2 years later this issue is still present ☹️
Hello! Would you mind trying this with the 1.1.0-beta0 release?
There was previously a perma-diff in compute instance and several other resources; it should have been fixed with https://github.com/ansible-collections/google.cloud/commit/0fc41bbda4f16fe73edffb08e51d9435262c7b47.
There's an integration test that passes for this as well.
I'll close this for now as I can't reproduce, and there's a passing integration test to validate this.
Taking a look at the example, the diff is coming from the domain difference (www.googleapis.com vs compute.googleapis.com), which was precisely the bug fixed in the hash above.
Feel free to ping me if you do have a repro, even with the latest code.
I have found two cases where this behavior appears:
- Imagine the following task sequence:
  - name: Create VM
    gcp_compute_instance:
      ...
      machine_type: big-machine-type
  - name: Downsize VM to save money
    shell: |
      gcloud <shutdown VM>
      gcloud <update-machine-type>
      gcloud <start VM>
If you run the above twice, the first task returns as changed on the second run, probably due to the mismatch in the machine type. This is not expected, because gcp_compute_instance does not update the machine type when the real one differs from the one given in the module invocation. I don't know which of the two is the bug: the fact that it returns as changed, or the fact that machine_type does not get updated.
- If the network is specified like this:
  - gcp_compute_instance:
      ...
      network_interfaces:
        - network:
            selfLink: global/networks/my-network
      ....
Then the task will always return as changed. The configuration above otherwise works fine. To stop it from returning as changed, the selfLink needs to be given as the full URL:
selfLink: https://www.googleapis.com/compute/v1/projects/en2720-2017/global/networks/my-network
The above cases are not covered by the tests.
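For the second case, one possible approach is to expand a partial selfLink into its full URL before comparing request and response. The following is a minimal Python sketch under that assumption; API_ROOT and expand_self_link are hypothetical names and not part of the collection, and the URL layout is taken from the examples in this thread.

```python
# Assumption: the API root matches the URLs shown in the responses above.
API_ROOT = "https://www.googleapis.com/compute/v1"


def expand_self_link(link, project):
    """Expand a partial resource path into a full selfLink URL.

    Full URLs are returned unchanged. Paths that already start with
    'projects/' only get the API root prepended.
    """
    if link.startswith("https://"):
        return link
    link = link.lstrip("/")
    if link.startswith("projects/"):
        return f"{API_ROOT}/{link}"
    return f"{API_ROOT}/projects/{project}/{link}"


print(expand_self_link("global/networks/my-network", "en2720-2017"))
```

Run against the partial value from the example, this prints the full URL form given above, so both spellings would compare as equal after expansion.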