Provisioning freezes during ansible playbook launch
Provisioning freezes during ansible playbook (method) launch. It doesn't even launch/run the ansible-playbook command.
Any new provisioning requests also gets frozen. This issue exists on Morphy and Lasker.


[----] I, [2022-07-03T13:34:47.582215 #1382:2b07b5895958] INFO -- automation: Q-task_id([r725_miq_provision_1900]) Invoking [playbook] method [/ManageIQ_Custom/Cloud/VM/Provisioning/StateMachines/Methods/testing_vm] with inputs [{"vm_id"=>"nil"}]
[----] I, [2022-07-03T13:34:48.736359 #1382:2b07b5895958] INFO -- automation: Q-task_id([r725_miq_provision_1900]) Playbook Method passing options:
[----] I, [2022-07-03T13:34:48.736915 #1382:2b07b5895958] INFO -- automation: Q-task_id([r725_miq_provision_1900])
---
:become_enabled: false
:execution_ttl: '3'
:extra_vars:
vm_id: nil
:manageiq:
automate_workspace: automate_workspaces/19d36f9e-4722-481c-ac2d-f5a12478cfad
api_url: https://172.17.0.2
api_token: f5992ebdecd767efd2fcae46e753edb7
user: users/1
group: groups/2
X_MIQ_Group: EvmGroup-super_administrator
request_task: requests/725/request_tasks/1900
request: requests/725
:manageiq_connection:
url: https://172.17.0.2
token: f5992ebdecd767efd2fcae46e753edb7
X_MIQ_Group: EvmGroup-super_administrator
:hosts:
- 10.10.10.10
:verbosity: '3'
:credential: 96
[----] I, [2022-07-03T13:34:49.434445 #1382:2b07b5895958] INFO -- automation: Q-task_id([r725_miq_provision_1900]) Ansible job with ref 42342 was created.
[----] I, [2022-07-03T13:34:50.512541 #297:2b1baf6ad95c] INFO -- automation: User [admin] with current group ID [1] name [Tenant My Company/1 access]
[----] I, [2022-07-03T13:34:50.515707 #297:2b1baf6ad95c] INFO -- automation: Instantiating [/System/Process/Event?EventStream%3A%3Aevent_stream=19884&MiqServer%3A%3Amiq_server=4&User%3A%3Auser=1&event_id=19884&event_stream_id=19884&event_type=AWS_EC2_Subnet_UPDATE&object_name=Event&vmdb_object_type=event_stream]
[----] I, [2022-07-03T13:34:50.528811 #297:2b1baf6ad95c] INFO -- automation: Updated namespace [/System/Process/Event?EventStream%3A%3Aevent_stream=19884&MiqServer%3A%3Amiq_server=4&User%3A%3Auser=1&event_id=19884&event_stream_id=19884&event_type=AWS_EC2_Subnet_UPDATE&object_name=Event&vmdb_object_type=event_stream ManageIQ/System]
[----] I, [2022-07-03T13:34:50.549414 #297:2b1baf6ad95c] INFO -- automation: Following Relationship [miqaedb:/System/Event/EmsEvent/AMAZON/AWS_EC2_Subnet_UPDATE#create]
[----] I, [2022-07-03T13:34:50.554425 #297:2b1baf6ad95c] INFO -- automation: Updated namespace [miqaedb:/System/Event/EmsEvent/AMAZON/AWS_EC2_Subnet_UPDATE#create ManageIQ/System/Event/EmsEvent]
[----] I, [2022-07-03T13:34:50.563418 #295:2ad159e6f974] INFO -- automation: User [admin] with current group ID [1] name [Tenant My Company/1 access]
[----] I, [2022-07-03T13:34:50.564429 #297:2b1baf6ad95c] INFO -- automation: Following Relationship [miqaedb:/System/event_handlers/refresh#create]
[----] I, [2022-07-03T13:34:50.565566 #295:2ad159e6f974] INFO -- automation: Instantiating [/System/Process/Event?EventStream%3A%3Aevent_stream=19885&MiqServer%3A%3Amiq_server=4&User%3A%3Auser=1&event_id=19885&event_stream_id=19885&event_type=AWS_EC2_NetworkInterface_CREATE&object_name=Event&vmdb_object_type=event_stream]
[----] I, [2022-07-03T13:34:50.569891 #297:2b1baf6ad95c] INFO -- automation: Updated namespace [miqaedb:/System/event_handlers/refresh#create ManageIQ/System]
[----] I, [2022-07-03T13:34:50.575273 #295:2ad159e6f974] INFO -- automation: Updated namespace [/System/Process/Event?EventStream%3A%3Aevent_stream=19885&MiqServer%3A%3Amiq_server=4&User%3A%3Auser=1&event_id=19885&event_stream_id=19885&event_type=AWS_EC2_NetworkInterface_CREATE&object_name=Event&vmdb_object_type=event_stream ManageIQ/System]
Here is another example of provisioning freezing. It works fine when its just a single provisioning but when multiple VM counts are set, it freezes after the playbook options.
[----] I, [2022-07-12T03:46:20.248847 #307:2ad5997a5974] INFO -- automation: Q-task_id([r821_miq_provision_2087]) Playbook Method passing options:
[----] I, [2022-07-12T03:46:20.249442 #307:2ad5997a5974] INFO -- automation: Q-task_id([r821_miq_provision_2087])
---
:become_enabled: false
:execution_ttl: '3'
:extra_vars:
vm_id: i-03bd18e4d331734b8
:manageiq:
automate_workspace: automate_workspaces/c866a302-ff83-420e-907f-371273f157c3
api_url: https://172.17.0.2
api_token: 88ad01ad237a6d0ac4c27f32098ea2fb
user: users/1
group: groups/2
X_MIQ_Group: EvmGroup-super_administrator
request_task: requests/821/request_tasks/2087
request: requests/821
:manageiq_connection:
url: https://172.17.0.2
token: 88ad01ad237a6d0ac4c27f32098ea2fb
X_MIQ_Group: EvmGroup-super_administrator
:hosts:
- localhost
:verbosity: '3'
:credential: 96
[----] I, [2022-07-12T03:46:20.252724 #309:2abe5625b964] INFO -- automation: Q-task_id([r821_miq_provision_2088]) Updated namespace [Cloud/VM/Provisioning/StateMachines/Methods/testing_vm ManageIQ_Custom/Cloud/VM/Provisioning/StateMachines]
[----] I, [2022-07-12T03:46:20.261804 #309:2abe5625b964] INFO -- automation: Q-task_id([r821_miq_provision_2088]) Invoking [playbook] method [/ManageIQ_Custom/Cloud/VM/Provisioning/StateMachines/Methods/testing_vm] with inputs [{"vm_id"=>"i-092a45b3bddd9a0f2"}]
[----] I, [2022-07-12T03:46:20.525966 #309:2abe5625b964] INFO -- automation: Q-task_id([r821_miq_provision_2088]) Playbook Method passing options:
[----] I, [2022-07-12T03:46:20.526465 #309:2abe5625b964] INFO -- automation: Q-task_id([r821_miq_provision_2088])
---
:become_enabled: false
:execution_ttl: '3'
:extra_vars:
vm_id: i-092a45b3bddd9a0f2
:manageiq:
automate_workspace: automate_workspaces/546e38fc-bd83-41eb-8c8c-a701965eae20
api_url: https://172.17.0.2
api_token: c97d29074a3baac85a7aa2000ab84c8a
user: users/1
group: groups/2
X_MIQ_Group: EvmGroup-super_administrator
request_task: requests/821/request_tasks/2088
request: requests/821
:manageiq_connection:
url: https://172.17.0.2
token: c97d29074a3baac85a7aa2000ab84c8a
X_MIQ_Group: EvmGroup-super_administrator
:hosts:
- localhost
:verbosity: '3'
:credential: 96
[----] I, [2022-07-12T03:46:20.581454 #307:2ad5997a5974] INFO -- automation: Q-task_id([r821_miq_provision_2087]) Ansible job with ref 44048 was created.
[----] I, [2022-07-12T03:46:21.172005 #309:2abe5625b964] INFO -- automation: Q-task_id([r821_miq_provision_2088]) Ansible job with ref 44049 was created.
If I restart the Docker instance, the rest of the provisioning continues.
@kbrock Can you take a look at this one?
@Fryguy @kbrock This issue still persist and it happens when you provisioning more than 1 VM. Playbook executes and provisioning continues when provisioning only 1 VM but we try 2 or 3 VM then the provisioning gets stuck.
I have to restart docker container in order for other requests to continue.
This is still an issue. Sometimes ansible job launches and sometimes it doesn't and other jobs ends up as either stuck in active state or ends up pending even if approved.

irb(main):001:0> pp ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.find_by(:ems_ref => "66694")
PostgreSQLAdapter#log_after_checkout, connection_pool: size: 5, connections: 1, in use: 1, waiting_in_queue: 0
#<ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job:0x000055fe967719e8
id: 559,
name: "azure_sg_create.yaml",
type: "ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job",
description: nil,
status: "Pre_execute",
ems_ref: "66694",
ancestry: nil,
ems_id: 1,
orchestration_template_id: nil,
created_at: Mon, 08 Aug 2022 15:03:37 UTC +00:00,
updated_at: Mon, 08 Aug 2022 15:03:37 UTC +00:00,
retired: nil,
retires_on: nil,
retirement_warn: nil,
retirement_last_warn: nil,
retirement_state: nil,
retirement_requester: nil,
status_reason: nil,
cloud_tenant_id: nil,
resource_group: nil,
start_time: nil,
finish_time: nil,
configuration_script_base_id: 126,
verbosity: 3,
hosts: nil,
evm_owner_id: nil,
miq_group_id: nil,
tenant_id: nil>
=> #<ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job id: 559, name: "azure_sg_create.yaml", type: "ManageIQ::Providers::EmbeddedAnsible::AutomationMa...", description: nil, status: "Pre_execute", ems_ref: "66694", ancestry: nil, ems_id: 1, orchestration_template_id: nil, created_at: "2022-08-08 15:03:37", updated_at: "2022-08-08 15:03:37", retired: nil, retires_on: nil, retirement_warn: nil, retirement_last_warn: nil, retirement_state: nil, retirement_requester: nil, status_reason: nil, cloud_tenant_id: nil, resource_group: nil, start_time: nil, finish_time: nil, configuration_script_base_id: 126, verbosity: 3, hosts: nil, evm_owner_id: nil, miq_group_id: nil, tenant_id: nil>
@kTipSSIoYv what do you see for the playbook stdout?
pp ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.find_by(:ems_ref => "66694").raw_stdout_json
@agrare Is there a typo in the command? I see a syntax error.
SyntaxError ((irb):15: syntax error, unexpected local variable or method, expecting `end')
..._ref => "66694")raw_stdout_json
... ^~~~~~~~~~~~~~~
Updated, you need the .
irb(main):016:0> pp ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.find_by(:ems_ref => "66694").raw_stdout_json
Traceback (most recent call last):
15: from bin/rails:4:in `<main>'
14: from bootsnap (1.11.1) lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
13: from bootsnap (1.11.1) lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
12: from railties (6.0.5.1) lib/rails/commands.rb:18:in `<main>'
11: from railties (6.0.5.1) lib/rails/command.rb:46:in `invoke'
10: from railties (6.0.5.1) lib/rails/command/base.rb:69:in `perform'
9: from thor (1.2.1) lib/thor.rb:392:in `dispatch'
8: from thor (1.2.1) lib/thor/invocation.rb:127:in `invoke_command'
7: from thor (1.2.1) lib/thor/command.rb:27:in `run'
6: from railties (6.0.5.1) lib/rails/commands/console/console_command.rb:102:in `perform'
5: from railties (6.0.5.1) lib/rails/commands/console/console_command.rb:19:in `start'
4: from railties (6.0.5.1) lib/rails/commands/console/console_command.rb:70:in `start'
3: from (irb):15
2: from (irb):16:in `rescue in irb_binding'
1: from activemodel (6.0.5.1) lib/active_model/attribute_methods.rb:429:in `method_missing'
NoMethodError (private method `raw_stdout_json' called for #<ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job:0x000055fe95d4a050>)
Did you mean? raw_stdout
irb(main):017:0> pp ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.find_by(:ems_ref => "66694").raw_stdout
""
=> ""
Okay someone who actually knows embedded Ansible is going to have to help you sorry @kTipSSIoYv
@kbrock @agrare Now the other jobs are running.

What did the stdout show? What changed to make it work?
@kTipSSIoYv do you mean they're working, or the older ones timed out and now the latest is hung
@jrafanie @agrare after running the pp ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.find_by(:ems_ref => "66694").raw_stdout command other stuck jobs started to work. It was stuck until then. Note: Playbook fails right now without the variables and that's why the status is error.
@kTipSSIoYv is that reproducible?
@agrare Yea its reproducible. Usually when you launch multiple count of servers using a single catalog it gets stuck during ansible playbook launch.
@Fryguy In the latest version of Narjdorf Ansible job isn't launching at all. I see the following log entry but ansible runner isn't launching at all.
[----] I, [2022-09-05T16:20:05.401602 #324:8cdc] INFO -- automation: Q-task_id([r1169_miq_provision_2912]) Ansible job with ref 92663 was created.
This issue has been automatically marked as stale because it has not been updated for at least 3 months.
If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.
This issue has been automatically marked as stale because it has not been updated for at least 3 months.
If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.
This issue has been automatically marked as stale because it has not been updated for at least 3 months.
If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.
This issue has been automatically marked as stale because it has not been updated for at least 3 months.
If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.