ansible.windows
ansible.windows copied to clipboard
KB5004244 appears to have broken win_reboot on Server 2019
SUMMARY
I first noticed this issue with the Out of bound patch Microsoft released last week for the Print Nightmare vulnerability, but it also appears to be happening with the update released yesterday for Server 2019 which replaced it, KB5004244. Once this update is installed the win_reboot module will successfully trigger a reboot but then never detect the system has come back up again. I've verified WinRM is running and reachable but Ansible doesn't detect it automatically. Oddly enough if I manually reboot the machine a second time then once it comes back up Ansible detects this and reconnects, but the next time the system is rebooted by the win_reboot module it fails to detect the server coming back up. It just waits for my reboot_timeout to expire and then fails the task.
I'm seeing this issue when building new machines and installing Windows Updates. If I explicitly exclude KB5004244 however then everything works fine and every win_reboot call works as expected.
ISSUE TYPE
- Bug Report
COMPONENT NAME
ansible.windows.win_reboot
ANSIBLE VERSION
ansible [core 2.11.2]
config file = /build/templates/ansible.cfg
configured module search path = ['/build/templates/playbooks/modules', '/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.9.6 (default, Jul 1 2021, 07:39:48) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
jinja version = 3.0.1
libyaml = True
COLLECTION VERSION
# /usr/local/lib/python3.9/site-packages/ansible_collections
Collection Version
--------------- -------
ansible.windows 1.7.0
CONFIGURATION
DEFAULT_CONNECTION_PLUGIN_PATH(/build/templates/ansible.cfg) = ['/build/templates/connection_plugins']
DEFAULT_MODULE_PATH(/build/templates/ansible.cfg) = ['/build/templates/playbooks/modules', '/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
DEFAULT_ROLES_PATH(/build/templates/ansible.cfg) = ['/build/templates/roles']
DEPRECATION_WARNINGS(/build/templates/ansible.cfg) = False
SYSTEM_WARNINGS(/build/templates/ansible.cfg) = False
OS / ENVIRONMENT
Target OS - Windows Server 2019
STEPS TO REPRODUCE
- name: Second update run
ansible.legacy.win_shell: |
Import-Module C:\PSWindowsUpdate\PSWindowsUpdate.psd1 -Force -ErrorAction Stop
Get-WUInstall -NotCategory "Drivers" -Install -AcceptAll -IgnoreReboot -Verbose | Select-Object Result, KB, Size, Title | Format-Table -Autosize | Out-File C:\Windows\Temp\UpdateLog2.log
vars:
ansible_become: yes
ansible_become_method: runas
ansible_become_user: packer
tags:
- windows_all
- name: Second update reboot
ansible.windows.win_reboot:
reboot_timeout: 3600
tags:
- windows_all
EXPECTED RESULTS
TASK [windows_update_powershell : Second update run] ***************************[0m changed: [default][0m TASK [windows_update_powershell : Second update reboot] ************************* changed: [default]
ACTUAL RESULTS
TASK [windows_update_powershell : Second update run] ***************************[0m
changed: [default][0m
TASK [windows_update_powershell : Second update run reboot] ********************[0m
fatal: [default]: FAILED! => {"changed": false, "elapsed": 3612, "msg": "Timed out waiting for last boot time check (timeout=3600)", "rebooted": true}[0m
Done a bit more testing with this and it seems to occur when using an older ISO that hasn't had any updates applied to it for quite some time. I re-tested after slipstreaming KB5003711 and KB5003646 (2021-06 SSU and 2021-06 Cumulative Update) into the ISO and then running the same playbook the reboots are detected successfully.
Not sure what it is about installing KB5004244 without an older update that seems to break the win_reboot detection, but it can at least be worked around by starting out from a more up to date image (Though I'm not sure just how up to date it needs to be to resolve the issue)
Are you able to run with a high verbosity (-vvvvv
) and share the full log. This will probably be a lot of text but it will help show what the pre reboot datestamp was set to and the value of the post reboot datestamp.
Basically how win_reboot
works is that it
- runs
Get-CimInstance -ClassName Win32_OperatingSystem -Property LastBootUpTime).LastBootUpTime.ToFileTime()
to get the current time - reboots the host
- runs the same command from step 1 until the value has changed (host is back online after the reboot)
- Runs a test command to verify the host is back online
It could be that step 3 is never returning a different value for some reason causing it to think the reboot hasn't occurred or it could be that step 4 is never true for x reason. Narrowing down what is causing the check to never complete will help try and identify where the problem is and what we have to do to potentially fix it.
After adding the -vvvv tag I get this until the timeout is hit,
windows2019_corp: TASK [windows_update_powershell : Pre-Patch reboot] ****************************
windows2019_corp: task path: /build/templates/roles/windows_update_powershell/tasks/main.yml:12
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: rebooting server...
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: validating reboot
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.66> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.66
When I check the actual VM I can see that it definitely has actually rebooted and that WinRM is still running.
I've tried using a more up to date ISO file from MSDN that has all updates up to June 2021 included, but it seems that even manually rebooting the VM isn't helping it reconnect anymore either.
The only things I've run prior to this reboot are,
- name: Remove Internet Explorer and SMB1
ansible.windows.win_optional_feature:
name:
- Internet-Explorer-Optional-amd64
- smb1protocol
state: absent
async: 600
poll: 30
retries: 3
delay: 10
tags:
- windows_all
- name: Enable .Net Framework 3.5
ansible.windows.win_feature:
name:
- NET-Framework-Features
- NET-Framework-Core
source: D:\sources\sxs
state: present
async: 600
poll: 30
retries: 3
delay: 10
tags:
- windows_all
I reverted back to my older ISO and the pre-patch reboot now does work,
windows2019_corp: TASK [windows_update_powershell : Pre-Patch reboot] ****************************
windows2019_corp: task path: /build/templates/roles/windows_update_powershell/tasks/main.yml:12
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: rebooting server...
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: validating reboot
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting to get system boot time
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: <10.134.192.67> ESTABLISH WINRM CONNECTION FOR USER: packer on PORT 5985 TO 10.134.192.67
windows2019_corp: ansible.windows.win_reboot: attempting post-reboot test command
windows2019_corp: EXEC (via pipeline wrapper)
windows2019_corp: ansible.windows.win_reboot: system successfully rebooted
windows2019_corp: changed: [default] => {
windows2019_corp: "changed": true,
windows2019_corp: "elapsed": 166,
windows2019_corp: "rebooted": true
windows2019_corp: }
windows2019_corp:
5 v's should show the command result info which is what is what I'm hoping to see here.
Sorry about that! That's what I get for typing rather than pasting! That has run through now and you weren't joking about a lot of text. I've attached the log file due to the length. As before the reboot after applying the update ceases to reconnect, whilst multiple reboots prior to that completed fine.
The WinRM service is being configured for Automated Delayed startup prior to any reboots taking place.
Thanks for that, there's a few things to note:
- Your collection list indicates you are using
ansible.windows==1.7.0
but the output is showing that it's using older code sowin_reboot
is being prioritised from some other path- For example the command used to get the boot time is
(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime
and not the Get-CimInstance one. - Some of the output would indicate it's using whatever is at
/root/.ansible/collections/ansible_collections/ansible/windows
- For example the command used to get the boot time is
- The reason why
win_reboot
is timing out is not because it's getting the same boot time but rather it's failing to reconnect to host during the reboot- I see the host go down and the remaining errors are
Connection to 10.134.192.74 timed out. (connect timeout=6)
- There's no more reconnections, it simply keeps on seeing connection timeouts until
win_reboot
itself times out
- I see the host go down and the remaining errors are
At this point I suspect either the WinRM
service didn't come back up properly or maybe Ansible isn't resetting the connection properly. Some things to try and check
- Watch the actual console screen of the host and see if the logon screen is actually appearing while Ansible is waiting for it to come back online
- Try random
win_ping
requests while Ansible is waiting for it to come back online- This would be done in a separate process just to see if maybe the connection Ansible is using is getting jammed in some perculiar way
- If this also fails with connection timeouts then Ansible isn't the problem but something on the host (updates on 2016/2019 can take forever to properly install sometimes)
- Try to update the collection to 1.7.0
- As mentioned earlier the log output indicates you are using an older version of the collection and not 1.7.0
- The new version has a refactored setup of
win_reboot
which should be smarter around handling connection resets and checking for it to be online
- Try using
ansible_connection: psrp
instead of WInRM- They both operate on WinRM but PSRP uses the PowerShell remoting layer
- Might be a bit smarter/more efficient and could potentially solve the problem - unlikely but worth a shot
I am also curious though, is there a reason why you are using PSWindowsUpdates
to install the updates rather than the win_updates
module. The module has gone through quite a large refactor in 1.7.0 which should fix some of the problems that are out there but it would be good to find out if there were other issues I wasn't aware about.
Thanks for the guidance, I'll give those a try and come back with some results!
In regards to the collection, I suspect the older one is left from when we were previously using an older version of Ansible that made use of Python 2. After I updated my containers to remove that Ansible it looks like it left behind that older collection. I've removed that now so hopefully it should now direct to the 1.7.0 collection.
In regards to PSWindowsUpdates, I was having issues where win_updates was seeming to fail with installing some cumulative updates correctly. I raised an issue but I don't think anyone else was able to replicate it. Now that I have switched to using newer Windows ISO files though it's likely worth me revisiting that option. It's also possible my issue was with using the older 1.5.0 collection, again due to the issues with the older Ansible instance.
So I removed the old collection as suggested and ensured that only ansible.windows.win_reboot was being used in all playbooks and re-ran the test.
As before the process seemed to stop after the second update reboot stage with the constant connection time outs. I was monitoring the VM console and confirmed the system had rebooted and come back to the login prompt. I left it on this for 25 minutes then from the same location that was running the build I ran a simple win_ping playbook,
- name: Ping test
ansible.windows.win_ping:
But this also failed to connect,
ansible-playbook [core 2.11.2]
config file = None
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible-playbook
python version = 3.9.6 (default, Jul 1 2021, 07:39:48) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
jinja version = 3.0.1
libyaml = True
No config file found; using defaults
setting up inventory plugins
host_list declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
script declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
auto declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Parsed /etc/ansible/hosts inventory source with ini plugin
Loading collection ansible.windows from /usr/local/lib/python3.9/site-packages/ansible_collections/ansible/windows
Loading callback plugin default of type stdout, v2.0 from /usr/local/lib/python3.9/site-packages/ansible/plugins/callback/default.py
Attempting to use 'default' callback.
Skipping callback 'default', as we already have a stdout callback.
Attempting to use 'junit' callback.
Attempting to use 'minimal' callback.
Skipping callback 'minimal', as we already have a stdout callback.
Attempting to use 'oneline' callback.
Skipping callback 'oneline', as we already have a stdout callback.
Attempting to use 'tree' callback.
PLAYBOOK: ping.yml *************************************************************
Positional arguments: ping.yml
verbosity: 5
connection: smart
timeout: 10
become_method: sudo
tags: ('all',)
inventory: ('/etc/ansible/hosts',)
forks: 5
1 plays in ping.yml
PLAY [10.134.192.76] ***********************************************************
TASK [Gathering Facts] *********************************************************
task path: /home/ansible/ping.yml:2
redirecting (type: modules) ansible.builtin.setup to ansible.windows.setup
Using module file /usr/local/lib/python3.9/site-packages/ansible_collections/ansible/windows/plugins/modules/setup.ps1
Pipelining is enabled.
<10.134.192.76> ESTABLISH WINRM CONNECTION FOR USER: administrator on PORT 5985 TO 10.134.192.76
<10.134.192.76> WINRM CONNECT: transport=basic endpoint=http://10.134.192.76:5985/wsman
<10.134.192.76> WINRM CONNECTION ERROR: HTTPConnectionPool(host='10.134.192.76', port=5985): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fcaec5b2340>, 'Connection to 10.134.192.76 timed out. (connect timeout=120)'))
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 169, in _new_conn
conn = connection.create_connection(
File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 96, in create_connection
raise err
File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 86, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/local/lib/python3.9/http/client.py", line 1257, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/lib/python3.9/http/client.py", line 1303, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.9/http/client.py", line 1252, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.9/http/client.py", line 1012, in _send_output
self.send(msg)
File "/usr/local/lib/python3.9/http/client.py", line 952, in send
self.connect()
File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 200, in connect
conn = self._new_conn()
File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7fcaec5b2340>, 'Connection to 10.134.192.76 timed out. (connect timeout=120)')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='10.134.192.76', port=5985): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fcaec5b2340>, 'Connection to 10.134.192.76 timed out. (connect timeout=120)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/ansible/plugins/connection/winrm.py", line 426, in _winrm_connect
self.shell_id = protocol.open_shell(codepage=65001) # UTF-8
File "/usr/local/lib/python3.9/site-packages/winrm/protocol.py", line 166, in open_shell
res = self.send_message(xmltodict.unparse(req))
File "/usr/local/lib/python3.9/site-packages/winrm/protocol.py", line 243, in send_message
resp = self.transport.send_message(message)
File "/usr/local/lib/python3.9/site-packages/winrm/transport.py", line 334, in send_message
response = self._send_message_request(prepared_request, message)
File "/usr/local/lib/python3.9/site-packages/winrm/transport.py", line 339, in _send_message_request
response = self.session.send(prepared_request, timeout=self.read_timeout_sec)
File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 504, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='10.134.192.76', port=5985): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fcaec5b2340>, 'Connection to 10.134.192.76 timed out. (connect timeout=120)'))
fatal: [10.134.192.76]: UNREACHABLE! => {
"changed": false,
"msg": "basic: HTTPConnectionPool(host='10.134.192.76', port=5985): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fcaec5b2340>, 'Connection to 10.134.192.76 timed out. (connect timeout=120)'))",
"unreachable": true
}
PLAY RECAP *********************************************************************
10.134.192.76 : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Oddly standard pings to the server had suddenly stopped working as well. This included from another machine on the same network. On a hunch I went into the server in progress of being built and turned off the Windows Firewall, as soon as I did that Ansible was able to connect again and my in-progress win_reboot immediately connected and proceeded with the next tasks.
My current assumption is that something about KB5004244 may be making some changes to the Windows Firewall that is preventing WinRM from being able to reconnect after installation (As everything works fine up until the point that KB is installed). What's odd is that, looking at the Windows Firewall, I can see the built-in rules for Windows Remote Management are both enabled already.
Update - So ran another test and this time when I reached the problem reboot stage rather than disabling the Windows Firewall I created an additional WinRM rule allowing traffic on TCP Port 5985 on all network profile's and set the Remote IP Address scope to the IP of my Ansible server. As soon as the rule was created Ansible was able to reconnect and progress.
I'll need to do some additional before/after testing to try and pin down what exactly is changing but it definitely seems that the KB is making some kind of alteration to the built-in WinRM Firewall rules.