openwisp-firmware-upgrader
openwisp-firmware-upgrader copied to clipboard
[bug] SSL error: decryption failed or bad record mac during firmware upgrade
I'm getting this issue during assigning firmware to a device, running on Debian 11 and installed using the ansible script, using psql 13 as db backend.
This resulted in the firmware upgrade process on OpenWISP being stuck at status in progress
and never timed out.
Wed Jul 12 03:00:15 UTC 2023 upgrade: Saving config files...
Wed Jul 12 03:00:15 UTC 2023 upgrade: Commencing upgrade. Closing all shell sessions.
[2023-07-12 03:00:16,074: INFO/ForkPoolWorker-1] # SSL error: decryption failed or bad record mac
[2023-07-12 03:00:16,078: ERROR/ForkPoolWorker-1] Task openwisp_firmware_upgrader.tasks.upgrade_firmware[b2648d2f-04b7-4b82-98d7-97b6ea497818] raised unexpected: InterfaceError('connection already closed')
Traceback (most recent call last):
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
psycopg2.OperationalError: SSL error: decryption failed or bad record mac
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 554, in upgrade
upgrader.upgrade(self.image.file)
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/upgraders/openwrt.py", line 52, in upgrade
self._reflash(remote_path)
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/upgraders/openwrt.py", line 283, in _reflash
self.upgrade_operation.refresh_from_db()
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 716, in refresh_from_db
db_instance = db_instance_qs.get()
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 492, in get
num = len(clone)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 302, in __len__
self._fetch_all()
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 1507, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 57, in __iter__
results = compiler.execute_sql(
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1361, in execute_sql
cursor.execute(sql, params)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
django.db.utils.OperationalError: SSL error: decryption failed or bad record mac
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 256, in create_cursor
cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/autoretry.py", line 34, in run
return task._orig_run(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/tasks.py", line 32, in upgrade_firmware
operation.upgrade(recoverable=recoverable)
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 579, in upgrade
self.log_line(cause)
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 524, in log_line
self.save()
File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 604, in save
result = super().save(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 806, in save
self.save_base(
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 857, in save_base
updated = self._save_table(
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 970, in _save_table
updated = self._do_update(
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 1034, in _do_update
return filtered._update(values) > 0
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 885, in _update
return query.get_compiler(self.db).execute_sql(CURSOR)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1783, in execute_sql
cursor = super().execute_sql(result_type)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1359, in execute_sql
cursor = self.connection.cursor()
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 284, in cursor
return self._cursor()
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
return self._prepare_cursor(self.create_cursor(name))
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 256, in create_cursor
cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
this issue get fixed by adding the sslmode: disable
into settings.py
...
DATABASES = {
'default': {
'ENGINE': 'django.contrib.gis.db.backends.postgis',
'NAME': 'openwisp2',
'USER': 'openwisp',
'PASSWORD': 'openwisp',
'HOST': 'localhost',
'PORT': '5432',
'OPTIONS': {
'sslmode': 'disable',
},
}
}
...
@minhng99 is this happening with the latest master or which version?
May be related: #31637 Registering database connections for cleanup on fork.
It's the released Ansible version 22.05.3
openwisp-controller 1.0.3
openwisp-firmware-upgrader 1.0.1
openwisp-ipam 1.0
openwisp-monitoring 1.0.4
openwisp-notifications 1.0.3
openwisp-users 1.0.2
openwisp-utils 1.0.4
postgresql-13-postgis-3-scripts/oldstable,now 3.1.1+dfsg-1+deb11u1 all [installed,automatic]
postgresql-13-postgis-3/oldstable,now 3.1.1+dfsg-1+deb11u1 amd64 [installed]
postgresql-13/oldstable-security,now 13.11-0+deb11u1 amd64 [installed,automatic]
postgresql-client-13/oldstable-security,now 13.11-0+deb11u1 amd64 [installed,automatic]
postgresql-client-common/oldstable,now 225 all [installed,automatic]
postgresql-common/oldstable,now 225 all [installed,automatic]
It seems link to https://github.com/openwisp/openwisp-firmware-upgrader/issues/244#issuecomment-1633116023 as @nemesifier said.
Because the _call_reflash_command is called from a subprocess it causes the issue.
The workaround was to use a subprocess but in fact it never ends because of Paramiko channel recv_exit_status method never released (timeout is not used their => https://github.com/paramiko/paramiko/issues/1787), the root cause seems to be linked to the sysupgrade command that return the exit code after it stops the SSH service.
I think that if we don't use a subprocess but rewrite exec_command
in ssh.py
of the openwisp-controller without recv_exit_status but with alternative solution like here https://stackoverflow.com/questions/35403031/python-paramiko-exec-command-timeout-doesnt-work we could achieve a real timeout and avoid using subprocess.
After some investigation I can confirm that sysupgrade close ssh before returning the exit code:
Fri Aug 30 09:27:45 UTC 2024 upgrade: Commencing upgrade. Closing all shell sessions.
I tried to replace exit_status = stdout.channel.recv_exit_status()
by:
stdout.channel.status_event.wait(timeout=timeout)
exit_status = stdout.channel.exit_status
self.log('exit status {0}'.format(exit_status))
ssh_client.close()
And it just suceed