Problem: cannot decrypt access key

Open felixlabrot opened this issue 9 months ago • 11 comments

Issue

I have copied the Semaphore Docker volumes to a new server and started the Docker Stack. All permissions and file ownerships have been preserved. The UI is accessible as normal and all settings are there.

I have verified that the semaphore user inside the container can in fact read config.json and database.boltdb.

When running a job, Semaphore falsely claims "Error: cannot decrypt access key, perhaps encryption key was changed", when in fact this is not true. The config.json contains the "access_key_encryption" value, which has never been changed. Every manual job execution fails with "Request failed with status code 500".

My entire installation is broken and major tasks in the datacenter are not running anymore, because of those wrong error messages that do not show a real error.

Impact

Ansible (task execution)

Installation method

Docker

Database

BoltDB

Browser

Firefox

Semaphore Version

v2.14.10-bcdb678-1746645798

Ansible Version

ansible [core 2.18.5]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/semaphore/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/semaphore/apps/ansible/11.1.0/venv/lib/python3.12/site-packages/ansible
  ansible collection location = /home/semaphore/.ansible/collections:/usr/share/ansible/collections
  executable location = /opt/semaphore/apps/ansible/11.1.0/venv/bin/ansible
  python version = 3.12.10 (main, Apr 10 2025, 15:27:01) [GCC 14.2.0] (/opt/semaphore/apps/ansible/11.1.0/venv/bin/python3)
  jinja version = 3.1.6
  libyaml = True

Logs & errors

No additional python dependencies to install
Starting semaphore server
Loading config
Validating config
BoltDB /var/lib/semaphore/database.boltdb
Tmp Path (projects home) /tmp/semaphore
Semaphore v2.14.10-bcdb678-1746645798
Interface
Port :3000
Server is running

time="2025-05-28T15:14:11Z" level=error msg="websocket: close sent" error="Cannot send close message"
time="2025-05-28T15:14:26Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed" error="Cannot write new event to database"
time="2025-05-28T16:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T17:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T18:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T19:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T20:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T21:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T22:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-28T23:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T00:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T00:01:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T01:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T02:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T03:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T04:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T05:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T06:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T07:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T08:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T09:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T10:00:00Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed"
time="2025-05-29T10:20:51Z" level=error msg="websocket: close sent" error="Cannot send close message"
time="2025-05-29T10:26:35Z" level=error msg="cannot decrypt access key, perhaps encryption key was changed" error="Cannot write new event to database"

Manual installation - system information

No response

Configuration

No response

Additional information

No response

felixlabrot avatar May 29 '25 10:05 felixlabrot

Hi,

Please check the following config option in the old and new config.json: access_key_encryption. It must be the same in both.

I recommend using the environment variable SEMAPHORE_ACCESS_KEY_ENCRYPTION to provide your encryption key, generated with head -c32 /dev/urandom | base64.
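For reference, the head -c32 /dev/urandom | base64 command can be reproduced in Python. This is only a sketch: the function names are made up for illustration, and the expectation that a key decodes to exactly 32 bytes is inferred from the -c32 in that command, not taken from Semaphore's documentation.

```python
import base64
import secrets


def generate_encryption_key() -> str:
    """Return 32 random bytes encoded as standard base64 (44 characters)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def decoded_key_length(key: str) -> int:
    """Strictly decode a base64 key and return its length in bytes.

    Raises binascii.Error (a ValueError subclass) if the value is not
    valid base64, e.g. because of stray whitespace or a truncated copy.
    """
    return len(base64.b64decode(key, validate=True))


if __name__ == "__main__":
    key = generate_encryption_key()
    print(key)                      # value for SEMAPHORE_ACCESS_KEY_ENCRYPTION
    print(decoded_key_length(key))  # byte length of the decoded key
```

Running decoded_key_length on an existing key is a quick way to spot a value that was truncated or picked up extra characters while being copied.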

If you no longer have the original key, you need to update all access keys.

fiftin avatar Jun 01 '25 10:06 fiftin

The config.json file was copied. All its contents are completely identical. I have also tried setting the environment variable and copying the exact same value into it. It still doesn't work.
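One way to double-check that the key really is byte-for-byte identical in the old config.json, the new config.json and the environment variable is to compare fingerprints instead of eyeballing the values. A minimal sketch, assuming access_key_encryption is a top-level string in config.json; the helper names are hypothetical, and the two file paths are passed on the command line.

```python
import hashlib
import json
import os
import sys
from typing import Optional


def fingerprint(value: str) -> str:
    """Short SHA-256 fingerprint, safe to paste into a bug report."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def describe_key(source: str, value: Optional[str]) -> dict:
    """Summarise a key without revealing it."""
    if value is None:
        return {"source": source, "present": False}
    return {
        "source": source,
        "present": True,
        "length": len(value),
        # A trailing newline or space is easy to pick up when copying.
        "has_outer_whitespace": value != value.strip(),
        "fingerprint": fingerprint(value),
    }


def key_from_config(path: str) -> Optional[str]:
    """Read access_key_encryption from a config.json file (None if absent)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get("access_key_encryption")


def keys_match(a: Optional[str], b: Optional[str]) -> bool:
    """True only if both values are present and exactly equal."""
    return a is not None and a == b


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (script name is hypothetical): python compare_keys.py OLD_CONFIG NEW_CONFIG
    old_key = key_from_config(sys.argv[1])
    new_key = key_from_config(sys.argv[2])
    env_key = os.environ.get("SEMAPHORE_ACCESS_KEY_ENCRYPTION")
    for source, value in (("old", old_key), ("new", new_key), ("env", env_key)):
        print(describe_key(source, value))
    print("old == new:", keys_match(old_key, new_key))
```

The fingerprints can be posted in an issue without leaking the key itself, which makes it easier for others to confirm or rule out a key mismatch.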

felixlabrot avatar Jun 01 '25 15:06 felixlabrot

@felixlabrot do you see access_key_encryption in your config.json?

fiftin avatar Jun 01 '25 16:06 fiftin

Yes, I can assure you that 1. I am not dumb and 2. I can read. As mentioned twice, the old and new config files are absolutely identical. Everything is exactly where it needs to be. And when trying to set the environment variable as an alternative, I copied the value from that config file. I couldn't have copied it out of there if it weren't there.

My solution is now to throw the database away and manually invest 2 hours to create all the repos, secrets, variables and jobs again. After my research this seems to be a years old permanent bug that you're not able to resolve, and the only solution is to toss the database and start from scratch.

felixlabrot avatar Jun 01 '25 16:06 felixlabrot

@felixlabrot sorry for this. I will try to reproduce the issue.

fiftin avatar Jun 01 '25 17:06 fiftin

@felixlabrot Why did you recreate all templates/repos/variables if you only had a problem with the key store?

You just need to update the existing keys! That's all!

I just tested. There is only one reason for this error: a different access_key_encryption.

fiftin avatar Jun 01 '25 17:06 fiftin

After my research this seems to be a years old permanent bug

@felixlabrot if you mean this issue https://github.com/semaphoreui/semaphore/issues/2261 - you will find that it was solved but never closed.

fiftin avatar Jun 01 '25 17:06 fiftin

As I deleted the database, I had to start from scratch. I also switched to MariaDB instead of BoltDB, as BoltDB isn't admin-friendly at all, can't be queried via any SQL CLI, and backups are more reliable with a SQL database. A fresh environment also guarantees that there aren't any other hidden problems that cause trouble later on.

I also had the /etc/semaphore volume there already, as it contains the config file.

So there is really something wrong, because as far as I understand there is one table in the database that contains encrypted data, and the key is read from the file or the env variable. In my case the key was provided in 2 different ways, so it must have been read, and it still failed to decrypt. But as the log output is just in the format "something doesn't work", I can't tell at which step of the decryption process the problem occurred.
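To illustrate what step-by-step reporting could look like, here is a rough sketch that checks the inputs to a decryption in order and names the first one that fails. Everything about the scheme is an assumption made for the example, not something confirmed in this thread: a base64-encoded AES key, a base64-encoded stored secret, and a 12-byte GCM nonce at the start of the secret. The function name is hypothetical, and it does not perform any actual decryption.

```python
import base64
import binascii

AES_KEY_SIZES = (16, 24, 32)  # valid AES key lengths in bytes
GCM_NONCE_SIZE = 12           # standard AES-GCM nonce length (assumed layout)


def first_failing_step(key_b64: str, secret_b64: str) -> str:
    """Return a description of the first pre-decryption check that fails."""
    # Step 1: the configured key must be valid base64.
    try:
        key = base64.b64decode(key_b64, validate=True)
    except binascii.Error:
        return "key: not valid base64"

    # Step 2: the decoded key must have a length AES accepts.
    if len(key) not in AES_KEY_SIZES:
        return f"key: decodes to {len(key)} bytes, not a valid AES key size"

    # Step 3: the stored secret must be valid base64 (assumed encoding).
    try:
        blob = base64.b64decode(secret_b64, validate=True)
    except binascii.Error:
        return "secret: not valid base64"

    # Step 4: the secret must hold more than just a nonce.
    if len(blob) <= GCM_NONCE_SIZE:
        return "secret: too short to hold a nonce plus ciphertext"

    return ("inputs look well-formed; a failure past this point means "
            "the key does not match or the stored data is corrupted")


if __name__ == "__main__":
    sample_key = base64.b64encode(b"k" * 32).decode("ascii")
    sample_secret = base64.b64encode(b"n" * 12 + b"c" * 20).decode("ascii")
    print(first_failing_step(sample_key, sample_secret))
```

With output like this, a malformed key, a damaged database record and a genuine key mismatch would each produce a different message instead of one generic error.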

felixlabrot avatar Jun 01 '25 17:06 felixlabrot

@felixlabrot You can use the project export/import feature to avoid recreating each resource.

fiftin avatar Jun 02 '25 03:06 fiftin

I've used Terraform and https://github.com/CruGlobal/terraform-provider-semaphoreui to help keep my configurations in check for Semaphore. Helps me deploy new projects easily from a YAML config file.

(Edit: sorry, this doesn't solve the problem, but it helps with a repeatable setup so that a rebuild doesn't take hours upon hours.)

ramorous avatar Jun 03 '25 14:06 ramorous