dnsrobocert

Auth hook fails because of missing configuration file?

Open JaneJeon opened this issue 2 years ago • 7 comments

Hi, I've been using dnsrobocert with no problem, but recently it has been failing to actually run the auth, not because of misconfiguration or incorrect DNS, but because of some... missing temp file??

Renewing an existing certificate for $site
Hook '--manual-auth-hook' for $site reported error code 1
Hook '--manual-auth-hook' for $site ran with error output:
 2022-03-26 02:31:37 fee66d534ec0 dnsrobocert.core.config[50] ERROR Configuration file /tmp/tmpa18ssoa9/dnsrobocert-runtime.yml does not exist.
 Error occured while loading the configuration file, aborting the `auth` hook.

And since the auth hook fails, the cert renewal fails... It's been working fine before this, any ideas?

JaneJeon, Apr 13 '22 21:04

I have this error too. It can be reproduced with these steps:

  • Start the Docker container and issue a certificate.
  • In /etc/letsencrypt/renewal/, a config file is created containing the hook parameter deploy -c "/tmp/tmp1w7449xr/dnsrobocert-runtime.yml".
  • Restart the Docker container. dnsrobocert will not renew the certificate because it is not yet due, but the temporary config directory is recreated under a new path.
  • Check the config file /etc/letsencrypt/renewal/domain.com.conf: the path to the temporary file has not changed.
  • When the certificate is due for renewal, the old file path is used and we get the error: ERROR Configuration file /tmp/tmpa18ssoa9/dnsrobocert-runtime.yml does not exist.
  • After this, the certificate is not issued and we get this error:
Hook '--manual-cleanup-hook' for domain.com ran with error output:
 2022-07-12 17:41:25 server-host dnsrobocert.core.config[84] ERROR Configuration file /tmp/tmpakp5917q/dnsrobocert-runtime.yml does not exist.
 Error occured while loading the configuration file, aborting the `cleanup` hook.
Failed to renew certificate domain.com with error: Some challenges have failed.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
All renewals failed. The following certificates could not be renewed:
  /etc/letsencrypt/live/domain.com/fullchain.pem (failure)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 renew failure(s), 0 parse failure(s)
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /etc/letsencrypt/logs/letsencrypt.log or re-run Certbot with -v for more details.
----------
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/site-packages/dnsrobocert/core/background.py", line 48, in run
    schedule.run_pending()
  File "/usr/local/lib/python3.9/site-packages/schedule/__init__.py", line 780, in run_pending
    default_scheduler.run_pending()
  File "/usr/local/lib/python3.9/site-packages/schedule/__init__.py", line 100, in run_pending
    self._run_job(job)
  File "/usr/local/lib/python3.9/site-packages/schedule/__init__.py", line 172, in _run_job
    ret = job.run()
  File "/usr/local/lib/python3.9/site-packages/schedule/__init__.py", line 661, in run
    ret = self.job_func()
  File "/usr/local/lib/python3.9/site-packages/dnsrobocert/core/background.py", line 70, in _renew_job
    certbot.renew(config_path, directory_path, lock)
  File "/usr/local/lib/python3.9/site-packages/dnsrobocert/core/certbot.py", line 127, in renew
    utils.execute(
  File "/usr/local/lib/python3.9/site-packages/dnsrobocert/core/utils.py", line 60, in execute
    raise error
  File "/usr/local/lib/python3.9/site-packages/dnsrobocert/core/utils.py", line 50, in execute
    call(command, shell=shell, env=env)
  File "/usr/local/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/local/bin/python3', '-m', 'dnsrobocert.core.certbot', 'renew', '-n', '--user-agent-comment', 'DNSroboCert/3.20.1', '--preferred-chain', 'ISRG Root X1', '--config-dir', '/etc/letsencrypt', '--deploy-hook', '/usr/local/bin/python3 -m dnsrobocert.core.hooks -t deploy -c "/tmp/tmpfjpcs60w/dnsrobocert-runtime.yml"', '--work-dir', '/etc/letsencrypt/workdir', '--logs-dir', '/etc/letsencrypt/logs']' returned non-zero exit status 1.
  • After this, the dnsrobocert process hangs (it does not exit with an error).
  • Docker therefore does not restart the container.

So we always have to restart the Docker container manually.
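
The stale path described in the steps above can be detected before a renewal fails. The following is a minimal Python sketch and not part of dnsrobocert: the function names and the regular expression are assumptions for illustration, based only on the -c "<path>/dnsrobocert-runtime.yml" argument visible in the logs above.

```python
import re
from pathlib import Path

# Assumption: hook commands stored in the renewal files carry the runtime
# config as -c "<path>/dnsrobocert-runtime.yml", as in the logs above.
CONFIG_ARG = re.compile(r'-c\s+"?([^"\s]+dnsrobocert-runtime\.yml)"?')


def find_stale_paths(renewal_text):
    """Return referenced runtime config paths that do not exist on disk."""
    paths = set(CONFIG_ARG.findall(renewal_text))
    return sorted(p for p in paths if not Path(p).exists())


def scan_renewal_dir(renewal_dir="/etc/letsencrypt/renewal"):
    """Map each renewal file name to the stale paths it references."""
    root = Path(renewal_dir)
    if not root.is_dir():
        return {}
    stale = {}
    for conf in sorted(root.glob("*.conf")):
        missing = find_stale_paths(conf.read_text())
        if missing:
            stale[conf.name] = missing
    return stale
```

Run inside the container, a non-empty result from scan_renewal_dir() would indicate that the next renewal is likely to hit the "does not exist" error. A check like this could also serve as the basis for a Docker health check that exits non-zero when stale paths are found.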

P.S. Can you add a Docker health check that catches errors like this?

Grokon, Jul 25 '22 08:07

Running into the EXACT same problem again...

JaneJeon, Sep 05 '22 08:09

As stated here we have been experiencing this since version 3.14.0, which fixed a different renewal issue.

As far as I can tell, the problem is that the initial certonly call specifies the auth, cleanup and deploy hooks in the newly created temporary directory via the config_path parameter, so the first renewal attempt after a restart always succeeds. All follow-up renew calls specify only the deploy hook using the config_path parameter. The certbot renew command does not support manual execution, so the manual auth and cleanup hooks cannot be passed as parameters and are always taken from the renewal configuration when that command is used. Since the renewal file is located in the LetsEncrypt directory, which is mounted outside the container, it persists across container restarts. As @Grokon mentioned, this causes subsequent renewals to use the temporary directory path created by the very first certificate request, which no longer exists once the container has been restarted.
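
To make the mechanism concrete, here is a toy Python model of it. It is not dnsrobocert code: start_container, stop_container, the file name domain.com.conf and the manual_auth_hook line are hypothetical stand-ins, and the model simply assumes that the hook path is recorded only when the renewal file is first written.

```python
import shutil
import tempfile
from pathlib import Path


def start_container(letsencrypt_dir):
    """Toy model of one container start; returns the new runtime config path."""
    runtime_cfg = Path(tempfile.mkdtemp()) / "dnsrobocert-runtime.yml"
    runtime_cfg.write_text("# generated at startup\n")
    renewal = letsencrypt_dir / "domain.com.conf"
    # Modeled behavior: the hook path is recorded only when the renewal
    # file is first created; later starts leave it untouched.
    if not renewal.exists():
        renewal.write_text(f'manual_auth_hook = hooks -t auth -c "{runtime_cfg}"\n')
    return runtime_cfg


def stop_container(runtime_cfg):
    """The temporary directory disappears together with the container."""
    shutil.rmtree(runtime_cfg.parent)


persistent = Path(tempfile.mkdtemp())  # stands in for the mounted /etc/letsencrypt
first = start_container(persistent)
stop_container(first)
second = start_container(persistent)

recorded = (persistent / "domain.com.conf").read_text()
print(str(first) in recorded)  # does the renewal file still name the first path?
print(first.exists())          # does that first path still exist?
print(second.exists())         # does the current runtime config exist?
```

In this model the renewal file keeps pointing at the first temporary path, which is gone after the restart, while the current runtime config lives under a different directory.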

I see three possible solutions for the issue (note that I have not tested any of these):

  1. Always delete the renewal configuration when the containers are stopped (could be done manually as a workaround too)
  2. Always update the renewal configuration when a new temporary configuration directory is created (= on container start). I would actually have expected the certonly call after startup to do this, but it seems it does not?
  3. Use the same certonly command for all renewal attempts as is used for the initial requests/renewal attempt after restart. This is the official way to renew when using the manual plugin.
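
As an illustration of option 2, here is an untested Python sketch that repoints every recorded runtime config path in a renewal file's text to the current one. The helper name repoint_hooks and the regular expression are assumptions; whether certbot tolerates hand-edited renewal files in this way has not been verified here.

```python
import re

# Assumption: hooks reference the runtime config as
# -c "<path>/dnsrobocert-runtime.yml", as seen in the logs in this thread.
CONFIG_ARG = re.compile(r'(-c\s+)"?[^"\s]+dnsrobocert-runtime\.yml"?')


def repoint_hooks(renewal_text, current_cfg):
    """Replace every recorded runtime config path with current_cfg."""
    # A function replacement avoids backslash-escape surprises in the path.
    return CONFIG_ARG.sub(lambda m: f'{m.group(1)}"{current_cfg}"', renewal_text)
```

On container start, the text of each file in /etc/letsencrypt/renewal/ could be passed through repoint_hooks together with the freshly created config path and then written back. The function is idempotent, so running it on every start would be harmless.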

Vertganti, Sep 07 '22 09:09

I made the change described by @Vertganti to make the same certonly call on renewal as it does when the docker container starts.

I had a few certs nearing renewal and have tested it successfully, but wouldn't mind a couple more confirmations before submitting a PR.

I pushed a docker image to justincentanni/dnsrobocert:certonly and have the code in https://github.com/centja1/dnsrobocert/tree/call-certonly-every-time

centja1, Dec 23 '22 13:12

Thanks @centja1 for adapting the code and providing the image. I have set it up for testing with a certificate that will expire towards the end of this month and will report if it worked then.

Vertganti, Jan 12 '23 10:01

The renewal worked! Since no one else is responding I guess you can submit the PR. Hopefully @adferrand will be back and able to merge/review it soon.

Vertganti, Jan 30 '23 15:01

This should really be considered. On stable/production systems the problem hits regularly without something like a cron job to restart the container every so often. I think a lot of homelab users just don't notice, as the container/machine is restarted more frequently.

Codelica, Apr 14 '23 14:04