docker-jitsi-meet icon indicating copy to clipboard operation
docker-jitsi-meet copied to clipboard

LE logic bug on startup

Open Cyborgscode opened this issue 2 years ago • 6 comments

The following patch solves:

  • user creates domain certs outside of docker
  • LE rate limits hit with too many requests per week / domainname / subdomain on docker restarts

Additional Requirement: ( that is the main reason, why it's not a pull request)

openssl inside dockercontainer

File:

docker-jitsi-meet/web/rootfs/etc/cont-init.d/10-config

Diff:

jitsi-meet-10-config.diff.gz

Cyborgscode avatar Oct 12 '21 22:10 Cyborgscode

Main Punch line:

EXPIRE="Certificate will expire"
if [ -x /config/acme-certs/$LETSENCRYPT_DOMAIN/fullchain.pem ];then 
	EXPIRE=$(openssl x509 -noout -in /config/acme-certs/$LETSENCRYPT_DOMAIN/fullchain.pem -checkend 0)
fi
if [ "$EXPIRE" == "Certificate will expire" ]; then 
          ... normal LE startupcode ...

Cyborgscode avatar Oct 12 '21 22:10 Cyborgscode

Can you please make this an actual PR so we can discuss the code instead of pasting diff chunks here? It will be much more productive.

saghul avatar Oct 13 '21 02:10 saghul

Now that I'm fully awake... not sure I understand why you need this. acme.sh already checks if a renewal is necessary.

saghul avatar Oct 13 '21 07:10 saghul

There were two problems.

a) i had LE enabled for a subdomain and it worked for a longer periode in 2020->2021, then one day, LE counted activities on other subdomains of the same sld as an activity which included this jitsi subdomain as well and rate-limited it. From that point on, the docker container could no longer start. On every start it tried LE and failed and stopped. This would not happen, if the once created Certs had been checked. That docker container was regulary updated, but may no represent the code from today.

We had to switch and overwrite the files in /config/web/keys/cert.* with wildcard certs from the maindomain. (...and later reinstalled a new docker image.)

if you can ensure, that acme.sh really handles the cert validity check correctly, this patch is in deed not necessary. In this case, i would suggest to add a comment line for a third returncode of acme.sh , which explains the "third state" of valid, already present certs, because the code does not imply this to the naked eye.

b) the entire concept of checking those le certs or static cert.* does not work, if the config is freshly regenerated. The supplied fix does not handle this part, the admin's updatescript needs to handle this.

The LE rate limit section in the actual selfhosting guide gives a good impression on the problem, but should mention this explicit as a problem for the update process. Example for static keys:

docker-compose down git pull rm -fr ~/.jitsi-meet-cfg/ mkdir -p ~/.jitsi-meet-cfg/{web/letsencrypt,web/keys,transcripts,prosody,jicofo,jvb,jigasi,jibri} cp ~/jitsi-meet-certs/cert.* ~/.jitsi-meet-cfg/web/keys/ # example docker-compose up -d

Cyborgscode avatar Oct 13 '21 09:10 Cyborgscode

a) i had LE enabled for a subdomain and it worked for a longer periode in 2020->2021, then one day, LE counted activities on other subdomains of the same sld as an activity which included this jitsi subdomain as well and rate-limited it. From that point on, the docker container could no longer start. On every start it tried LE and failed and stopped. This would not happen, if the once created Certs had been checked. That docker container was regulary updated, but may no represent the code from today.

We handle acme errors here: https://github.com/jitsi/docker-jitsi-meet/blob/76a16a8b78b309aa71ee295756fc39ca3fe80f25/web/rootfs/etc/cont-init.d/10-config#L32-L53 as you can see we only handle return codes of 0 or 1, if a cert is not due for renewal it will return 2: https://github.com/acmesh-official/acme.sh/blob/5b0d6a13759987bada1715c17f014b60dfd3c358/acme.sh#L87

There is, however, a corner case I think: let's say a cert is bound for renewal, but due to rate limiting we fail to do so (maybe the user was testing with some other subdomain that now counts towards that?). The operation will fail and the container won't start. This could be improved with your check, by checking that the cert is still valid, and the error can be skipped, since the renewal cron job will eventually run and properly renew it.

the entire concept of checking those le certs or static cert.* does not work, if the config is freshly regenerated. The supplied fix does not handle this part, the admin's updatescript needs to handle this.

The LE rate limit section in the actual selfhosting guide gives a good impression on the problem, but should mention this explicit as a problem for the update process. Example for static keys:

docker-compose down git pull rm -fr ~/.jitsi-meet-cfg/ mkdir -p ~/.jitsi-meet-cfg/{web/letsencrypt,web/keys,transcripts,prosody,jicofo,jvb,jigasi,jibri} cp ~/jitsi-meet-certs/cert.* ~/.jitsi-meet-cfg/web/keys/ # example docker-compose up -d

Not sure I understand the problem. When manual keys are used (something not really documented, on purpose) the user is responsible for managing such keys. There is not much we can do.

If you are already creating your keys elsewhere the recommendation has always been to do your own proxy and TLS offloading, and reach this setup via HTTP.

saghul avatar Oct 13 '21 09:10 saghul

Your right, that it's not your concern if someone uses a static cert, because he wants to skip LE or has a commercial cert, but mentioning problems one will run into if one uses a feature "you" implemented, is always a good idea ;)

The LE auto-create feature was really a cool way to start with jitsi-meet, until the LE problems started. I just thought about the LE update process and maybe "0" seconds until recreating the cert maybe a bit short.

"-checkend 86400" may be more suited. On the other hand, if the patch is combined with the return of acme.sh rc 0/1, it would be possible to skip the "creation fail" exit, if the "previouse certs" have been copied aside, are still valid and can be reused again. In this, or the failure at all, case, a mail could be created to inform the admin ( LETSENCRYPT_EMAIL ) about the auto-create failure . This would really improve the situation.

From my experience yesterday, I also think, it's a good idea to log which certs are used on startup, so it's easier to debug things.

I would like to test and rewrite the scripts, but i have my doubts, that I'm able to recreate the docker image ( never done it ) and pimp it, as my base os is Fedora, not Debian/Ubuntu. So installing new apps into this image could be a problem. Any suggestions?

Cyborgscode avatar Oct 13 '21 10:10 Cyborgscode