
Backing up acme certs and making persistent

Open joshuacox opened this issue 7 years ago • 19 comments

So I have hit the dreaded:

Error 429 - urn:acme:error:rateLimited - Error creating new cert :: too many certificates already issued for exact set of domains:

because I have restarted traefik too many times this week on a few of my domains. So I decided to dig in and find the certs. @rawmind0 please correct me if I'm mistaken on any of this, but it looks like all the certs are stored in acme.json, which is in the /opt/traefik/acme directory.
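For the "backing up" half of the question, a minimal Python sketch of a timestamped copy of the cert store. It assumes the store is the single file /opt/traefik/acme/acme.json named above; the function name `backup_acme` and the destination directory are hypothetical, and this is not part of the alpine-traefik image.

```python
import os
import shutil
import time
from pathlib import Path


def backup_acme(src: str, dest_dir: str) -> Path:
    """Copy the ACME store into dest_dir under a timestamped name."""
    source = Path(src)
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Example resulting name: acme.json.20170430T130400
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = target_dir / f"{source.name}.{stamp}"

    # copy2 keeps the modification time and permission bits of the source.
    shutil.copy2(source, target)
    # The store holds private keys, so restrict the copy to its owner.
    os.chmod(target, 0o600)
    return target
```

Usage would look like `backup_acme("/opt/traefik/acme/acme.json", "/backup/traefik")`, where `/backup/traefik` is only a placeholder. Copying the file back before the container starts is what avoids a fresh round of certificate requests.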

My question is: can I volume-mount this directory, perhaps using convoy-NFS? I'm going to give it a shot, but I'd welcome any comments or suggestions.

Here's an example in PR form; feel free to reject it. I'm going to test this out in a test environment.

joshuacox avatar Apr 30 '17 13:04 joshuacox

OK, v1.2.3-rancher1-8 is a working version. I had a few issues along the way that I think were unrelated to traefik (I had an old environment in which health checks had started misbehaving, amongst a few other issues). After creating a fresh environment this seems to be working now; testing further. I'd welcome any comments or suggestions.

joshuacox avatar May 02 '17 17:05 joshuacox

@rawmind0 do you think it will be all right if I am just mounting in the /opt/traefik/acme directory as a volume?

And if that is all right, what do you think about it being shared to many alpine traefik containers spawned on many front-end hosts with public IP addresses talking to various clustered services in the backend?

I think the obvious first worry is multiple instances trying to open the acme.json at once.

Right now I'm getting pretty good results. i.e. I've got enforced SSL on a domain, and restarting the container does not get me rate-limit banned at letsencrypt. But I only have one alpine-traefik front end running using that particular volume.

joshuacox avatar May 02 '17 17:05 joshuacox

/opt/traefik/acme should be a volume. I ran into the rate limit after upgrading to v1.2.3-rancher1.

mcnilz avatar May 03 '17 13:05 mcnilz

Hey @joshuacox ... Obviously, to use traefik with acme enabled in production, a better persistence solution for ssl certs is needed.

Sharing /opt/traefik/acme doesn't seem like a good idea: traefik needs write access to acme.json, and with several instances sharing it the file will get corrupted, eventually. IMHO, a bad direction.
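If the file is shared anyway, a quick sanity check can at least detect the corruption described here. A minimal Python sketch, assuming acme.json is a JSON object as the name suggests; the function name is hypothetical and the check says nothing about whether the certificates inside are still valid.

```python
import json


def acme_store_is_readable(path: str) -> bool:
    """Return True if the file parses as a non-empty JSON object."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file, or truncated/interleaved JSON.
        return False
    return isinstance(data, dict) and bool(data)
```

A file left half-written by two competing writers would typically fail to parse and return False here, which is a reasonable signal to restore from a backup rather than let traefik start from an unreadable store.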

At this point, I see two ways, using a shared volume or a distributed key/value store:

  • one would be to try connecting traefik to a distributed key/value store and configure it in HA mode: https://docs.traefik.io/user-guide/kv-config/
  • another would be to use the @janeczku letsencrypt rancher integration with traefik instead of the built-in acme integration.

What do you think??

rawmind0 avatar May 03 '17 17:05 rawmind0

@rawmind0 I kind of like the idea of using the kv store; your choice on which one: consul, etcd, zookeeper, or boltdb. I've got more experience with etcd and consul, but I'm open to the other two as well.

@mcnilz I have a fork with a branch here that can be added to a rancher and you can have /opt/traefik/acme as a volume.

I'd love feedback of any sort as I'm already using this method in a 'production' instance, for better or worse (much better now that I have acme.json persistent and not hitting the rate limits).

joshuacox avatar May 03 '17 18:05 joshuacox

@joshuacox, which kv store is not a problem; whichever one could work just fine... I guess they are all already in the rancher community catalog... :)

I took a look at your catalog and I think that it shouldn't work. Don't you have file permission issues? traefik runs as user traefik (UID 10001), while the volume would be owned by root....
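A rough way to check for the permission problem raised here, as a Python sketch. It uses the UID 10001 quoted above and only looks at the owner and the owner-write bit, so group or world write access and ACLs are ignored; the function name is hypothetical.

```python
import os
import stat

TRAEFIK_UID = 10001  # UID of the traefik user, as quoted in this comment


def writable_by_owner_uid(path: str, uid: int = TRAEFIK_UID) -> bool:
    """Return True if `path` is owned by `uid` and has the owner-write bit."""
    info = os.stat(path)
    return info.st_uid == uid and bool(info.st_mode & stat.S_IWUSR)
```

Run against the mounted /opt/traefik/acme directory, a False result would match the root-owned volume case described above.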

Anyway, just as advice, don't add the volume directly to the traefik container; it is bad practice. I think it is much better to provide the volume through another container and add it to the traefik volumes_from section.... ;)

Code for traefik.....(also add traefik-acme to traefik volumes_from section)

traefik-acme:
  net: none
  labels:
    io.rancher.scheduler.affinity:container_label_soft_ne: io.rancher.stack_service.name=$${stack_name}/$${service_name}
    io.rancher.container.hostname_override: container_name
    io.rancher.container.start_once: true
  environment:
    - SERVICE_UID=10001
    - SERVICE_GID=10001
    - SERVICE_VOLUME=/opt/traefik/acme
  volumes:
    - /opt/traefik/acme
  volume_driver: ${VOLUME_DRIVER}
  image: rawmind/alpine-volume:0.0.2-1

rawmind/alpine-volume is a container made to do this kind of thing easily. It creates the volume and sets $SERVICE_UID as owner....
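The behaviour described (create the volume, hand it to the service user) can be sketched in Python as below. This is only an illustration of the idea using the SERVICE_VOLUME, SERVICE_UID and SERVICE_GID variables from the compose snippet; it is not the image's actual script, and both function names are hypothetical.

```python
import os
from typing import Mapping


def prepare_volume(path: str, uid: int, gid: int) -> None:
    """Create the directory if needed and hand ownership to uid:gid."""
    os.makedirs(path, exist_ok=True)
    # Changing ownership to another user requires root inside the container.
    os.chown(path, uid, gid)


def prepare_from_env(env: Mapping[str, str] = os.environ) -> None:
    """Read the three SERVICE_* settings and prepare the volume."""
    prepare_volume(
        env["SERVICE_VOLUME"],
        int(env["SERVICE_UID"]),
        int(env["SERVICE_GID"]),
    )
```

With the values from the snippet above, `prepare_from_env()` would create /opt/traefik/acme and set its owner to 10001:10001, which is what lets the non-root traefik process write acme.json there.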

rawmind0 avatar May 03 '17 19:05 rawmind0

I've published a version with the optional acme volume from another container at my repo... I've used go templating in docker-compose.yml.tpl to create the traefik volume only if enable acme is true...

Take a look.. :)

https://github.com/rawmind0/service-catalog

rawmind0 avatar May 03 '17 19:05 rawmind0

Merged into community catalog.... https://github.com/rancher/community-catalog/pull/500

rawmind0 avatar May 03 '17 20:05 rawmind0

KV is the best option for production, but it's nice to have a volume for development.

I tried https://github.com/rawmind0/service-catalog but it's not creating any service. I think the reason is that I am still on rancher 1.3.5 (never change a running system). I manually added the volume to the traefik-conf sidekick.

mcnilz avatar May 04 '17 06:05 mcnilz

Hey folks,

sorry for waking this old issue up again.

I ran into the same issue as @joshuacox and I would also like to set up a kv-store for traefik. But tbh, I don't get how to configure the Traefik from the rancher community catalog along with a kv-storage like consul or zookeeper. The Traefik docs say that I have to edit the .toml for Traefik. But how am I supposed to do that with the "rancher catalog traefik"?

Could you guys give me a hint on how to do it?

Thank you so much!

Hermsi1337 avatar Nov 04 '17 05:11 Hermsi1337

@Hermsi1337 if you enable acme then the community catalog traefik will use the VOLUME_NAME and VOLUME_DRIVER variables to set up a file-based store for the acme file.

I've been using this for months with the rancher-nfs driver and a pair of traefik frontends (it should be noted that if you are using many frontends you definitely need to move up to consul or zookeeper).

I have a catalog where the defaults are exactly how I deploy on my personal cluster. I just updated that to use an enum to allow you to choose the traefik version from @rawmind0's dockerhub. I'm going to test 1.4.1-2 now; I've been using 1.3.6 up until this point.

Note that the catalog item is deprecated for 2.0, as the whole stack is retooling for kubernetes.

This includes the kubernetes version (in fact anything >1.39 for the kubernetes version)

So it might be best at this point to look at the helm template. I'd love to hear @rawmind0 opinion on the subject.

joshuacox avatar Nov 04 '17 14:11 joshuacox

Hey @joshuacox ,

many thanks for your answer.

I also thought about using NFS for sharing the certificates between my nodes, but I am worried about the security, since sending data via NFS over the internet is not secure at all. I tried to set up NFS with Kerberos, but somehow Rancher does not support Kerberos NFS mounts. How did you manage a "secure" file transfer via NFS?

Actually I don't want to move to Kubernetes since it is a bit oversized for my needs. Therefore it would be great if I could implement KV using Cattle.

I hope, that @rawmind0 has an idea.

Hermsi1337 avatar Nov 06 '17 10:11 Hermsi1337

Hi guys,

@joshuacox, Rancher v2.0 is in alpha state, and the catalog could eventually become outdated, but it is not deprecated; we'll update all packages to this version. In rancher v2.0 you could use helm packages if you like, but you could still use rancher catalog packages in compose style.

@Hermsi1337, if you want to share certificates, you could use nfs (not recommended for wan environments), but the best traefik approach would be to use a k/v store and configure it in ha mode. At the moment, this package doesn't support this configuration, just nfs. https://docs.traefik.io/configuration/acme/ https://docs.traefik.io/user-guide/cluster/

As an alternative, you could disable traefik acme support and use rancher "let's encrypt" integration published in the community-catalog https://github.com/rancher/community-catalog/tree/master/templates/letsencrypt https://www.digitalocean.com/community/tutorials/how-to-secure-your-rancher-web-app-with-let-s-encrypt-on-ubuntu-16-04

Best regards...

rawmind0 avatar Nov 08 '17 10:11 rawmind0

@rawmind0 What is the difference when using "let's encrypt" integration from community-catalog?

As far as I can see, there is also no support for kv. I'm only able to use the regular storage drivers like nfs and so on.

Am I missing a point?

Hermsi1337 avatar Nov 08 '17 13:11 Hermsi1337

@hermsi1337, the main difference is that with the rancher integration, all services can get letsencrypt certificates, not just traefik.

As I already wrote in my previous comment, this package doesn't support k/v configuration at the moment, just nfs.

rawmind0 avatar Nov 08 '17 13:11 rawmind0

@rawmind0, okay.. I think I will give that configuration a shot.

Nevertheless I would really love to use your traefik-setup along with kv-store. Do you have any plans on implementing stuff like that?

Hermsi1337 avatar Nov 08 '17 17:11 Hermsi1337

So busy.... Pull requests are very welcome.. ;)

rawmind0 avatar Nov 08 '17 17:11 rawmind0

@rawmind0 how do I make the rancher letsencrypt stack from the community catalog work with this?

jonahlau avatar Jun 01 '18 19:06 jonahlau

@jonahlau to make the letsencrypt stack work with this, traefik acme support needs to be disabled. Take a look at https://www.digitalocean.com/community/tutorials/how-to-secure-your-rancher-web-app-with-let-s-encrypt-on-ubuntu-16-04

rawmind0 avatar Jul 31 '18 15:07 rawmind0