acme-dns

HA Configuration

Open jwomackgsa opened this issue 4 years ago • 18 comments

Has anyone run acme-dns in a highly available config using the postgres DB? Before I go testing myself, I was just wondering if anyone had multiple instances of acme-dns running against the same PG db without issues?

jwomackgsa avatar May 05 '21 14:05 jwomackgsa

Yup, just did this and seems to work just fine.

laingsc avatar Sep 08 '21 00:09 laingsc

Same here. I'm running two instances against a PostgreSQL cluster backend, with a reverse proxy in front for HTTP load balancing. Authentication and updates go through one domain (acme.example.com), and DNS records are served under acme-dns.example.com. The NS records point to each server. Works great!

records = [
    # specify that each server will resolve any *.acme-dns.example.com records
    "acme-dns.example.com. NS acme-1.example.com.",
    "acme-dns.example.com. NS acme-2.example.com.",
]

Note that I removed the A record from the example config, as I'm using a separate name pointing at the web proxy for that. The web proxy requires authentication for registration requests. I had to tweak the Python script for certbot a little, but it wasn't too bad.

JonathanATyler avatar Sep 16 '21 03:09 JonathanATyler

@JonathanATyler Unfortunately, in my HA setup each instance acts on its own: the initial acme record is set in every instance individually, and an error occurs when reading the data. Could you please share your configuration?

ZPascal avatar Dec 01 '21 08:12 ZPascal

@ZPascal I can confirm your observations. The acme-dns service seems to load all TXT records from the database when it first starts, but it does not pick up new ones that another instance adds while the first instance is still running.

After the first instance is restarted, both instances serve the same records again.
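
One way to check this behaviour is to ask each instance directly for the same TXT name and compare the answers. The sketch below is a minimal, standard-library-only illustration and not part of acme-dns. The function names are hypothetical, and it assumes each instance answers plain UDP queries on port 53 and that responses fit in a single datagram.

```python
import secrets
import socket
import struct

TXT = 16  # DNS record type number for TXT


def build_txt_query(name, query_id):
    """Encode a single-question DNS query for the TXT records of `name`."""
    # Header: id, flags (plain query), 1 question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", TXT, 1)  # QTYPE=TXT, QCLASS=IN


def _skip_name(msg, pos):
    """Return the offset just past the (possibly compressed) name at `pos`."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer, always 2 bytes
            return pos + 2
        pos += 1 + length


def parse_txt_answers(msg):
    """Extract the TXT strings from the answer section of a DNS response."""
    qdcount, ancount = struct.unpack("!HH", msg[4:8])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    values = set()
    for _ in range(ancount):
        pos = _skip_name(msg, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        rdata = msg[pos:pos + rdlength]
        pos += rdlength
        if rtype == TXT:
            # RDATA holds one or more length-prefixed character strings.
            parts, i = [], 0
            while i < len(rdata):
                n = rdata[i]
                parts.append(rdata[i + 1:i + 1 + n].decode("ascii", "replace"))
                i += 1 + n
            values.add("".join(parts))
    return values


def txt_from_instance(server_ip, name, timeout=3.0):
    """Ask one acme-dns instance directly (UDP port 53) for TXT records."""
    query = build_txt_query(name, secrets.randbelow(65536))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server_ip, 53))
        response, _ = sock.recvfrom(4096)
    return parse_txt_answers(response)


def instances_in_sync(server_ips, name):
    """True when every instance returns the same set of TXT values."""
    answers = [txt_from_instance(ip, name) for ip in server_ips]
    return all(answer == answers[0] for answer in answers)
```

Calling `instances_in_sync` with the IPs of both instances and the full name of a freshly updated subdomain would return `False` if one instance were still serving a stale record.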

@JonathanATyler any chance of sharing your configuration with us?

p3l1 avatar Dec 07 '21 14:12 p3l1

Hi @ZPascal, @p3l1

Sorry for the delay. Below is my config. I have not experienced that issue myself so far, but I haven't tested it thoroughly either, since I haven't had any trouble getting my certs. I'm not sure exactly which version I'm using, probably whatever was available in September.

As a side note, I'm thinking of setting up a few more instances that only handle the HTTP/API side behind my DMZ (where it's safer), with only DNS on the DMZ side. That might give me more insight into the issues you're seeing, given that the DNS side won't be updating records directly. I will also have a look at the PostgreSQL logs to see if queries are actually going through, when I have time.

[general]
listen = "0.0.0.0:53"
protocol = "both"
domain = "acme-dns.example.com"
nsname = "acme-dns.example.com"
nsadmin = "[email protected]"
records = [
    "acme-dns.example.com. NS acme1.example.com.",
    "acme-dns.example.com. NS acme2.example.com.",
]
debug = false

[database]
engine = "postgres"
connection = "postgres://acme-dns:p@$$word@<postgres-server>/acme-dns"

[api]
ip = "0.0.0.0"
disable_registration = false
port = "8080"
tls = "none"
acme_cache_dir = "api-certs"
corsorigins = [
    "*"
]
use_header = true
header_name = "X-Forwarded-For"

[logconfig]
loglevel = "debug"
logtype = "stdout"
logformat = "text"

JonathanATyler avatar Dec 07 '21 15:12 JonathanATyler

@JonathanATyler Are you using TLS in production? With two different acme-dns servers, automatic certificate creation does not work correctly in my current setup, because the challenge may be answered by the wrong server.

Any ideas on how to solve this issue?

I am using the following DNS Configuration:

domain = "acme.customer.example.org"
nsname = "acme.customer.example.org"

records = [
    "acme01.customer.example.org. A 1.1.1.1",
    "acme02.customer.example.org. A 2.2.2.2",
    "acme.customer.example.org. A 1.1.1.1",
    "acme.customer.example.org. A 2.2.2.2",
    "acme.customer.example.org. NS acme01.customer.example.org",
    "acme.customer.example.org. NS acme02.customer.example.org",
]

p3l1 avatar Dec 13 '21 09:12 p3l1

@p3l1 I too had trouble getting auto-cert to work in that regard. This is all in a home lab at the moment, so I don't really worry about HTTPS internally. I use a reverse proxy to handle TLS for the web traffic and forward HTTP to the acme-dns servers on port 8080 (no TLS). In theory, you could request a cert through the proxy for acme, acme01 and acme02 and push it to the acme-dns servers, to the paths set in the config (below). I may revisit this when I have time, to see if I can do all this without a proxy, since that would remove the extra auth tweaking needed when requesting certs, but that won't be for a while.

tls = "cert"
tls_cert_privkey = "/etc/acme-dns/privkey.pem"
tls_cert_fullchain = "/etc/acme-dns/fullchain.pem"

All that said, I'm not really actively using this anymore anyway as all my domains are hosted at Linode and I use their ACME-DNS API for most of my requests now so I don't have to manually update DNS records on parent domains.

JonathanATyler avatar Dec 13 '21 14:12 JonathanATyler

@JonathanATyler Alright, I am going to ditch the second instance for now. Since the system does not affect the acme-dns clients directly, there shouldn't be a problem if the service is offline for a few hours, as long as the database is stored safely and can be recovered quickly.

I will pick up on the idea to get a certificate by using the reverse proxy though.

Thanks for your support :)

p3l1 avatar Dec 16 '21 10:12 p3l1

Hi @p3l1, I've successfully built an HA setup of the ACME DNS server. I created a graphic to describe my setup.

Basic setup:

  • 3 ACME DNS instances
  • DNS ports TCP 53 and UDP 53 opened (firewall protected)
  • API port 8443 opened (firewall protected)
  • Self-signed API certs between the load balancer and the API endpoints
  • PostgreSQL as database
  • Apache load balancer working as central API endpoint
  • Let's Encrypt cert for the load balancer
  • IP based firewall and basic auth for the API load balancer endpoint
  • Shared glusterfs storage folder between the ACME DNS instances to share the certs

[Image: ACME-DNS-Server]

ACME Configuration:

[general]
# DNS interface. Note that systemd-resolved may reserve port 53 on 127.0.0.53
# In this case acme-dns will error out and you will need to define the listening interface
# for example: listen = "127.0.0.1:53"
listen = "0.0.0.0:53"
# protocol, "both", "both4", "both6", "udp", "udp4", "udp6" or "tcp", "tcp4", "tcp6"
protocol = "both4"
# domain name to serve the requests off of
domain = "test.com"
# zone name server
nsname = "dns1.test.com,dns2.test.com,dns3.test.com"
# admin email address, where @ is substituted with .
nsadmin = "webmaster.test.com"
# predefined records served in addition to the TXT
records = [
    "test.com. A X.X.X.X",
    "acme.test.com. A X.X.X.X",

    "test.com. NS dns1.test.com.",
    "test.com. NS dns2.test.com.",
    "test.com. NS dns3.test.com.",
]
# debug messages from CORS etc
debug = false

[database]
# Database engine to use, sqlite3 or postgres
engine = "postgres"
connection = "postgres://test:[email protected]:5432/acme"

[api]
# listen ip eg. 127.0.0.1
ip = "144.91.86.56"
# disable registration endpoint
disable_registration = false
# listen port, eg. 443 for default HTTPS
port = "8443"
# possible values: "letsencrypt", "letsencryptstaging", "cert", "none"
tls = "cert"
tls_cert_privkey = "/home/acme-dns/certs/server.key"
tls_cert_fullchain = "/home/acme-dns/certs/server.crt"
# only used if tls = "letsencrypt"
#acme_cache_dir = "api-certs"
# optional e-mail address to which Let's Encrypt will send expiration notices for the API's cert
notification_email = "[email protected]"
# CORS AllowOrigins, wildcards can be used
corsorigins = []
# use HTTP header to get the client ip
use_header = true
# header name to pull the ip address / list of ip addresses from
header_name = "X-Forwarded-For"

[logconfig]
loglevel = "error"
logtype = "stdout"
logformat = "text"

Apache load balancer Configuration:

<VirtualHost *:80>
        ServerName acme.test.com
        ServerAdmin root@localhost
        DocumentRoot /var/www/html

        <Proxy balancer://cluster>
                BalancerMember https://X.X.X.X:8443
                BalancerMember https://Y.Y.Y.Y:8443
                BalancerMember https://Z.Z.Z.Z:8443
                ProxySet lbmethod=byrequests
        </Proxy>

        SSLProxyEngine on
        SSLProxyCACertificateFile /home/acme-dns/certs/ca.crt
        SSLProxyCheckPeerCN off

        <Location "/">
                deny from all
                allow from X.X.X.X
                allow from Y.Y.Y.Y
                allow from Z.Z.Z.Z

                AuthType Basic
                AuthName "ACME protection"
                AuthUserFile /usr/test/acme/.htpasswd
                require valid-user
        </Location>
        
        ProxyPass / balancer://cluster/
        ProxyPassReverse / balancer://cluster/
        
        ErrorLog /var/log/apache2/acmetest.log
        LogLevel warn
        CustomLog /var/log/apache2/acmetest.log combined
        ServerSignature Off
</VirtualHost>

I hope that helps and solves your problem. Feel free to contact me, if you need further details.

ZPascal avatar Jan 03 '22 22:01 ZPascal

@ZPascal Nice setup, glad you were able to sort it out, and thanks for sharing it :) When you say "Shared glusterfs storage folder between the ACME DNS instances to share the certs", is that just for the self-signed cert, or for acme-dns data as well? If it's for acme-dns data (so they are all in sync when requesting certs), what path are you "glustering"?

Cheers!

JonathanATyler avatar Jan 04 '22 01:01 JonathanATyler

Hi @JonathanATyler Thx :)

What do you mean by acme-dns data, the configuration file? I shared the complete /home/acme-dns folder as a glusterfs volume, and the configuration of the acme instances is outside that folder.

ZPascal avatar Jan 04 '22 07:01 ZPascal

@ZPascal In a previous message it was said "The acme-dns service seems to load all txt records in the database when it is first started, but does not add new ones, which were added by another instance while the first instance is still running". If the data is being stored on local disk first, I wondered if your setup accounted for that by using glusterfs. More likely, though, it is stored in memory first and flushed to the DB, but not read back from it unless the instance is restarted. Have you found that to still be the case? Or were you able to get all of them to resolve the same data across the cluster?

JonathanATyler avatar Jan 04 '22 13:01 JonathanATyler

@ZPascal Which ACME client are you using for Basic Authentication? Does acme.sh support it?

p3l1 avatar Jan 05 '22 14:01 p3l1

@p3l1 I've used the Python implementation and modified it. I've opened a gist to share my modifications. I think you mean the acme.sh script? With some modifications, it should be possible to add this functionality to that script as well.
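
To illustrate the idea (this is not the gist itself), a client talking to the acme-dns API through a load balancer protected by Basic Auth has to send two sets of credentials: the proxy's `Authorization` header and the acme-dns `X-Api-User` / `X-Api-Key` headers used by the `/update` endpoint. The following is a minimal sketch under those assumptions. The function names, the proxy credentials and the shape of the `account` dict (the fields returned by `/register`) are chosen here for illustration.

```python
import base64
import json
import urllib.request


def build_update_request(api_url, account, txt_value, proxy_user, proxy_password):
    """Build a POST /update request that also authenticates to the proxy.

    `account` is assumed to hold the "username", "password" and "subdomain"
    fields returned by the acme-dns /register endpoint.
    """
    credentials = f"{proxy_user}:{proxy_password}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = json.dumps({"subdomain": account["subdomain"], "txt": txt_value})
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/update",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # Checked by the load balancer (Basic Auth).
            "Authorization": f"Basic {basic}",
            # Checked by acme-dns itself.
            "X-Api-User": account["username"],
            "X-Api-Key": account["password"],
            "Content-Type": "application/json",
        },
    )


def update_txt(api_url, account, txt_value, proxy_user, proxy_password):
    """Send the update and return the decoded JSON response."""
    request = build_update_request(
        api_url, account, txt_value, proxy_user, proxy_password
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers without touching the network, which helps when debugging which layer (proxy or acme-dns) rejected a request.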

@JonathanATyler I will check and test that, and I'll post an answer in a few days.

ZPascal avatar Jan 05 '22 21:01 ZPascal

@ZPascal This is great! Thanks for sharing your implementation with us 😄

p3l1 avatar Jan 06 '22 13:01 p3l1

I am planning to add Basic Authentication to the official certbot-dns-acmedns plugin, so it can be used directly inside nginx-proxy-manager, which already has the current version of the acme-dns plugin implemented.

https://github.com/pan-net-security/certbot-dns-acmedns/issues/2

p3l1 avatar Jan 07 '22 09:01 p3l1

@JonathanATyler Sorry for the late reply. I could not see any problems with resolving the entries. In my test case, I created 3 TXT entries in parallel and these were written to the database and the ACME client could continue without issues. If you have further questions, feel free to contact me!
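
A parallel test like the one described could be scripted roughly as follows. This is only a sketch: `update_fn` is a hypothetical callable that sends one TXT update for one account (for example through the load balancer), and each account is assumed to be a dict with a "subdomain" key. The values are generated at 43 characters because acme-dns expects TXT values of that length, matching ACME challenge tokens.

```python
import secrets
from concurrent.futures import ThreadPoolExecutor


def random_txt_value():
    """Return a random 43-character base64url string, shaped like an ACME token."""
    return secrets.token_urlsafe(32)  # 32 random bytes encode to 43 characters


def update_in_parallel(update_fn, accounts):
    """Call update_fn(account, txt) concurrently for every account.

    Returns a dict mapping each subdomain to the TXT value that was sent,
    so the caller can later compare it with what each DNS instance serves.
    """
    values = {account["subdomain"]: random_txt_value() for account in accounts}
    with ThreadPoolExecutor(max_workers=len(accounts)) as pool:
        futures = [
            pool.submit(update_fn, account, values[account["subdomain"]])
            for account in accounts
        ]
        for future in futures:
            future.result()  # re-raises any exception from the worker
    return values
```

After the updates return, querying every instance for each subdomain and comparing against the returned dict would show whether all instances resolve the same data.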

ZPascal avatar Feb 07 '22 07:02 ZPascal

@ZPascal No worries, that was my experience as well. Thanks for confirming.

JonathanATyler avatar Feb 07 '22 14:02 JonathanATyler