consul-template
Simultaneous SSL update on all instances.
Hi!
We use consul-template with Vault PKI to issue SSL certificates for a MySQL Galera cluster. While testing with a short certificate TTL (15m), the Galera cluster crashed because all nodes regenerated their certificates at the same time (our reload script sends ALTER INSTANCE RELOAD TLS; each time a new certificate is issued).
We also hit the same issue with an Apache Kafka cluster (with SSL), where the TTL was 7 days. Admittedly, it happened only once in a month, but it did happen.
As a workaround, we shifted the TTL by 1 day for each subsequent node. That reduces the chance of a collision, but it's not a real fix.
My question is simple: is there a way to use a distributed lock (via Consul) to prevent all instances from updating certificates at the same time?
mysqld config x 3 instances
$ cat my.cnf
[client]
port = 33306
socket = /tmp/mysql.sock
default-character-set = utf8
[mysqld]
pxc_encrypt_cluster_traffic=ON
user = mysql
ssl-ca = /opt/mysql/tls/server/server-ca.pem
ssl-cert = /opt/mysql/tls/server/server-cert.pem
ssl-key = /opt/mysql/tls/server/server-key.pem
...
consul-template configs x 3 instances
$ cat conf/consul-template.hcl
vault {
  address = "https://127.0.0.1:8200"
  unwrap_token = false
  renew_token = true
  lease_renewal_threshold = 0.5
  ssl {
    enabled = true
    verify = true
    ca_path = "/opt/consul-template/tls/server-CA.cert"
    cert = "/opt/consul-template/tls/consul-template.cert"
    key = "/opt/consul-template/tls/consul-template.key"
    server_name = "127.0.0.1"
  }
}
# MYSQL
template {
  source = "/opt/consul-template/templates/mysql/server-ca.pem.tpl"
  destination = "/opt/mysql/tls/server/server-ca.pem"
  perms = 0640
  command = "/opt/consul-template/templates/mysql/reload.sh"
  error_on_missing_key = true
  left_delimiter = "[["
  right_delimiter = "]]"
}
template {
  source = "/opt/consul-template/templates/mysql/server-cert.pem.tpl"
  destination = "/opt/mysql/tls/server/server-cert.pem"
  perms = 0640
  command = "/opt/consul-template/templates/mysql/reload.sh"
  error_on_missing_key = true
  left_delimiter = "[["
  right_delimiter = "]]"
}
template {
  source = "/opt/consul-template/templates/mysql/server-key.pem.tpl"
  destination = "/opt/mysql/tls/server/server-key.pem"
  perms = 0640
  command = "/opt/consul-template/templates/mysql/reload.sh"
  error_on_missing_key = true
  left_delimiter = "[["
  right_delimiter = "]]"
}
$ cat /opt/consul-template/templates/mysql/reload.sh
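#!/bin/bash
set -eo pipefail
# Reload TLS only if the mysql client exists and the server socket is present.
if [[ -f /opt/mysql/current/bin/mysql && -S /tmp/mysql.sock ]]; then
  # With `set -eo pipefail`, a failing mysql call exits the script with its non-zero status.
  echo "ALTER INSTANCE RELOAD TLS;" | /opt/mysql/current/bin/mysql -u root -p'super-secure-password'
fi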
Why not add sleep $((1 + $RANDOM % 360)); to your mysql/reload.sh command? (Obviously you can adjust the 360 seconds to whatever suits your needs.) Your certificates will be rotated before they expire, so you do not have to reload right away when the new ones are generated.
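For reference, the random-delay suggestion could look roughly like the sketch below. The names jitter_seconds, reload_with_jitter and MAX_JITTER are made up for illustration; the mysql path, password placeholder and statement are taken from the reload.sh posted above. As noted in the follow-up, this only lowers the chance of a simultaneous reload and does not rule it out.

```shell
#!/bin/bash
# Sketch only: delay the TLS reload by a random number of seconds so that
# nodes which received new certificates together do not reload together.
# MAX_JITTER is a hypothetical knob, not a consul-template setting.
MAX_JITTER="${MAX_JITTER:-360}"

jitter_seconds() {
  # Print an integer in [1, max]; bash's $RANDOM is in 0..32767.
  local max="${1:-$MAX_JITTER}"
  echo $(( 1 + RANDOM % max ))
}

reload_with_jitter() {
  local delay
  delay="$(jitter_seconds "$MAX_JITTER")"
  sleep "$delay"
  echo "ALTER INSTANCE RELOAD TLS;" | /opt/mysql/current/bin/mysql -u root -p'super-secure-password'
}
```

reload.sh would call reload_with_jitter as its last step. One caveat worth checking against your version: as far as I know, the template block's command_timeout defaults to 30s, so a sleep of up to 360 seconds would likely need that timeout raised, otherwise consul-template may kill the command before the reload runs.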
Thank you for your reply, @komapa. Yes, it's a possible solution, but it guarantees nothing. We did something similar by offsetting each subsequent node's TTL by a fixed amount (an hour, a day), but that also won't protect us in the case of bad luck.
I think you pretty much said this?
command = "consul lock -child-exit-code /some/consul/path/prefix /opt/consul-template/templates/mysql/reload.sh"
See: https://developer.hashicorp.com/consul/commands/lock
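If the command line gets unwieldy inside the template block, the same idea can live in a small wrapper script. This is a minimal sketch: locked_reload, LOCK_PREFIX and RELOAD_SCRIPT are illustrative names, and the KV prefix is the placeholder from the command above, not a real path.

```shell
#!/bin/bash
# Sketch only: serialize the TLS reload across nodes with `consul lock`.
# LOCK_PREFIX is a placeholder KV prefix; every node must use the same one.
LOCK_PREFIX="${LOCK_PREFIX:-/some/consul/path/prefix}"
RELOAD_SCRIPT="${RELOAD_SCRIPT:-/opt/consul-template/templates/mysql/reload.sh}"

locked_reload() {
  # -child-exit-code makes `consul lock` return the child's exit status,
  # so consul-template still sees a failed reload as a failure.
  consul lock -child-exit-code "$LOCK_PREFIX" "$RELOAD_SCRIPT"
}
```

Note that the lock is only held while the child command runs. If reload.sh returns as soon as the statement is sent, the next node can start its reload immediately, so the script probably needs to wait until the node is healthy again before exiting.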