tailscale
FR: Cache tailscale cert and key data to kube state secret
What are you trying to do?
- When obtaining a cert via socket integration (e.g. Caddy), certs are stored in the state dir. With the default run.sh this is /tmp.
- When the pod is recycled, the state dir is emptied and the cert is requested again.
- If this happens too often, it hits the Let's Encrypt rate limit for that name.
How should we solve this?
- It would be great if the certificate and key data were cached in the kube state secret, similar to the node key, so that tailscaled can write the latest cert data there, then check for it and reuse it before requesting a new cert.
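To illustrate the "check before requesting" step, here is a minimal sketch, not how tailscaled works internally. It assumes the cached cert is a PEM file on disk and that the openssl CLI is available. The function name `cert_still_valid` and the 30-day default are hypothetical.

```shell
# Hypothetical helper: succeeds if the PEM cert at $1 exists and will still be
# valid $2 seconds from now (default 2592000 seconds, i.e. 30 days).
cert_still_valid() {
  cert_file=$1
  min_seconds_left=${2:-2592000}
  # A missing or empty file counts as "no usable cached cert"
  [ -s "$cert_file" ] || return 1
  # openssl exits 0 only if the cert does not expire within the given window
  openssl x509 -checkend "$min_seconds_left" -noout -in "$cert_file" >/dev/null 2>&1
}
```

A caller would reuse the cached cert when `cert_still_valid /path/to/cached.crt` succeeds and request a new one only when it fails.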
What is the impact of not solving this?
- Having to change the machine name.
- Creating a custom caching mechanism outside of tailscaled.
Anything else?
No response
Where's your node key secret being written to? Which Tailscale state storage are you using?
name = "TS_KUBE_SECRET"
value = "tailscale-state"
@bradfitz the default one, passed to the tailscale container on start via the env var TS_KUBE_SECRET=tailscale-state.
tailscaled starts with --state=kube:tailscale-state when used with the default run.sh.
Checking the Kubernetes secret that it writes to:
data:
  _daemon: REDACTED
  _machinekey: REDACTED
  _nl-node-key: REDACTED
I wonder if certs could be cached here too?
@bradfitz this is the main issue when certs are not cached.
If something goes wrong with the workload and causes a succession of cert requests with no cache check, this happens:
2022/10/11 23:26:42 cert("REDACTED.REDACTED.ts.net"): getCertPEM: 429 urn:ietf:params:acme:error:rateLimited: Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: REDACTED.REDACTED.ts.net, retry after 2022-10-12T17:32:02Z: see https://letsencrypt.org/docs/duplicate-certificate-limit/
This means we have to force a hostname change to get it going again. We are also running into this:
2022/10/12 01:09:34 cert("REDACTED.REDACTED.ts.net"): getCertPEM: acme.Register: 429 urn:ietf:params:acme:error:rateLimited: Error creating new account :: too many registrations for this IP: see https://letsencrypt.org/docs/too-many-registrations-for-this-ip/
Any ideas on managing this with Tailscale while this is not yet supported would be much appreciated.
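One possible stopgap, sketched under assumptions: read the retry time out of the duplicate-certificate error and wait until then, rather than renaming the machine. The helper name `retry_after` is hypothetical. It only covers log lines that contain a "retry after" timestamp, and the "too many registrations" error above does not.

```shell
# Hypothetical helper: print the "retry after" timestamp from a rate-limit
# log line read on stdin; prints nothing if the line has no such timestamp.
retry_after() {
  sed -n 's/.*retry after \([0-9-]*T[0-9:]*Z\).*/\1/p'
}
```

For example, piping the first log line above through `retry_after` yields the timestamp that follows "retry after", which a wrapper script could compare against the current time before asking for a cert again.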
In case this helps anyone else running into this on a k8s cluster, here is a workaround using the bitnami/kubectl image.
Create a sidecar that watches for cert changes and writes them to a secret:
# Checksum of the cert files (or of the error output if none exist yet)
oldsum=$(md5sum "${cert_path}"/* 2>&1 | base64)
while true; do
  sleep 10
  # Recompute the checksum
  newsum=$(md5sum "${cert_path}"/* 2>&1 | base64)
  # Nothing changed: keep polling
  [[ "$oldsum" == "$newsum" ]] && continue
  # Something changed: remember the new checksum and update the secret
  oldsum=$newsum
  # --dry-run=client renders the manifest locally; kubectl apply then creates or updates it.
  # The trailing "|| :" ignores failures so the loop keeps running.
  kubectl create secret generic tailscale-cert \
    --save-config \
    --dry-run=client \
    --from-file="${cert_path}/acme-account.key.pem" \
    --from-file="${cert_path}/${domain}.key" \
    --from-file="${cert_path}/${domain}.crt" \
    -o yaml | kubectl apply -f - || :
done
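The change detection in that loop can be tried out without a cluster. Below is a minimal sketch of one way to isolate it, not the exact logic above. The helper name `dir_checksum` is hypothetical, and unlike the loop it ignores md5sum errors instead of hashing them, so an empty directory also gives a stable value.

```shell
# Hypothetical helper: print a single checksum covering every regular file
# under directory $1.
dir_checksum() {
  # Hash each file, sort the per-file lines for a stable order, then hash that list
  find "$1" -type f -exec md5sum {} + 2>/dev/null | sort | md5sum | cut -d' ' -f1
}
```

The sidecar would then compare `dir_checksum "$cert_path"` between polls and update the secret only when the value changes.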
Then add an init container that loops through the files in the secret and writes them to the tailscale cert path:
mkdir -p "${cert_path}"
# Get the file data as pipe-delimited key|value lines
lines=$(kubectl get secret tailscale-cert -o jsonpath="{.data}" | jq -r 'keys[] as $k | "\($k)|\(.[$k])"')
# Loop through the lines and write the data to their original paths.
# ($${...} looks like template escaping for a literal ${...}, e.g. in Terraform; use ${...} in plain shell.)
for l in $lines; do
  data=$${l#*|}
  file=$${l%%|*}
  echo "$data" | base64 -d > "${cert_path}/$file" || :
done
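The decode step can also be checked locally without kubectl. This is a minimal sketch assuming the same "filename|base64data" line format as above. The helper name `restore_files` is hypothetical.

```shell
# Hypothetical helper: read "filename|base64data" lines on stdin and write
# each decoded file into directory $1.
restore_files() {
  dest=$1
  mkdir -p "$dest"
  # IFS='|' splits at the first pipe; base64 data never contains one
  while IFS='|' read -r file data; do
    [ -n "$file" ] || continue
    printf '%s' "$data" | base64 -d > "$dest/$file"
  done
}
```

In the init container, the output of the `kubectl get secret ... | jq -r ...` pipeline could be piped straight into `restore_files` with the cert path as its argument. Reading line by line avoids relying on word splitting of `$lines`.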
Similar issue: tsnet needs to store certificates in memory: https://github.com/tailscale/tailscale/issues/4597