[Feature] multiple replicas of headscale instances
Use case
Currently, there is no option for running Headscale in a high-availability configuration. If the single instance goes down, the whole tailnet is unreachable.
Description
Adding an option to run multiple Headscale servers would distribute the load and, provided every server syncs its config, keep the tailnet highly available. Are there any plans for such a feature?
Contribution
- [ ] I can write the design doc for this feature
- [ ] I can contribute this feature
How can it be implemented?
No response
I was able to run an HA setup of Headscale previously. It was in a Kubernetes environment, but it should be possible to replicate outside of it as well:
- use an external Postgres DB
- put a load balancer with support for sticky sessions in front of Headscale

That said, I don't think the control plane going down should affect the data path immediately, unless some endpoints change while it is unavailable.
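As a sketch, the external-database half of that setup is just a config change. The host, credentials, and paths below are placeholders, and the exact key names vary between Headscale versions; the load balancer in front would additionally need cookie- or source-IP-based stickiness:

```yaml
# Hypothetical fragment of /etc/headscale/config.yaml: point every
# replica at the same external Postgres instead of a local SQLite file.
# Host name and credentials are made-up examples.
database:
  type: postgres
  postgres:
    host: pg.internal.example.com
    port: 5432
    name: headscale
    user: headscale
    pass: change-me
```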
There was also an earlier issue about the same topic: https://github.com/juanfont/headscale/issues/100
The tailnet continues working if the control server is down: you can't connect new nodes, but nodes that are already connected keep working.
You can use an external DB and:
- round-robin DNS: assign a few IPs to your domain; not sure, but it may work (free; needs a domain)
- DynDNS (for home or SMB): the IP behind the domain is changed via an API request (cheap or free; needs a custom script; there is a lag while the DNS record is re-cached, but with a dynamic record you can set a TTL of 60, so the maximum control-plane downtime is about 60 seconds)
- virtual IP (for SMB/business): the IP can be assigned dynamically between a few nodes (more expensive: the IP costs roughly $5-$15 per month and bandwidth will be limited, but I believe more than enough for Headscale if you don't use it as a relay)
- external gateway (like Cloudflare): it can route requests to your server in many ways, and it has Cloudflared, a tunnel to your backends, if needed (free; needs a domain)

Anyway, it would be great to have a built-in HA mechanism. I believe it is not rocket science: just give the client a list of backends (this change would need to land in the client) and have it try them one by one.
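The "list of backends, try one by one" idea above can be sketched in a few lines. This is not Headscale or Tailscale code, just an illustration of the fallback logic; `pick_backend` and the injected health probe are hypothetical names:

```python
from typing import Callable, Optional, Sequence


def pick_backend(urls: Sequence[str],
                 is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first control-server URL whose health probe succeeds.

    `is_healthy` would typically issue an HTTP GET with a short timeout
    against the server; it is injected here so the fallback logic itself
    stays testable without a network.
    """
    for url in urls:
        try:
            if is_healthy(url):
                return url
        except Exception:
            # A probe that raises counts as "down": try the next backend.
            continue
    return None
```

For example, `pick_backend(["https://hs1.example.com", "https://hs2.example.com"], probe)` returns the first server that answers, or `None` if all are down.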
@tiberiuv @x1arch do you run headscale with multiple replicas (>= 2)?
Nope, but I don't see a problem with any of these variants, because at any given moment only one Headscale control server will be active.
PS: even if running two control servers at the same time is a problem, you can fix it with a script that checks your first server and starts the second one only if the first is down.
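The watchdog script suggested here (activate a standby only while the primary is down) can be sketched as follows. The poll interval, the health probe, and the start/stop hooks are all assumptions; in practice they might wrap `systemctl start/stop headscale` on the standby host:

```python
import time
from typing import Callable


def watchdog_step(primary_up: Callable[[], bool],
                  standby_running: bool,
                  start_standby: Callable[[], None],
                  stop_standby: Callable[[], None]) -> bool:
    """One iteration of the failover loop.

    Starts the standby only while the primary is unreachable, and stops
    it again once the primary recovers, so at most one control server
    is active at a time. Returns the new standby state.
    """
    if primary_up():
        if standby_running:
            stop_standby()   # primary is back: step aside
        return False
    if not standby_running:
        start_standby()      # primary is down: take over
    return True


def run(primary_up, start_standby, stop_standby, interval=10.0):
    """Poll forever, applying one watchdog step per interval."""
    running = False
    while True:
        running = watchdog_step(primary_up, running,
                                start_standby, stop_standby)
        time.sleep(interval)
```

Note this simple scheme has no fencing: if the health probe is wrong (e.g. a network partition between the two hosts), both servers could briefly run at once, which is exactly the split-brain risk mentioned later in this thread.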
Does the whole state reside in the DB? Are there any other files, apart from the main config.yaml, that need to be accounted for?
I personally lose the connection to my tailnet when the Headscale server goes down. I don't know how your peers stay connected and are able to communicate. It's a direct connection, no DERP.
It's really strange, because the nodes have direct connections to each other: when my Headscale goes down, the direct connections continue to work, not all, but most of them. I was sure the problem would be in DERP.
Right now I have stopped my Headscale server, and my uptime monitor continues to ping all hosts without any problems; the phone, too. Headscale is only the management server.
Ok, you might be right. Some peers stay online, while very few go offline, even though all of them are connected directly.
> adding an option for using multiple headscale servers to distribute the load as well as making sure every server syncs the config so the tailnet is highly available. are there any plans for such a feature?
The short answer is no. But I will break it down a little more and add some reasoning.
> making sure every server syncs
This quickly complicates the server, potentially by an order of magnitude, making simple bugs hard to debug and harder bugs even harder. It is feasible for some kinds of systems, particularly if you have a lot of developers, but as I see it, the net gain for a project like this just does not make sense.
In addition, it significantly complicates your runtime setup: you now need to ensure you have multiple replicas and databases, there may be split-brain issues, and recovery becomes harder.
What I think people should focus on in their strategy is:
The Tailscale client has a lot of redundancy built in. As mentioned above, if the client has an up-to-date map, everything generally continues to work as long as not too many nodes move at the same time. Even if the nodes on one side move, things can still work, since one node can reach the other to establish the connection. This has been built in since the beginning, and I believe Tailscale itself runs quite a simple setup where this is a key part of the strategy. That said, there might be differences between Headscale and Tailscale here, where we do not implement everything correctly and some of these things don't work; of course we should continue to improve that.
To the last point, this means that a 5-10 minute, or even an hour-long, outage should not be noticed much as long as you are not "changing the shape of your network": no large amounts of movement, new nodes, etc. Given that, instead of an HA setup, you can focus on a much simpler question: "how quickly can I recover or replace my server?"
A minimal setup for this should allow you to recover your Headscale from a backup in minutes:
- DNS or Virtual IP, point to a new VM
- Restore SQLite database and config from backup
- Install Headscale and start it up
You can back up SQLite easily / restore on start with https://litestream.io/reference/ or something like https://github.com/reneleonhardt/harmonylite (to replicate SQLite via nats.io).
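For reference, a minimal Litestream configuration for continuously replicating the Headscale SQLite file might look like this; the bucket name and file paths are placeholders:

```yaml
# /etc/litestream.yml: continuously stream the headscale DB
# to S3-compatible storage. Bucket and paths are examples.
dbs:
  - path: /var/lib/headscale/db.sqlite
    replicas:
      - url: s3://my-backup-bucket/headscale
```

On a fresh VM, `litestream restore -o /var/lib/headscale/db.sqlite s3://my-backup-bucket/headscale` pulls the latest replica back down before Headscale is started, which fits the recovery runbook above.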
There are also dqlite and rqlite. I think rqlite does not need code changes and can be used as a drop-in replacement.
I had been looking for a solution to the high-availability problem for a while. Over the weekend, I wrote up how to set up automatic replication of SQLite and automatic failover to a replica node. I posted it all on my blog: https://gawsoft.com/blog/headscale-litefs-consul-replication-failover/
@kradalby I'm a bit confused by the comments here and the general decision to support Postgres in a way that's overtly precarious (maintenance mode). I've been prototyping for a good-sized project over the last few weeks, one that would almost certainly lead to direct or indirect support (Headscale is the obvious place/base to start from, and early patches would probably be scalability-related), but it becomes a significantly tougher sell, both to my own good conscience and to the wider engineering org, when I have to explain that only SQLite is supported (a great DB, but quite challenging in stateless/ephemeral environments) and that there's little interest in HA either.
I guess what I'm trying to say is that this approach might be leaving money/time/skills/contributors/??? on the table. I 100% understand and respect the desire to Keep It Simple; alas, if I say out loud in a tech review that "the Headscale control plane could be down for minutes, plural, and this is an acceptable outcome because there is little appetite for fixing this upstream", Headscale is likely to be (reasonably?) dismissed, and Nice Things never happen.
Since I still feel that it's probably the best base for me to start from, what would it take to solidify your thinking around the handful of important things wrt improving core scalability? In my mind, the two most important pieces are full support for externalized database state and the ability to run multiple copies without much fuss.
I also understand there's likely to be other issues around rebuilding the "world map", and I'd hoped to help cross that bridge when the time comes; my fear is I'll not be allowed to cross said bridge because there will be no interest in what I have to offer.
What about reconsidering the distributed SQLite flavours proposed earlier by @dzervas?
The dqlite fork cowsql is happily used in production, e.g. by the Go-based Incus project.
I will also note that this is an important feature for me. Regardless of other considerations, it turns out that Headscale, in its current form, is a tool for hobby projects, not for production solutions, which is sad. (Either you use homemade solutions or you accept the risks.)
I'm reminded of Kafka, Postgres, and even Kubernetes itself, which all solve the high-availability problem successfully. Maybe it's worth looking for some elegant and simple approaches in the open-source community?
I even studied how to use leases for my experiments with operators in Kubernetes, and it doesn't look like rocket science... for a Kubernetes installation, of course. But without external database support from Headscale itself, it's unlikely that even a good "quick and dirty" solution can be implemented.