
[Feature] multiple replicas of headscale instances

Open thebigbone opened this issue 9 months ago • 15 comments

Use case

Currently, there is no option for running Headscale in a highly available way. If the single instance goes down, the whole tailnet is unreachable.

Description

Adding an option for running multiple Headscale servers would distribute the load and, as long as every server syncs the config, keep the tailnet highly available. Are there any plans for such a feature?

Contribution

  • [ ] I can write the design doc for this feature
  • [ ] I can contribute this feature

How can it be implemented?

No response

thebigbone avatar Jul 18 '25 15:07 thebigbone

I was able to run an HA setup of Headscale previously. It was in a Kubernetes environment, but it should be possible to replicate outside of it as well:

  • Use an external Postgres DB
  • Put a load balancer with support for sticky sessions in front of headscale
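A minimal sketch of that topology, assuming HAProxy as the load balancer (all names, addresses, and ports here are placeholders, and headscale itself would be pointed at the shared Postgres database):

```
frontend headscale_in
    bind *:443
    mode tcp
    default_backend headscale_pool

backend headscale_pool
    mode tcp
    # "Sticky" by source IP: each client keeps talking to the same replica,
    # so its long-lived map stream is not bounced between servers.
    balance source
    server hs1 10.0.0.11:8080 check
    server hs2 10.0.0.12:8080 check
```

TCP mode with source-IP balancing is one simple way to get stickiness without cookies; a cookie-based setup would also work if TLS terminates at the balancer.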

That said, I don't think the control plane going down should affect the data path immediately, unless some endpoints change while it's unavailable.

There was also an earlier issue about the same topic: https://github.com/juanfont/headscale/issues/100

tiberiuv avatar Jul 18 '25 19:07 tiberiuv

The tailnet continues working if the control server is down: you can't connect new nodes, but nodes that are already connected keep working.

You can use an external DB and:

  • Round-robin DNS: assign a few IPs to your domain. Not sure, but it may work (free; needs a domain).
  • DynDNS (for home or SMB): the IP for the domain is changed by an API request (cheap or free; needs a custom script; there is a time lag while the DNS cache expires, but for a dynamic record you can set TTL to 60, so the maximum control-plane downtime would be 60 seconds).
  • Virtual IP (for SMB/business): the IP can be assigned dynamically between a few nodes (expensive: you pay roughly $5-$15 per month for the IP, and bandwidth will be limited, but I believe that's more than enough for Headscale if you don't use it as a relay).
  • External gateway (like Cloudflare): it can route all requests to your server in many ways, and it has Cloudflared, a tunnel to your backends, if needed (free; needs a domain).
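The DynDNS option above boils down to a tiny decision function. Here is a hypothetical dry-run sketch (a real script would then send the planned update to the DNS provider's API; all names and addresses are placeholders):

```python
# Sketch of the DynDNS failover idea: when the primary control server
# stops answering health checks, plan a DNS update that repoints the
# A record (TTL 60) at a standby address.
def plan_failover(primary_healthy: bool, record: str, standby_ip: str):
    """Return the DNS record update to apply, or None if no action is needed."""
    if primary_healthy:
        return None
    return {"type": "A", "name": record, "content": standby_ip, "ttl": 60}

# Primary is down: plan to point the record at the standby.
print(plan_failover(False, "headscale.example.com", "203.0.113.20"))
# → {'type': 'A', 'name': 'headscale.example.com', 'content': '203.0.113.20', 'ttl': 60}
```

With TTL 60 as described above, the worst-case window between failover and clients seeing the new address is about a minute of cached DNS.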

Anyway, it would be great to have an embedded HA mechanism. I believe it isn't rocket science: just provide the client with a list of backends (this change needs to be applied on the client side) and try to connect to them one by one.
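The client-side idea in the last paragraph could look roughly like this. This is a sketch, not actual Tailscale client code; `probe` stands in for a real health check against each configured control URL:

```python
from typing import Callable, Iterable, Optional

def pick_control_server(urls: Iterable[str],
                        probe: Callable[[str], bool]) -> Optional[str]:
    """Try each configured control URL in order; use the first that answers."""
    for url in urls:
        if probe(url):
            return url
    return None  # no backend reachable

# Example with a fake probe where only the second backend is up:
backends = ["https://hs1.example.com", "https://hs2.example.com"]
print(pick_control_server(backends, lambda u: "hs2" in u))
# → https://hs2.example.com
```

The interesting design questions are hidden inside `probe` (timeouts, backoff, when to re-try the preferred backend), but the core loop really is this simple.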

x1arch avatar Jul 19 '25 03:07 x1arch

@tiberiuv @x1arch do you use Headplane with multiple replicas (>= 2)?

lucasfcnunes avatar Jul 19 '25 16:07 lucasfcnunes

Nope, but I don't see a problem with any of these variants, because at any given moment only one headscale control server will be active.

PS: even if running two control servers at the same time is a problem, you can work around it with a script that checks your first server and starts the second one only if the first is down.

x1arch avatar Jul 19 '25 18:07 x1arch

> The tailnet continues working if the control server is down: you can't connect new nodes, but nodes that are already connected keep working.
>
> You can use an external DB and:
>
>   • Round-robin DNS: assign a few IPs to your domain. Not sure, but it may work (free; needs a domain).
>   • DynDNS (for home or SMB): the IP for the domain is changed by an API request (cheap or free; needs a custom script; there is a time lag while the DNS cache expires, but for a dynamic record you can set TTL to 60, so the maximum control-plane downtime would be 60 seconds).
>   • Virtual IP (for SMB/business): the IP can be assigned dynamically between a few nodes (expensive: you pay roughly $5-$15 per month for the IP, and bandwidth will be limited, but I believe that's more than enough for Headscale if you don't use it as a relay).
>   • External gateway (like Cloudflare): it can route all requests to your server in many ways, and it has Cloudflared, a tunnel to your backends, if needed (free; needs a domain).
>
> Anyway, it would be great to have an embedded HA mechanism. I believe it isn't rocket science: just provide the client with a list of backends (this change needs to be applied on the client side) and try to connect to them one by one.

Does the whole state reside in the DB? Are there any other files, apart from the main config.yaml, that need to be accounted for?

thebigbone avatar Jul 20 '25 05:07 thebigbone

> Nope, but I don't see a problem with any of these variants, because at any given moment only one headscale control server will be active.
>
> PS: even if running two control servers at the same time is a problem, you can work around it with a script that checks your first server and starts the second one only if the first is down.

I personally lose my connection to the tailnet when the headscale server goes down. I don't know how your peers stay connected and are able to communicate. It's a direct connection, no DERP.

thebigbone avatar Jul 20 '25 05:07 thebigbone

> I personally lose my connection to the tailnet when the headscale server goes down. I don't know how your peers stay connected and are able to communicate. It's a direct connection, no DERP.

That's really weird, because the nodes have direct connections to each other. When my headscale goes down, the direct connections continue to work (not all of them, but most). I was sure the problem would be in DERP.

Right now I have stopped my headscale server, and my uptime monitor continues to ping all hosts without any problems; the phone does too. Headscale is only the management server.

x1arch avatar Jul 20 '25 18:07 x1arch

> > I personally lose my connection to the tailnet when the headscale server goes down. I don't know how your peers stay connected and are able to communicate. It's a direct connection, no DERP.
>
> That's really weird, because the nodes have direct connections to each other. When my headscale goes down, the direct connections continue to work (not all of them, but most). I was sure the problem would be in DERP.
>
> Right now I have stopped my headscale server, and my uptime monitor continues to ping all hosts without any problems; the phone does too. Headscale is only the management server.

OK, you might be right. Some peers stay online, while a few go offline, even though all of them are connected directly.

thebigbone avatar Jul 20 '25 18:07 thebigbone

> adding an option for using multiple headscale servers to distribute the load as well as making sure every server syncs the config so the tailnet is highly available. are there any plans for such a feature?

The short answer is no. But I will break it down a little more and add some reasoning.

> making sure every server syncs

This quickly complicates the server, potentially by an order of magnitude, making simple bugs hard to debug and harder bugs even harder. It is feasible for some kinds of systems, particularly if you have a lot of developers, but as I see it, for a project like this the net gain just does not make sense.

In addition, it significantly complicates your runtime setup: you now need to ensure you have multiple replicas and databases, there might be split-brain issues, and recovery becomes harder.

What I think people should focus on in their strategy is:

The Tailscale client has a lot of redundancy built in. As mentioned above, if the client has an up-to-date map, everything generally continues to work as long as not too many nodes move at the same time. Technically, even if the nodes on one side move, it could still work, since one node can reach the other to establish the connection. This has been built in since the beginning, and I believe Tailscale themselves run quite a simple setup where this is a key part of the strategy. That said, there might be differences between Headscale and Tailscale here, where we do not implement everything correctly and so not all of these things work, and of course we should continue to improve that.

To the last point, this means that a 5-10 minute, or even hour-long, outage should not be noticed too much, as long as you're not "changing the shape of your network" with too much movement, new nodes, etc. So instead of an HA setup, you can focus on a much simpler question: "how quickly can I recover or replace my server?"

Minimum setup for this should allow you to recover your Headscale from a backup in minutes:

  • DNS or a virtual IP that can be pointed at a new VM
  • Restore the SQLite database and config from backup
  • Install Headscale and start it up

kradalby avatar Jul 23 '25 09:07 kradalby

You can back up SQLite easily and restore it on start with https://litestream.io/reference/, or something like this project: https://github.com/reneleonhardt/harmonylite (to replicate SQLite via nats.io).
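For the Litestream route, a minimal replication config might look like the fragment below (bucket name and paths are placeholders, and the database path assumes a default-ish headscale install; see the Litestream reference for the full option set):

```
# /etc/litestream.yml: continuously replicate headscale's SQLite DB to S3.
dbs:
  - path: /var/lib/headscale/db.sqlite
    replicas:
      - type: s3
        bucket: headscale-backups
        path: db
```

On a fresh server, `litestream restore -o /var/lib/headscale/db.sqlite s3://headscale-backups/db` would then pull the latest replica back into place before starting headscale, which fits the "recover in minutes" strategy described above.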

blinkinglight avatar Aug 30 '25 22:08 blinkinglight

There are also dqlite and rqlite. I think rqlite does not need code changes and can be used as a drop-in replacement.

dzervas avatar Aug 31 '25 13:08 dzervas

> > adding an option for using multiple headscale servers to distribute the load as well as making sure every server syncs the config so the tailnet is highly available. are there any plans for such a feature?
>
> The short answer is no. But I will break it down a little more and add some reasoning.
>
> > making sure every server syncs
>
> This quickly complicates the server, potentially by an order of magnitude, making simple bugs hard to debug and harder bugs even harder. It is feasible for some kinds of systems, particularly if you have a lot of developers, but as I see it, for a project like this the net gain just does not make sense.
>
> In addition, it significantly complicates your runtime setup: you now need to ensure you have multiple replicas and databases, there might be split-brain issues, and recovery becomes harder.
>
> What I think people should focus on in their strategy is:
>
> The Tailscale client has a lot of redundancy built in. As mentioned above, if the client has an up-to-date map, everything generally continues to work as long as not too many nodes move at the same time. Technically, even if the nodes on one side move, it could still work, since one node can reach the other to establish the connection. This has been built in since the beginning, and I believe Tailscale themselves run quite a simple setup where this is a key part of the strategy. That said, there might be differences between Headscale and Tailscale here, where we do not implement everything correctly and so not all of these things work, and of course we should continue to improve that.
>
> To the last point, this means that a 5-10 minute, or even hour-long, outage should not be noticed too much, as long as you're not "changing the shape of your network" with too much movement, new nodes, etc. So instead of an HA setup, you can focus on a much simpler question: "how quickly can I recover or replace my server?"
>
> Minimum setup for this should allow you to recover your Headscale from a backup in minutes:
>
>   • DNS or a virtual IP that can be pointed at a new VM
>   • Restore the SQLite database and config from backup
>   • Install Headscale and start it up

I had been looking for a solution to the high-availability problem for a while. Over the weekend, I wrote up how to set up automatic replication of SQLite and automatic failover to a replica node. I posted it all on my blog: https://gawsoft.com/blog/headscale-litefs-consul-replication-failover/

gawsoftpl avatar Sep 08 '25 00:09 gawsoftpl

@kradalby I'm a bit confused by the comments here and by the general decision to support Postgres in a way that's overtly precarious (maintenance mode). I've been prototyping for a good-sized project over the last few weeks, one that would almost certainly lead to direct or indirect support work (Headscale is the obvious place/base to start from, and early patches would probably be scalability related). But it becomes a significantly tougher sell, to both my own good conscience and the wider engineering org, when I have to explain that only SQLite is supported (a great DB, but quite challenging in stateless/ephemeral environments) and that there's little interest in HA either.

I guess what I'm trying to say is that this approach might be leaving money/time/skills/contributors/??? on the table. I 100% understand and respect the desire to Keep It Simple; alas, if I say out loud in a tech review that "the Headscale control plane could be down for minutes plural, and this is an acceptable outcome because there is little appetite for fixing this upstream", Headscale is likely to be (reasonably?) dismissed, and Nice Things never happen.

Since I still feel it's probably the best base for me to start from, what would it take to solidify your thinking around the handful of important things with respect to improving core scalability? In my mind, the two most important pieces are full support for externalized database state and the ability to run multiple copies without much fuss.

I also understand there are likely to be other issues around rebuilding the "world map", and I'd hoped to help cross that bridge when the time comes; my fear is I won't be allowed to cross said bridge because there will be no interest in what I have to offer.

anthonyrisinger avatar Oct 11 '25 05:10 anthonyrisinger

What about reconsidering the distributed SQLite flavours proposed earlier by @dzervas?

The dqlite fork cowsql is happily used in production, e.g. by the Go-based Incus project.

almereyda avatar Oct 15 '25 15:10 almereyda

I will also note that this is an important feature for me. Regardless of other considerations, it turns out that Headscale, in its current form, is a tool for hobby use, not for production solutions, which is sad. (Either you use homemade solutions or you accept the risks.)

I think of Kafka, Postgres, and even Kubernetes itself, which all successfully solve the high-availability problem. Maybe it's worth looking for some elegant and simple approaches in the open-source community?

I even studied how to use leases for my experiments with operators in k8s, and it doesn't look like rocket science... for a k8s installation, of course. But without external database support from Headscale itself, it's unlikely that even a good "quick and dirty" solution can be implemented.

ksemele-public avatar Nov 06 '25 15:11 ksemele-public