headscale
No connections reported after upgrading to version v0.23.0 from version v0.22.3
Is this a support request?
- [X] This is not a support request
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
OS: Debian GNU/Linux 12 (bookworm) x86_64
Once updated all tailscale nodes show offline (Connected: offline) when running
sudo headscale nodes list
and
https://headscale.domainname.com/windows
is also inaccessible (I am using tls_letsencrypt_challenge_type: TLS-ALPN-01)
The service runs on a restart
sudo systemctl restart headscale.service
I have migrated my config file to align with your new example config. I have migrated my acl.yml policy file to the new huJSON format (acl.hujson). I have tried disabling the use of ACLs by setting path: "" under the policy. I have tried preventing Headscale from managing DNS by setting all fields under dns to empty values. I’ve also tried disabling UFW. I am using the latest version of Tailscale on all of my nodes.
However, all my attempts have failed. When I roll back to v0.22.3, everything works.
Are there any known issues with using v0.23.0 on Debian 12? Please could you suggest where I might be going wrong?
Thanks in advance.
Expected Behavior
To continue working once upgraded to v0.23.0
Steps To Reproduce
Install version v0.23.0 on Debian 12
Environment
- OS: Debian GNU/Linux 12 (bookworm) x86_64
- Headscale version: v0.23.0
- Tailscale version: 1.74.1
Runtime environment
- [ ] Headscale is behind a (reverse) proxy
- [ ] Headscale runs in a container
Anything else?
No response
The service runs on a restart sudo systemctl restart headscale.service
Can you paste the output of sudo systemctl status headscale.service and sudo journalctl -u headscale.service -f please?
Probably, headscale is not running/not listening, can you verify with sudo ss -tlen please?
Are there any known issues with using v0.23.0 on Debian 12?
No, at least I'm not aware of it.
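As a complement to sudo ss -tlen, a quick way to confirm from another machine that something is accepting connections on the expected port is a plain TCP connect. This is only a minimal sketch: the function name is_listening is made up for illustration, and the host and port are whatever your listen_addr resolves to. Note that this is a connect test, not a listing of listening sockets like ss.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_listening("headscale.domainname.com", 443) returning False would be consistent with headscale not running or not listening on that port, although a firewall in between can produce the same result.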
I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.
The service is dead and in the log it looks like this:
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Failed to fetch machine from the database with node key: nodekey:abc... handler=NoisePollNetMap
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR error getting routes error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"
and
Oct 09 07:55:18 maja systemd[1]: headscale.service: State 'stop-sigterm' timed out. Killing.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Killing process 624 (headscale) with signal SIGKILL.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed to kill control group /system.slice/headscale.service, ignoring: Invalid argument
Oct 09 07:55:18 maja systemd[1]: headscale.service: Main process exited, code=killed, status=9/KILL
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed with result 'timeout'.
Oct 09 07:55:18 maja systemd[1]: Stopped headscale.service - headscale coordination server for Tailscale.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Consumed 2h 39min 9.793s CPU time, 57.5M memory peak, 0B memory swap peak.
I did not replace my current config file as this was the default option.
I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.
What happens if you restart headscale?
root@maja:/home/sysman# systemctl restart headscale
root@maja:/home/sysman# systemctl status headscale
headscale.service - headscale coordination server for Tailscale
     Loaded: loaded (/usr/lib/systemd/system/headscale.service; disabled; preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Wed 2024-10-09 09:39:11 UTC; 2s ago
    Process: 891 ExecStart=/usr/bin/headscale serve (code=exited, status=1/FAILURE)
   Main PID: 891 (code=exited, status=1/FAILURE)
        CPU: 31ms

Oct 09 09:39:11 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:39:11 maja systemd[1]: headscale.service: Failed with result 'exit-code'.
headscale.service: Failed with result 'exit-code'.
and the corresponding logs from the journal?
I guess this is what you are asking for:
Oct 09 09:53:59 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:53:59 maja systemd[1]: headscale.service: Failed with result 'exit-code'.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Scheduled restart job, restart counter is at 177.
Oct 09 09:54:04 maja systemd[1]: Started headscale.service - headscale coordination server for Tailscale.
Oct 09 09:54:04 maja headscale[2221]: 2024-10-09T09:54:04Z FTL
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.magic_dns" configuration key is deprecated. Please use "dns.magic_dns" instead. "dns_config.magic_dns" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.base_domain" configuration key is deprecated. Please use "dns.base_domain" instead. "dns_config.base_domain" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.nameservers" configuration key is deprecated. Please use "dns.nameservers.global" instead. "dns_config.nameservers" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.domains" configuration key is deprecated. Please use "dns.search_domains" instead. "dns_config.domains" has been removed.
Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:54:04 maja systemd[1]: headscale.service: Failed with result 'exit-code'.
I just read the changelog. I will probably solve this by myself. I will comment here how it goes.
Yes. It seems the configuration needs to be adjusted for 0.23: Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
I took the sample config and carried over the changes from my old one.
Now it works perfectly.
Sorry to bother you.
Thanks for an excellent product!
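For anyone else upgrading, the key renames reported in the log above can be checked before restarting the service. The following is only a rough sketch: the mapping is taken solely from the log messages quoted in this thread (the changelog is the authoritative list), the function names are made up for illustration, and the parser handles only simple block-style YAML mappings rather than full YAML.

```python
import re

# Deprecated key -> replacement, copied from the log messages in this thread.
# None means the log reports the key as removed without naming a replacement.
DEPRECATED = {
    "dns_config.override_local_dns": None,
    "dns_config.magic_dns": "dns.magic_dns",
    "dns_config.base_domain": "dns.base_domain",
    "dns_config.nameservers": "dns.nameservers.global",
    "dns_config.domains": "dns.search_domains",
    "acl_policy_path": "policy.path",
    "dns.use_username_in_magic_dns": None,
}

# Matches "key:" at the start of a line, capturing the indentation.
KEY_RE = re.compile(r"^(\s*)([A-Za-z0-9_]+):")


def dotted_keys(yaml_text):
    """Yield dotted key paths (e.g. "dns_config.magic_dns") from block-style YAML."""
    stack = []  # (indent, key) pairs for the current nesting path
    for line in yaml_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = KEY_RE.match(line)
        if not match:
            continue  # list items and anything else this sketch does not parse
        indent, key = len(match.group(1)), match.group(2)
        # Leave any mappings that are at the same or deeper indentation.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, key))
        yield ".".join(k for _, k in stack)


def find_deprecated(yaml_text):
    """Return (old_key, replacement_or_None) for each deprecated key present."""
    return [(k, DEPRECATED[k]) for k in dotted_keys(yaml_text) if k in DEPRECATED]
```

Calling find_deprecated(open("/etc/headscale/config.yml").read()) on a 0.22.x configuration should list the keys that need to be moved. Of the keys above, only acl_policy_path was logged as FATAL in this thread; the others were logged as WARN.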
Hi @nblock
This is the output of sudo journalctl -u headscale.service -f
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: WARN: The "dns.use_username_in_magic_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Setting up a DERPMap update worker frequency=86400000
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Enabling remote gRPC at 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving gRPC on: 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving HTTP on: 0.0.0.0:8080
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving debug and metrics on: 0.0.0.0:9090
This line appears to have been the issue:
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443
So in the configuration file /etc/headscale/config.yml
changing
# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:8080
to
# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:443
Solved the connection error and now all nodes are connected.
In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
There seems to be an issue with your ACL, too.
In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?
I don't know, @kradalby what do you think? As per https://github.com/juanfont/headscale/issues/2164#issuecomment-2391011341 it is strongly recommended to use HTTPS on 443.
Thanks @nblock for pointing out the ACL. After searching the internet I cannot find any other reference to users setting headscale to listen on 0.0.0.0:443. Could this be related to the use of tls_letsencrypt_challenge_type: TLS-ALPN-01?
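On the TLS-ALPN-01 question: that challenge type is validated over TLS on port 443, which matches the warning quoted earlier. The condition described by that warning can be sketched as below. This mirrors the wording of the log message only; it is not headscale's actual implementation, and the function name is hypothetical. The parameters correspond to the listen_addr, tls_letsencrypt_challenge_type and tls_letsencrypt_hostname configuration keys.

```python
def alpn_port_warning(listen_addr, challenge_type, letsencrypt_hostname):
    """Return a warning string if TLS-ALPN-01 is set but listen_addr is not on :443."""
    # The warning only applies when Let's Encrypt is in use with TLS-ALPN-01.
    if not letsencrypt_hostname or challenge_type != "TLS-ALPN-01":
        return None
    # Take everything after the last colon as the port.
    port = listen_addr.rsplit(":", 1)[-1]
    if port != "443":
        return f"listen_addr {listen_addr} should probably end in :443 for TLS-ALPN-01"
    return None
```

With listen_addr set to 0.0.0.0:8080 this returns a warning, and with 0.0.0.0:443 it returns None, which lines up with the change that fixed the connection error above.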
I ran into the same issue:
WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.
That is of course fine, since it is stated in the changelog, and I fixed it. However, if the service refuses to start, please report it as an error or critical message instead of a warning, as that would speed up troubleshooting.
So it would be nice if it stated:
ERR: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.
or
CRIT: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.
A warning suggests that you should take a look, not that you must.
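Until the log levels change, one workaround is to filter the journal output down to the lines that carry an error or fatal marker, so they are not buried among the deprecation warnings. This is a minimal sketch: the function name is made up, and the level markers (FTL, FATAL, ERR) are simply the ones that appear in the logs quoted in this thread.

```python
import re

# Level markers seen in this thread's logs for errors and fatal failures.
SEVERE = re.compile(r"\b(FTL|FATAL|ERR)\b")


def severe_lines(journal_text):
    """Return only the journal lines that contain an error or fatal level marker."""
    return [line for line in journal_text.splitlines() if SEVERE.search(line)]
```

Feeding it the saved output of sudo journalctl -u headscale.service would, for the log posted earlier, leave the FTL line and the FATAL line about acl_policy_path and drop the WARN lines.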
I also get the error No IPs found with the alias. ACL is the same.
It seems to happen with a rule like { "action": "accept", "src": ["user"], "dst": ["user:*"] }, which allows users to access their own devices.
This issue is stale because it has been open for 90 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.