
No connections reported after upgrading to version v0.23.0 from version v0.22.3

Open simonlock opened this issue 1 year ago • 12 comments

Is this a support request?

  • [X] This is not a support request

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

OS: Debian GNU/Linux 12 (bookworm) x86_64

Once updated, all Tailscale nodes show as offline (Connected: offline) when running

sudo headscale nodes list

and

https://headscale.domainname.com/windows

is also inaccessible (I am using tls_letsencrypt_challenge_type: TLS-ALPN-01)

The service runs on a restart

sudo systemctl restart headscale.service

I have migrated my config file to align with your new example config. I have migrated my acl.yml policy file to the new huJSON format (acl.hujson). I have tried disabling the use of ACLs by setting path: "" under the policy. I have tried preventing Headscale from managing DNS by setting all fields under dns to empty values. I’ve also tried disabling UFW. I am using the latest version of Tailscale on all of my nodes.

However, all my attempts have failed. When I roll back to v0.22.3, everything works.

Are there any known issues with using v0.23.0 on Debian 12? Please could you suggest where I might be going wrong?

Thanks in advance.

Expected Behavior

To continue working once upgraded to v0.23.0

Steps To Reproduce

Install version v0.23.0 on Debian 12

Environment

- OS: Debian GNU/Linux 12 (bookworm) x86_64
- Headscale version: v0.23.0
- Tailscale version: 1.74.1

Runtime environment

  • [ ] Headscale is behind a (reverse) proxy
  • [ ] Headscale runs in a container

Anything else?

No response

simonlock avatar Oct 09 '24 00:10 simonlock

The service runs on a restart sudo systemctl restart headscale.service

Can you paste the output of sudo systemctl status headscale.service and sudo journalctl -u headscale.service -f please?

Probably, headscale is not running/not listening, can you verify with sudo ss -tlen please?

Are there any known issues with using v0.23.0 on Debian 12?

No, at least I'm not aware of it.

nblock avatar Oct 09 '24 04:10 nblock

I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.

The service is dead and the log looks like this:

Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Failed to fetch machine from the database with node key: nodekey:abc... handler=NoisePollNetMap
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR error getting routes error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"

and

Oct 09 07:55:18 maja systemd[1]: headscale.service: State 'stop-sigterm' timed out. Killing.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Killing process 624 (headscale) with signal SIGKILL.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed to kill control group /system.slice/headscale.service, ignoring: Invalid argument
Oct 09 07:55:18 maja systemd[1]: headscale.service: Main process exited, code=killed, status=9/KILL
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed with result 'timeout'.
Oct 09 07:55:18 maja systemd[1]: Stopped headscale.service - headscale coordination server for Tailscale.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Consumed 2h 39min 9.793s CPU time, 57.5M memory peak, 0B memory swap peak.

I did not replace my current config file as this was the default option.

matsstralbergiis avatar Oct 09 '24 08:10 matsstralbergiis

I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.

What happens if you restart headscale?

nblock avatar Oct 09 '24 09:10 nblock

root@maja:/home/sysman# systemctl restart headscale
root@maja:/home/sysman# systemctl status headscale
headscale.service - headscale coordination server for Tailscale
    Loaded: loaded (/usr/lib/systemd/system/headscale.service; disabled; preset: enabled)
    Active: activating (auto-restart) (Result: exit-code) since Wed 2024-10-09 09:39:11 UTC; 2s ago
    Process: 891 ExecStart=/usr/bin/headscale serve (code=exited, status=1/FAILURE)
    Main PID: 891 (code=exited, status=1/FAILURE)
    CPU: 31ms

Oct 09 09:39:11 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:39:11 maja systemd[1]: headscale.service: Failed with result 'exit-code'.

matsstralbergiis avatar Oct 09 '24 09:10 matsstralbergiis

headscale.service: Failed with result 'exit-code'.

and the corresponding logs from the journal?

nblock avatar Oct 09 '24 09:10 nblock

I guess this is what you are asking for:

Oct 09 09:53:59 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:53:59 maja systemd[1]: headscale.service: Failed with result 'exit-code'.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Scheduled restart job, restart counter is at 177.
Oct 09 09:54:04 maja systemd[1]: Started headscale.service - headscale coordination server for Tailscale.
Oct 09 09:54:04 maja headscale[2221]: 2024-10-09T09:54:04Z FTL
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.magic_dns" configuration key is deprecated. Please use "dns.magic_dns" instead. "dns_config.magic_dns" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.base_domain" configuration key is deprecated. Please use "dns.base_domain" instead. "dns_config.base_domain" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.nameservers" configuration key is deprecated. Please use "dns.nameservers.global" instead. "dns_config.nameservers" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.domains" configuration key is deprecated. Please use "dns.search_domains" instead. "dns_config.domains" has been removed.
Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:54:04 maja systemd[1]: headscale.service: Failed with result 'exit-code'.

matsstralbergiis avatar Oct 09 '24 09:10 matsstralbergiis

I just read the changelog. I will probably solve this by myself. I will comment here how it goes.

matsstralbergiis avatar Oct 09 '24 10:10 matsstralbergiis

Yes. It seems the configuration needs to be adjusted for 0.23:

Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
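
For anyone checking an old config before upgrading, the renames reported in the log messages in this thread can be gathered into a small lookup. This is only a sketch: the helper names `has_key` and `find_deprecated_keys` are hypothetical, the table contains only the keys quoted in this issue (the changelog is the authoritative list), and it assumes the config has already been loaded into a nested dict by whatever YAML loader you use.

```python
# Renames taken from the headscale 0.23.0 log messages quoted in this thread.
# None means the log reported the key as removed without naming a replacement.
RENAMED_KEYS = {
    "dns_config.override_local_dns": None,
    "dns_config.magic_dns": "dns.magic_dns",
    "dns_config.base_domain": "dns.base_domain",
    "dns_config.nameservers": "dns.nameservers.global",
    "dns_config.domains": "dns.search_domains",
    "dns.use_username_in_magic_dns": None,
    "acl_policy_path": "policy.path",
}


def has_key(config, dotted_key):
    """Return True if the nested dict `config` contains `dotted_key`."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def find_deprecated_keys(config):
    """Hypothetical helper: list (old_key, new_key) pairs found in `config`."""
    return [(old, new) for old, new in RENAMED_KEYS.items() if has_key(config, old)]
```

Calling `find_deprecated_keys` on a loaded 0.22.x config returns the keys that still need to be moved or dropped; an empty list means none of the keys mentioned here are present.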

nblock avatar Oct 09 '24 10:10 nblock

I took the sample config and applied the changes from my old one.

Now it works perfectly.

Sorry to bother you.

Thanks for an excellent product!

matsstralbergiis avatar Oct 09 '24 10:10 matsstralbergiis

Hi @nblock

This is the output of sudo journalctl -u headscale.service -f

Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: WARN: The "dns.use_username_in_magic_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Setting up a DERPMap update worker frequency=86400000
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Enabling remote gRPC at 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving gRPC on: 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving HTTP on: 0.0.0.0:8080
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving debug and metrics on: 0.0.0.0:9090

This line appears to have been the issue:

Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443

So in the configuration file /etc/headscale/config.yml

changing

# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:8080

to

# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:443

Solved the connection error and now all nodes are connected.
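
As a rough illustration of what that warning checks, here is a minimal sketch. The function name is hypothetical and this is not headscale's own code; it only mirrors the heuristic in the log message ("listen_addr should probably end in :443"), so a setup that forwards port 443 to another port would be flagged even though it may work.

```python
def listen_port_matches_tls_alpn(listen_addr, challenge_type):
    """Return False when TLS-ALPN-01 is used but listen_addr does not end in :443."""
    if challenge_type != "TLS-ALPN-01":
        # The warning quoted above only concerns the TLS-ALPN-01 challenge type.
        return True
    # Take the text after the last colon as the port.
    return listen_addr.rsplit(":", 1)[-1] == "443"
```

With the values from my config, `listen_port_matches_tls_alpn("0.0.0.0:8080", "TLS-ALPN-01")` is False and `listen_port_matches_tls_alpn("0.0.0.0:443", "TLS-ALPN-01")` is True.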

In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?

simonlock avatar Oct 09 '24 19:10 simonlock

Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user

There seems to be an issue with your ACL, too.

In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?

I don't know, @kradalby what do you think? As per https://github.com/juanfont/headscale/issues/2164#issuecomment-2391011341 it is strongly recommended to use HTTPS on 443.

nblock avatar Oct 10 '24 04:10 nblock

Thanks @nblock for pointing out the ACL. After searching the internet, I cannot find any other reference to users setting headscale to listen on 0.0.0.0:443. Could this be related to the use of tls_letsencrypt_challenge_type: TLS-ALPN-01?

simonlock avatar Oct 12 '24 15:10 simonlock

I ran into the same issue:

WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

That is of course fine, since it is stated in the changelog, and I fixed it. However, if the service will refuse to start, please log it as an error or critical instead of a warning, as that would speed up troubleshooting.

So it would be nice if it stated:

ERR: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

or

CRIT: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

A warning suggests that you should take a look, not that you must take a look.

devz3r0 avatar Nov 03 '24 11:11 devz3r0

I also get the error No IPs found with the alias. ACL is the same. It seems to happen with a rule like { "action": "accept", "src": ["user"], "dst": ["user:*"] }, which allows users to access their own devices.

dgrr avatar Nov 25 '24 16:11 dgrr

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] avatar Feb 24 '25 02:02 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Mar 03 '25 02:03 github-actions[bot]