headscale
[Bug] Group-based ACL not working for local users
Is this a support request?
- [x] This is not a support request
Is there an existing issue for this?
- [x] I have searched the existing issues
Current Behavior
In my ACL, I have multiple groups. All of the usernames end in an @, like these:
"group:/Benutzer": ["maprambo@"],
"group:server": ["prod@"],
The first group and all its rules are respected and work; the second group is used in the same rules and does not work. The users differ in one respect: maprambo comes from OIDC with preferred_username="maprambo", while prod is a local user with username="prod" (both without @).
Both look the same in the users table:
| ID | Name | Username | Created |
|---|---|---|---|
| 3 | | prod | 2023-09-14 21:32:34 |
| 11 | Full Name | maprambo | 2025-05-14 18:21:25 |
The hosts from the user prod accept the rules when they are assigned a tag that is also listed as a source (see below) -- but also only after a logout and login.
Expected Behavior
Both groups and their rules should be working, no matter if the user is from OIDC or local
Steps To Reproduce
- Create one local user
- Create one OIDC user
- Add both users to a group
- Use the group in ACL
Environment
- OS: docker image
- Headscale version: 0.26.0
- Tailscale version: 1.84.0
Runtime environment
- [x] Headscale is behind a (reverse) proxy
- [x] Headscale runs in a container
Debug information
ACL:
{
"groups": {
"group:/Benutzer": ["maprambo@"],
"group:server": ["prod@"],
},
"tagOwners": {
"tag:benutzer-router": ["group:/Administrator/Netzwerk"],
},
"Hosts": {
"revproxy.internal": "10.7.2.7/32",
},
"acls": [
{
"action": "accept",
"src": [
"group:server",
"tag:benutzer-router", // <-- with this tag assigned, the nodes from the group server get this rule and its route
"group:/Benutzer"
],
"dst": [
"revproxy.internal:80,443,3023",
"tag:int-revproxy:80,443,3023"
],
},
],
}
Ran into the same issue, and I'm using local users exclusively (so it probably has nothing to do with OIDC users being present). Using tags works for the time being, but any ACL rule whose source involves groups containing local users doesn't seem to be functional. Unfortunately, since the upgrade seems to have nuked some tables from the database, it's impossible to downgrade to 0.25.
I think we might be running into this as well, but it's very hard to debug because everything sometimes works, sometimes doesn't. A ton of errors in tailscaled like open-conn-track: timeout opening (TCP xx => xx) to node [xx]; online=yes, lastRecv=2h19m53s.
I encountered the same issue with headscale 0.26.1. After updating the machine's tag, the new rules only take effect on the client after restarting headscale.
maybe this is also a related problem here, running v0.26.1:
we now (need to) use the group ACL feature after the authentication-flow rework in v0.25.0
only OIDC users are in our groups, there are no local users anywhere in the ACL
we sometimes see similar issues with ACLs not applying to users:
it’s somewhat random whether a group membership, and any connected ACL, gets populated on node connection into the compiled filter / matchers
checked via the policy-manager debug endpoint:
curl -s "http://172.16.0.53:9090/debug/policy-manager?debugkey=xxxxxxx"
we've been debugging this for 2 days now, but no clear pattern emerges with node last-connected, user logins, or node-key validity; nothing seems to correlate at the moment.
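For anyone repeating the check above across several users, here is a minimal sketch that fetches the same debug endpoint and reports which expected names do not appear in the response. The endpoint path and the debugkey parameter are taken from the curl command above. Everything else is an assumption: the helper names are hypothetical, the response format is not documented in this thread, and a plain substring search is only a crude heuristic.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_policy_debug(base_url: str, debugkey: str, timeout: float = 10.0) -> str:
    """Fetch the raw policy-manager debug output as text (hypothetical helper)."""
    url = f"{base_url}/debug/policy-manager?{urlencode({'debugkey': debugkey})}"
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def missing_names(debug_text: str, expected: list[str]) -> list[str]:
    """Return the expected names that do not occur anywhere in the debug text.

    Assumption: a user or group that was populated into the compiled policy
    shows up somewhere in the output under the given name.
    """
    return [name for name in expected if name not in debug_text]
```

Usage would look like `missing_names(fetch_policy_debug("http://172.16.0.53:9090", "xxxxxxx"), ["prod@", "maprambo@"])`, run once after each node reconnect to see whether the result changes.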
Found a correlation for my problem: If any (forced?) tag is present on a node, the group policies from the node’s user will not be populated on this node.
=> Don’t mix groups and tags, for now?
Can confirm what @aritas1 noted. That behavior is in line with the Tailscale docs, see here for tagging (https://tailscale.com/kb/1068/tags#use-cases). Applying a tag to a device removes any user-based authentication, so if you tag the device you are connecting from, the ACLs won't see the user tied to the device, and any group policy that includes the user is effectively useless for that user on that device. As @rainbend noted, once you remove the tags you will need to restart your headscale instance. I was able to get user/group-based ACLs working properly after removing tags from the device I was connecting from. It doesn't work exactly how you would expect with tagging, but that's a Tailscale issue, not a headscale one.
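To make the effect described above concrete, here is a toy model of source matching in which a tagged node loses its user identity. This is only an illustration of the behavior described in this comment, not headscale's or Tailscale's actual code; the Node class and both functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    user: str  # owning user as written in the policy, e.g. "prod@"
    tags: set[str] = field(default_factory=set)


def identities(node: Node) -> set[str]:
    """Identities a policy source can match for this node (toy model)."""
    # Assumption taken from the comment above: once a node is tagged,
    # the tags replace the user identity instead of adding to it.
    if node.tags:
        return set(node.tags)
    return {node.user}


def matches_src(node: Node, src: list[str], groups: dict[str, list[str]]) -> bool:
    """Return True if any entry of a rule's src list matches the node."""
    ids = identities(node)
    for entry in src:
        if entry.startswith("group:"):
            # A group matches only through one of its member users.
            if ids & set(groups.get(entry, [])):
                return True
        elif entry in ids:
            return True
    return False
```

With groups = {"group:server": ["prod@"]}, an untagged node owned by prod@ matches a src of ["group:server"], while the same node carrying tag:benutzer-router matches only when that tag itself is listed in src.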
So I am experiencing issues seemingly related to this for anything after v0.23.0, the question being, why did it work in v0.23.0 without issues? What was broken about tags that got "fixed"?
The reason why I liked tags:
I could create tags and set them as the source in my ACLs, assign the tags to various client nodes, and dictate their remote access, all without modifying the ACLs going forward, through Headscale Admin. This was before the UI had an ACL editor....
For example: if I wanted to give internet access, just assign the tag internet to a node, etc, etc.
I realize the way in which I was using them was likely against Tailscale best practices and now I am paying the price for it BUT for my use case it JUST MADE SENSE to me. Sadly this is deterring me from moving on, while sacrificing potential security improvements.
I just want to add one final note on behavior...
In v0.23.0, adding a tag to a node through the headscale admin UI did not result in the node also advertising said tag. When applying a tag on v0.23.0 I would usually have to restart the client for it to pick up access, but removing the tag would immediately revoke access.
Moving forward to v0.24.0, I am noticing that adding a tag also causes the node to advertise that tag in the Headscale Admin UI, that the node does not immediately pick up access when the client is restarted, and that removing a tag does not immediately revoke access.
As noted the tags either lag behind or do not take effect until the control server is restarted. I'm doing this through the web UI just in case that matters, which it very well may. The behavior is being displayed on OIDC and preauth key registered nodes.
This STILL seems like a bit of a regression. I'm screaming at the screen, it ain't a bug it's a feature! 😂 If it's impossible to function in this manner while staying in line with the official Tailscale implementation, I completely understand, but I AM HOPING for this to be fixed. Has anyone played with the official implementation of Tailscale in this manner? What is the behavior when adding and removing tags from nodes when ACLs are written with tags defined as the source?
HERE IS A SAMPLE OF MY ACLs on v0.23.0:
"groups": {
"group:netlabwork": [
""
],
"group:admin": [
""
],
"group:exit": [
""
],
"group:unrestricted": [
""
],
"group:atakcot": [],
"group:dns": []
},
"tagOwners": {
"tag:internet": [
"group:exit"
],
"tag:unrestricted": [
"tag:unrestricted"
],
"tag:nlw": [
"group:netlabwork"
],
"tag:atakcot": [
"group:atakcot"
],
"tag:dns": [
"group:dns"
]
},
"hosts": {
"netlabwork-lan": "10.0.0.0/24",
"atak": "10.0.0.108/32",
"win25-service": "10.0.0.16/32",
"router01": "10.0.0.1/32"
},
"acls": [
{
"action": "accept",
"src": [
"tag:unrestricted"
],
"dst": [
"*:*"
]
},
{
"action": "accept",
"src": [
"tag:nlw"
],
"dst": [
"netlabwork-lan:*"
]
},
{
"action": "accept",
"src": [
"tag:nlw"
],
"dst": [
"tag:nlw:*"
]
},
{
"action": "accept",
"src": [
"tag:internet"
],
"dst": [
"autogroup:internet:*"
]
},
{
"action": "accept",
"proto": "tcp",
"src": [
"tag:atakcot"
],
"dst": [
"atak:8087"
]
},
{
"action": "accept",
"src": [
"tag:dns"
],
"dst": [
"win25-service:53"
],
"proto": "udp"
},
{
"action": "accept",
"proto": "udp",
"src": [
"tag:dns"
],
"dst": [
"router01:53"
]
}
],
"ssh": []
}
Found a correlation for my problem: If any (forced?) tag is present on a node, the group policies from the node’s user will not be populated on this node.
=> Don’t mix groups and tags, for now?
Banged my head for a few days until I discovered this. Thanks!
This helped, thanks. Replacing group:admin entries with the list of individual username@ entries, e.g. within a src statement in the ssh list of the policy (file), restored access to the nodes.
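For larger policies, the workaround above could be automated roughly like this. It is a minimal sketch under several assumptions: the policy has already been parsed into a plain dict (HuJSON comments and trailing commas must be stripped first), groups contain only usernames and are not nested, and only the src lists of the acls and ssh sections are rewritten. The function name expand_groups is hypothetical.

```python
import copy


def expand_groups(policy: dict) -> dict:
    """Return a copy of the policy with group:* entries in src lists
    replaced by the group's individual members."""
    groups = policy.get("groups", {})
    out = copy.deepcopy(policy)  # leave the caller's policy untouched
    for section in ("acls", "ssh"):
        for rule in out.get(section, []):
            if "src" not in rule:
                continue
            expanded: list[str] = []
            for entry in rule["src"]:
                if entry.startswith("group:"):
                    # Unknown groups are kept as-is rather than dropped.
                    members = groups.get(entry, [entry])
                else:
                    members = [entry]
                for member in members:
                    if member not in expanded:  # de-duplicate, keep order
                        expanded.append(member)
            rule["src"] = expanded
    return out
```

The groups section itself is left in place, so tagOwners entries that refer to a group keep working.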
Here all our users come from OIDC, thus this behaviour does not seem to be restricted to what is called local users in this issue.
There are multiple other side-effects involved here. Listing related issues to make the reference explicit:
@rainbend After updating the machine's tag, the new rules only take effect on the client after restarting headscale.
- #2375
- #2389
@aritas1 If any (forced?) tag is present on a node, the group policies from the node’s user will not be populated on this node.
- #1369
More on these directions on #2417
In #2674 we have also seen a regression based on #2411 and #2651 where OIDC users weren't properly mapped/migrated, leading to duplicate users (unfiled) and thus duplicate nodes. The duplicate users would break selection in the policy, since none of the two would resolve, leading to empty principals under consideration of the other issues above.
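A quick way to check whether an installation is affected by the duplicate users mentioned above is to look for names that occur more than once in the user list. The sketch below assumes the users have been exported as a list of dicts with "id" and "name" keys; those field names and the function name are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
from collections import defaultdict


def find_duplicate_users(users: list[dict]) -> dict[str, list[int]]:
    """Map each name that occurs more than once to the IDs sharing it."""
    ids_by_name: dict[str, list[int]] = defaultdict(list)
    for user in users:
        ids_by_name[user["name"]].append(user["id"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}
```

An empty result means no two users share a name under exact string comparison.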