headscale ✨ feat: implements APIs for managing ACLs

Description

Headscale currently lacks the APIs to manage the ACLs. The only way possible is presently to load the ACLs via file and changes to the policy require reloading the headscale process. This also makes it difficult to integrate your headscale via APIs with no ACL management.

This commit introduces two APIs allowing you to set the policy.

APIs

Set ACL Policy

PUT /api/v1/policy

Payload (Not JSON, so convert to string before invoking the API.)

{"policy": "{// This is a comment\n\"groups\":{\"g1\":[\"u1\",\"u2\"]},\"acls\":[]}"}

Get ACL Policy

GET /api/v1/policy

Response

{"policy": {"groups": {}, "tags": {}}}

CLI

→ go run cmd/headscale/headscale.go --config ./config.yaml policy --help
Manage the Headscale ACL Policy

Usage:
  headscale policy [command]

Available Commands:
  get         Print the current ACL Policy
  set         Updates the ACL Policy

Flags:
  -h, --help   help for policy

Global Flags:
  -c, --config string   config file (default is /etc/headscale/config.yaml)
      --force           Disable prompts and forces the execution
  -o, --output string   Output format. Empty for human-readable, 'json', 'json-line' or 'yaml'

Use "headscale policy [command] --help" for more information about a command.

Testing

https://github.com/juanfont/headscale/assets/5305744/e3f68195-c78d-4cf3-8483-fb1418251137

[x] read the CONTRIBUTING guidelines
[x] raised a GitHub issue or discussed it on the projects chat beforehand
[x] added unit tests
[ ] added integration tests
[x] updated documentation if needed
[ ] updated CHANGELOG.md

Resolves

https://github.com/juanfont/headscale/issues/582

Feb 24 '24 11:02 pallabpain

@evenh @kradalby Please take a look. 🙇🏼

Mar 05 '24 18:03 pallabpain

I might have some free time this evening to do a review

Mar 06 '24 08:03 evenh

I've attached a screen recording to the PR in case someone wishes to see how the APIs work.

Mar 06 '24 20:03 pallabpain

Overall it looks good - I've left some inline comments. One open question is whether to support reading the current ACL via the new APIs/CLI commands even if the policy is backed by a file (for convenience)?

Enabled it. 👍🏼 Now it will work for both modes.

Mar 07 '24 06:03 pallabpain

I think this looks reasonable, some comments and questions and to larger things:

I am tempted to use this opportunity to change the general phrasing from ACL to Policy through out the code, particularly since we are changing the config, adding commands etc.

I think the base format for the policy should be HuJSON, it is a bit more flexible format that allows preserving comments. This probably means using string/byte for the database and not the JSON type, but we dont really use it (or will use SQL to look up in the JSON) so that should not be a problem. As part of this, I am willing to discuss dropping the YAML support to just streamline what we support.

Thanks for the review. This sounds like a good idea. I wrote this implementation to quickly demonstrate the APIs and commands that we can use as a reference to discuss the feature further.

Keeping the base format as HuJSON should be the way to proceed and we can store it as a bytearray or string in the db since we don't intend to query it any further. Although I'm not entirely sure how the API request and response will look over RPC since I do not want to model the policy structure in the proto definition.

I am also okay with keeping the default mode as file.

Let me update this PR with the suggestions and put it our for a second round of review.

Mar 07 '24 12:03 pallabpain

Hi @pallabpain. Have you had a chance to update the PR yet?

Mar 22 '24 10:03 evenh

Hi @pallabpain. Have you had a chance to update the PR yet?

I haven't had the chance to update the PR, yet. I have some untested changes locally that I should be able to work on this weekend. I may change the request-response structure based on the newer comments.

Mar 27 '24 06:03 pallabpain

Maybe, we should create 2 new tables to store groups and tags. And sometimes, we just want to update groups or tags. So /api/v1/acl/groups, /api/v1/acl/tags and /api/v1/acl/all may be more useful.

I'm thinking about how to update ACL groups automatically, e.g. AD/LDAP or extract from OIDC (OIDC may be a problem because it can only get groups when users authenticate)

May 09 '24 22:05 IamTaoChen

@pallabpain there has been quite some changes, do you think that you might have time to pick this up again?

May 24 '24 09:05 kradalby

@pallabpain there has been quite some changes, do you think that you might have time to pick this up again?

Sure. I'll review the recent changes and update the pull request. I'll try and get a working version ready soon.

May 24 '24 09:05 pallabpain

Hi, all this pull request is still active? I ask because I plan to create ui editor in package https://github.com/tale/headplane for update acl, and that api endpoint will be great opportunity.

May 30 '24 18:05 gawsoftpl

Re this

I'm open for the example config to be db, but the default should still be file.

Personally I strongly disagree.

Many people still use docker. (I know I know, it's not officially supported... still!) Manipulating files inside it, especially if the container is not running is a very hard task.

IMHO it is more secure to store ACL data inside a db. A separate file can be easier hacked / modified / viewed. After all, ACL is the "main security" config of HeadScale, telling who has access to what ! Please keep in mind: The main reason for using VPNs is security. Offering an easy-to-view / edit file to the open is against it.

Eventually more people will use this feature using a web-UI, so it would make much easier to setup with that already.

For those who like to do things manually (by typing text files, etc) -> changing the config to file will not be a big task to do, but for those who are not used to do tasks like this: it would be an "impossible nightmare".

Anyway: is the database password protected by default, so only HS can access it?

or can other programs access it too?

As someone who develops a web UI for Headscale I don't think that the default should be db mode, atleast not yet. I don't understand the security standpoint here, if someone is able to get into a machine where the policy file is stored they can view your config which will contain much more sensitive information (ie. database credentials, secret keys, OIDC credentials, etc).

On the web UI end, I'd expect API requests to the ACL endpoints to just return status 403 or something if file mode is enabled. It's very easy to handle the status code and inform users on a web UI that they need to change their config.

May 31 '24 19:05 tale

Maybe, we should create 2 new tables to store groups and tags. And sometimes, we just want to update groups or tags. So /api/v1/acl/groups, /api/v1/acl/tags and /api/v1/acl/all may be more useful.

I'm thinking about how to update ACL groups automatically, e.g. AD/LDAP or extract from OIDC (OIDC may be a problem because it can only get groups when users authenticate)

Sounds like a good idea to implement. However, if we get and set the entire policy file contents, it keeps things simple instead of exposing APIs to do specific things on the policy. This shall also keep headscale in-line with the Tailscale ACL APIs and someone migrating from headscale to Tailscale SaaS can do it without making a lot of changes.

Jun 02 '24 06:06 pallabpain

I would prefer keeping it simple and doing the whole thing now. Further extensions can be discussed later.

Jun 02 '24 06:06 kradalby

I think this looks reasonable, some comments and questions and to larger things:

I am tempted to use this opportunity to change the general phrasing from ACL to Policy through out the code, particularly since we are changing the config, adding commands etc.

I think the base format for the policy should be HuJSON, it is a bit more flexible format that allows preserving comments. This probably means using string/byte for the database and not the JSON type, but we dont really use it (or will use SQL to look up in the JSON) so that should not be a problem. As part of this, I am willing to discuss dropping the YAML support to just streamline what we support.

@kradalby Revised the PR. I've tried to keep the implementation simple in the first iteration. While I've updated the naming convention from ACL to Policy in some places, I've left some code untouched as of now. The default format will be HuJSON and the API will also support the same.

Please take a look.

Jun 02 '24 18:06 pallabpain

I think this is more or less there, a couple of small things.

I would also like to see a CLI integration test to validate the roundtrip:

Load new ACL

Read it out and compare

Check that changes have a applied (if possible since we dont have hotreloading)

Thanks for the review. I'll implement the suggestions and write the tests.

Jun 03 '24 12:06 pallabpain

Great, I think it would be great to try to get this into 0.23.0, I am looking to release a beta soonish so let me know if looking at this work before that is feasible, if not, there is no problem pushing it for later.

Jun 18 '24 08:06 kradalby

Great, I think it would be great to try to get this into 0.23.0, I am looking to release a beta soonish so let me know if looking at this work before that is feasible, if not, there is no problem pushing it for later.

@kradalby I've addressed the comments. I should be able to write the tests this week. But when is v0.23.0 planned for? Is there a due date or a cut-off date that you're looking at?

Please take a look at this comment as well: https://github.com/juanfont/headscale/pull/1792#discussion_r1649686735

Jun 22 '24 12:06 pallabpain

Sounds good, no hard cutoff, I'm on holiday for two weeks, so staying a bit offline, if you make the test I'll pop in and review and we can do an alpha.

Jun 22 '24 14:06 kradalby

I'm back, and I would love to cut a beta this week, do you think it would be feasible to push some tests before midweek~ so we have time for a review cycle? I'll do a re-review of the code now.

Jul 08 '24 11:07 kradalby

No problem, I'm available to take a stab at the integration tests now if you have not started. Maybe I could get it in shape to merge today.

Jul 12 '24 10:07 kradalby

@pallabpain I ended up working on another bug today, let me know if you will not have time to look at the integration tests and I'll have a look next week.

Jul 12 '24 15:07 kradalby

@pallabpain I ended up working on another bug today, let me know if you will not have time to look at the integration tests and I'll have a look next week.

I managed to find some time to look into the integration tests, tonight. While I understood how I'd go about implementing the tests, I may have to spend some time to figure out the right way to configure the headscale instance with the right policy mode. 😅

Jul 12 '24 17:07 pallabpain

Sounds good, have a look at the hsic.With options, there you can configure things, use environment variables instead of file.

Jul 12 '24 17:07 kradalby

Sounds good, have a look at the hsic.With options, there you can configure things, use environment variables instead of file.

@kradalby I have implemented a simple test that sets a policy, reads it back and then verifies the content. I had to expose WriteFile via the ControlServer interface to create a file for the headscale policy set --file FILE command.

Please take a look.

Jul 12 '24 19:07 pallabpain

I think that looks great, what would be great is a test in the ACLs that do something like:

Start with two nodes and no ACL, using database mode
have them ping eachother
apply a ACL with the command that makes them unable to connect
verify they can no longer connect

I can have a look at that tomorrow if you do not have time to take a stab at it, let me know.

Jul 14 '24 07:07 kradalby

I've added the test as described, but I found a bug I dont know if is new or old,

when a new policy is set, the nodes reload, but one of the nodes gets a packet filter that is wide open ":", the node belonging to user1, which should get an empty filter. I will continue investigating.

Jul 15 '24 10:07 kradalby

Ok, I found the bug, it likely hasnt been discovered since most people never reloaded ACLs without restarting the server.

If you set the packetfilter in tailcfg to an empty slice, it will marshall to nil, and that will cause it to be ignored, so the client will continue to use the previous filter: https://github.com/tailscale/tailscale/blob/main/tailcfg/tailcfg.go#L1863-L1865

I wonder if there is a way to get around this without copy and maintain a copy of the whole tailcfg 🤔 .

Jul 15 '24 12:07 kradalby

Ok, I found the bug, it likely hasnt been discovered since most people never reloaded ACLs without restarting the server.

If you set the packetfilter in tailcfg to an empty slice, it will marshall to nil, and that will cause it to be ignored, so the client will continue to use the previous filter: https://github.com/tailscale/tailscale/blob/main/tailcfg/tailcfg.go#L1863-L1865

I wonder if there is a way to get around this without copy and maintain a copy of the whole tailcfg 🤔 .

Thanks for adding the tests and resolving the issue. I can clean up the commits once the tests pass.

Jul 16 '24 08:07 pallabpain

Thanks for adding the tests and resolving the issue. I can clean up the commits once the tests pass.

Thank you, I think the last commit should resolve it, not sure why, but we can leave it for now.

Jul 16 '24 09:07 kradalby

headscale headscale copied to clipboard

✨ feat: implements APIs for managing ACLs

Description

APIs

Set ACL Policy

Get ACL Policy

CLI

Testing

Resolves

headscale
headscale copied to clipboard