AdGuardHome icon indicating copy to clipboard operation
AdGuardHome copied to clipboard

Fix #4714: correctly check thread can bind to privileged ports

Open yscialom opened this issue 2 years ago • 11 comments

Context

In order for a non-root user to run AdGuardHome and allow it to bind to privileged ports (e.g. DNS tcp/53), linux offers a set of features collectively called capabilities. It can be simply described1 but is hard to master. And I don't pretend I do.

Issue

As detailed in issues/4714, AduardHome starts by checking it has permission to bind to privileged ports. Unfortunately, the way that check is done makes it return a false negative in some conditions.

Change

This PR fixes this test, and allow non-root users to run AdGuardHome in a docker container.

Details

How it was

In the modified file, one can find the function definition of canBindPrivilegedPorts(). It delegues the check to unix.PrctlRetInt, a simple bind to Linux' prctl(). The code used to search the capability CAP_NET_BIND_SERVICE in the ambient set (PR_CAP_AMBIENT_IS_SET).

But... even though the thread has this capability, CAP_NET_BIND_SERVICE has not (yet) been raised into its ambiant set, and the test fails leading to a premature exit of the program.

How it will be

The fix is rather simple: we raise the capability to the ambient set.

If the call succeeds, the thread has indeed the privilege to bind to those ports. If it fails, it doesn't. If we happen by some spaghettisation to call multiple times canBindPrivilegedPorts(), the way capabilities and sets of capabilities have been implemented guaranties nothing happens.

Alternatives

"Belt and Braces"

An alternative would be to both raise then check:

func allowBindPrivilegedPorts() (can bool, err error) {
	cnbs, err := unix.PrctlRetInt(
		unix.PR_CAP_AMBIENT,
		unix.PR_CAP_AMBIENT_RAISE,
		unix.CAP_NET_BIND_SERVICE,
		0,
		0,
	)
	// Don't check the error because it's always nil on Linux.
	adm, _ := aghos.HaveAdminRights()

	return cnbs == 1 || adm, err
}

func canBindPrivilegedPorts() (can bool, err error) {
	cnbs, err := unix.PrctlRetInt(
		unix.PR_CAP_AMBIENT,
		unix.PR_CAP_AMBIENT_IS_SET,
		unix.CAP_NET_BIND_SERVICE,
		0,
		0,
	)
	// Don't check the error because it's always nil on Linux.
	adm, _ := aghos.HaveAdminRights()

	return cnbs == 1 || adm, err
}

IMHO, this brings nothing of value. Let me know if you disagree.

"Can I bind to 53? Let's find out!"

A simpler alternative would be to drop completly the idea to ask if we can, but simply do and see what comes.

It could be done pre-emptively:

func canBindPrivilegedPorts() (can bool, err error) {
    // bind to port 53
    // unbind
    return bound_ret, err
}

As I'm not a GO developer, I couldn't implement this alternative. If you'd prefer it, please express it by providing a PR.


1 man capabilities

For the purpose of performing permission checks, traditional UNIX implementations distinguish two categories of processes: privileged processes (whose effective user ID is 0, referred to as superuser or root), and unprivileged processes (whose effective UID is nonzero). Privileged processes bypass all kernel permission checks, while unprivileged processes are subject to full permission checking based on the process's credentials (usually: effective UID, effective GID, and supplementary group list).

Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.

yscialom avatar Jul 09 '22 15:07 yscialom

@ainar-g & @EugeneOne1 Code review please?

yscialom avatar Aug 01 '22 12:08 yscialom

@yscialom, thanks for the contribution and sorry for a late response. The PR is seemingly ok, but we'd like to extensively test it on different devices before merging.

BTW, your code introduces not a lot of changes and is fully compliant with our guildelines. Here they are in case you interested.

EugeneOne1 avatar Aug 08 '22 17:08 EugeneOne1

@yscialom, hello again. A couple of thigs to clarify about the implementation:

  1. Does it fix issue 4714 for your setup?
  2. Does it work with no capability manipulations at all?

The PR is really good at stating the problem but it seems our current implementation requires a different approach. We suggest:

  1. Put raising the permissions into a separate function in the same file internal/aghnet/net_linux.go. It should return the error if the capabilities couldn't be raised. For the rest OSes it should always return nil. Wrap it with exported function (e.g. AcquirePermissions) in some OS-independent file (internal/aghnet/net.go would be right). This is a widespread pattern of wrapping the OS-dependent code, check the aghnet.CanBindPrivilegedPorts for an example.
  2. Call the new exported function in the home.checkPermissions just above the aghnet.CanBindPrivilegedPorts call. If it fails, the error should be printed to the Info log and be followed by the aghnet.CanBindPrivilegedPorts. The success makes calling the aghnet.CanBindPrivilegedPorts redundant.

Also, I believe the check for having the net_bind_service capability should be rewritten to:

res, err := unix.PrctlRetInt(
	unix.PR_CAPBSET_READ,
	unix.CAP_NET_BIND_SERVICE,
	0,
	0,
	0,
)

What do you think about the solution?

EugeneOne1 avatar Aug 09 '22 16:08 EugeneOne1

Hello @EugeneOne1, sorry for the delay, I'm coming back from holidays.

Does it fix issue 4714 for your setup?

Yes it does.  

Does it work with no capability manipulations at all?

Shame on me, I didn't even try. Though I suppose it does, with high level of confidence. Unfortunatly, I'm not in a position where I can build AGH right now, unless you can provide me with a docker image to do so.  

What do you think about the solution?

I agree it is a net (pun intended) improvement.  

Regarding the two improvements you suggested on the code structural organisation, I cannot give any advice on them: I'm not a Go developper and I'm unfamiliar with your codebase.

With that being said, how can I help you going forward?

Here is my endgoal: a PR to allow the default docker image of AGH to be runnable as a non-root user by simply defining the environment variable PUID (à la linuxserver.io).

yscialom avatar Aug 23 '22 08:08 yscialom

@EugeneOne1 Hello, what is the status of this PR? I understand your suggestions but also I am unable to implement them. Would you rather settle for a suboptimal & working solution or for a perfect one only you could code?

yscialom avatar Nov 14 '22 12:11 yscialom

@yscialom, hello and sorry for the delay. The developers team is unfortunately busy for the moment, but we're keeping this PR opened to merge it in future with our suggestions reconsidered and implemented. We'll mention you here and in the release notes, when it'll be merged. Thank you for the contribution.

EugeneOne1 avatar Nov 14 '22 13:11 EugeneOne1

No worries. I understand this is not high priority, and to be honnest I'm happy to see it kept for later merge. Keep the good work there!

yscialom avatar Nov 14 '22 13:11 yscialom

Any plan to merge this fix soon?

abnayak avatar Jan 15 '23 00:01 abnayak

@yscialom, hello and apologies for late response. We've pushed the modified version of this PR to the separate branch. Could you please try to build AdGuard Home from the latest commit (3918789ca72dae146064c0668983763b33abe10f) and tell if it works in the described scenario? If it doesn't, we'd also like to check out verbose logs.

By the way, which purpose does entrypoint.sh file serve in the referenced PR?

EugeneOne1 avatar Feb 06 '23 13:02 EugeneOne1