AdGuardHome
AdGuardHome copied to clipboard
Fix #4714: correctly check thread can bind to privileged ports
Context
In order for a non-root user to run AdGuardHome and allow it to bind to privileged ports (e.g. DNS tcp/53), linux offers a set of features collectively called capabilities. It can be simply described1 but is hard to master. And I don't pretend I do.
Issue
As detailed in issues/4714, AduardHome starts by checking it has permission to bind to privileged ports. Unfortunately, the way that check is done makes it return a false negative in some conditions.
Change
This PR fixes this test, and allow non-root users to run AdGuardHome in a docker container.
Details
How it was
In the modified file, one can find the function definition of canBindPrivilegedPorts()
. It delegues the check to unix.PrctlRetInt
, a simple bind to Linux' prctl()
. The code used to search the capability CAP_NET_BIND_SERVICE
in the ambient set (PR_CAP_AMBIENT_IS_SET
).
But... even though the thread has this capability, CAP_NET_BIND_SERVICE
has not (yet) been raised into its ambiant set, and the test fails leading to a premature exit of the program.
How it will be
The fix is rather simple: we raise the capability to the ambient set.
If the call succeeds, the thread has indeed the privilege to bind to those ports. If it fails, it doesn't. If we happen by some spaghettisation to call multiple times canBindPrivilegedPorts()
, the way capabilities and sets of capabilities have been implemented guaranties nothing happens.
Alternatives
"Belt and Braces"
An alternative would be to both raise then check:
func allowBindPrivilegedPorts() (can bool, err error) {
cnbs, err := unix.PrctlRetInt(
unix.PR_CAP_AMBIENT,
unix.PR_CAP_AMBIENT_RAISE,
unix.CAP_NET_BIND_SERVICE,
0,
0,
)
// Don't check the error because it's always nil on Linux.
adm, _ := aghos.HaveAdminRights()
return cnbs == 1 || adm, err
}
func canBindPrivilegedPorts() (can bool, err error) {
cnbs, err := unix.PrctlRetInt(
unix.PR_CAP_AMBIENT,
unix.PR_CAP_AMBIENT_IS_SET,
unix.CAP_NET_BIND_SERVICE,
0,
0,
)
// Don't check the error because it's always nil on Linux.
adm, _ := aghos.HaveAdminRights()
return cnbs == 1 || adm, err
}
IMHO, this brings nothing of value. Let me know if you disagree.
"Can I bind to 53? Let's find out!"
A simpler alternative would be to drop completly the idea to ask if we can, but simply do and see what comes.
It could be done pre-emptively:
func canBindPrivilegedPorts() (can bool, err error) {
// bind to port 53
// unbind
return bound_ret, err
}
As I'm not a GO developer, I couldn't implement this alternative. If you'd prefer it, please express it by providing a PR.
For the purpose of performing permission checks, traditional UNIX implementations distinguish two categories of processes: privileged processes (whose effective user ID is 0, referred to as superuser or root), and unprivileged processes (whose effective UID is nonzero). Privileged processes bypass all kernel permission checks, while unprivileged processes are subject to full permission checking based on the process's credentials (usually: effective UID, effective GID, and supplementary group list).
Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.
@ainar-g & @EugeneOne1 Code review please?
@yscialom, thanks for the contribution and sorry for a late response. The PR is seemingly ok, but we'd like to extensively test it on different devices before merging.
BTW, your code introduces not a lot of changes and is fully compliant with our guildelines. Here they are in case you interested.
@yscialom, hello again. A couple of thigs to clarify about the implementation:
- Does it fix issue 4714 for your setup?
- Does it work with no capability manipulations at all?
The PR is really good at stating the problem but it seems our current implementation requires a different approach. We suggest:
- Put raising the permissions into a separate function in the same file
internal/aghnet/net_linux.go
. It should return the error if the capabilities couldn't be raised. For the rest OSes it should always returnnil
. Wrap it with exported function (e.g.AcquirePermissions
) in some OS-independent file (internal/aghnet/net.go
would be right). This is a widespread pattern of wrapping the OS-dependent code, check theaghnet.CanBindPrivilegedPorts
for an example. - Call the new exported function in the
home.checkPermissions
just above theaghnet.CanBindPrivilegedPorts
call. If it fails, the error should be printed to theInfo
log and be followed by theaghnet.CanBindPrivilegedPorts
. The success makes calling theaghnet.CanBindPrivilegedPorts
redundant.
Also, I believe the check for having the net_bind_service
capability should be rewritten to:
res, err := unix.PrctlRetInt(
unix.PR_CAPBSET_READ,
unix.CAP_NET_BIND_SERVICE,
0,
0,
0,
)
What do you think about the solution?
Hello @EugeneOne1, sorry for the delay, I'm coming back from holidays.
Does it fix issue 4714 for your setup?
Yes it does.
Does it work with no capability manipulations at all?
Shame on me, I didn't even try. Though I suppose it does, with high level of confidence. Unfortunatly, I'm not in a position where I can build AGH right now, unless you can provide me with a docker image to do so.
What do you think about the solution?
I agree it is a net (pun intended) improvement.
Regarding the two improvements you suggested on the code structural organisation, I cannot give any advice on them: I'm not a Go developper and I'm unfamiliar with your codebase.
With that being said, how can I help you going forward?
Here is my endgoal: a PR to allow the default docker image of AGH to be runnable as a non-root user by simply defining the environment variable PUID
(à la linuxserver.io).
@EugeneOne1 Hello, what is the status of this PR? I understand your suggestions but also I am unable to implement them. Would you rather settle for a suboptimal & working solution or for a perfect one only you could code?
@yscialom, hello and sorry for the delay. The developers team is unfortunately busy for the moment, but we're keeping this PR opened to merge it in future with our suggestions reconsidered and implemented. We'll mention you here and in the release notes, when it'll be merged. Thank you for the contribution.
No worries. I understand this is not high priority, and to be honnest I'm happy to see it kept for later merge. Keep the good work there!
Any plan to merge this fix soon?
@yscialom, hello and apologies for late response. We've pushed the modified version of this PR to the separate branch. Could you please try to build AdGuard Home from the latest commit (3918789ca72dae146064c0668983763b33abe10f) and tell if it works in the described scenario? If it doesn't, we'd also like to check out verbose logs.
By the way, which purpose does entrypoint.sh
file serve in the referenced PR?