task
task copied to clipboard
Add Landlock sandboxing support
Inspired by https://justine.lol/make I attempted to add basic Landlock support to Task. This will obviously only work on Linux and requires a decently recent kernel. This PR should serve as a ground for discussion if this features is something we might want and if my initial attempt is worth extending or if a different route of implementation should be chosen.
this is interesting 🤔
Hi @MDr164,
Sorry for not responding earlier.
I'm a bit out-of-the-loop here. Can you explain what is Landlock and why would be useful to have this integrated into Task?
Is this a tool that needs to be installed separately and it's builtin into the Linux kernel?
Landlock is a Linux kernel LSM where you can find the documentation for here: https://www.kernel.org/doc/html/latest/userspace-api/landlock.html
It allows a userspace process like Task to restrict itself, more details to that are outlined in the blog entry I linked in the PR text. It does not need an additional tool, it simply needs to be enable in the kernel which is the case on most distributions I tested. Though it is Linux specific and apparently needs CGO to compile against a kernel library. This PR would need refactoring for it to be able to go mainline as right now it would break on non-Linux systems.
Unfortunately requiring CGO is not ideal, indeed.
Unfortunately requiring CGO is not ideal, indeed.
Looks like it can also be used in pure-go mode but is then not as powerful as it uses psx from libcap. To quote the official documentation:
For pure Go programs, psx does the same as the syscall.AllThreadsSyscall function in the Go runtime (and that case is straightforward to understand). For programs linked with cgo, there can be more OS threads than just the ones that were started by the Go runtime.
I rebased the PR on master and added a build constraint so that the sandboxing will throw an error when attempted to be used outside of Linux. I'm not sure what the best way would be to expose this capability to users yet so I'll keep this PR as a draft until we can decide on that. A good thing to note: Even if the sandbox is turned on but the OS doesn't support it, we will simply not enable the sandbox instead of throwing an error. This could potentially promote the sandbox to be enabled all the time but only be active when enabled in the Kernel.
Any chance this could get a review (be it positive or negative)? This change allows Task to obtain the same amount of sandboxing as build systems like Bazel while maintaining performance. The conditional compilation makes it possible to only include this code on Linux builds. We can also think about adding sandboxing to other platforms though I'm most familiar with Linux. I'm using these patches sind quite some time now and would love not having to always rebase them against master :smiley:
@MDr164 I've tried to look into this a couple of times since you created the PR. Perhaps I can outline my current understanding and you can correct me if I'm wrong or missing something?
- This functionality essentially limits
task
's ability to access the global filesystem. Rather, it only grants access to a specific set of files during execution. i.e. A sandbox. - You are setting this list of allowed files to a task's defined
sources
andgenerates
lists (plus a couple of other standard directories like/tmp
and.task
.
My questions are:
- What is the aim of this functionality? Or rather, who/what is it intended to protect against? I can only assume that since the feature is behind a flag, that it is to protect the user from themselves rather than a full security feature since I could just add random files to my sources/generates or disable the flag and access any file. I think it would help me to understand this better if you could explain why you need this functionality in your project. What requirement are you fulfilling?
- We definitely can't enable cgo for functionality that effects such a small percentage of users. Task runs on a wide variety of systems and being able to run without system dependencies such as
glibc
ormusl
is important. I know that I've used Task inside an alpine image in CI before for example. I don't think we can change this. Is the lack of cgo a deal-breaker or would you be happy with this merged with cgo disabled? - What happens when a task depends on or calls another task that accesses different files? At the moment you only appear to be passing the
sources
andgenerates
of the called Task, but not any "child" tasks. - If I check out your branch and try to run something like
task --sandbox mod
on this repo, I get apermission denied
. I assume this is because I'm accessing/usr/lib/go/bin/go
. What is your proposal for allowing the user to call commands like this? Thego
binary is neither a source or a generated file, so adding it to one of those lists feels wrong.
Thanks for giving it a try!
@MDr164 I've tried to look into this a couple of times since you created the PR. Perhaps I can outline my current understanding and you can correct me if I'm wrong or missing something?
- This functionality essentially limits
task
's ability to access the global filesystem. Rather, it only grants access to a specific set of files during execution. i.e. A sandbox.
Correct.
- You are setting this list of allowed files to a task's defined
sources
andgenerates
lists (plus a couple of other standard directories like/tmp
and.task
.
Correct, we could also add a different global directive like allow-list to set the scope of the sandbox.
My questions are:
- What is the aim of this functionality? Or rather, who/what is it intended to protect against? I can only assume that since the feature is behind a flag, that it is to protect the user from themselves rather than a full security feature since I could just add random files to my sources/generates or disable the flag and access any file. I think it would help me to understand this better if you could explain why you need this functionality in your project. What requirement are you fulfilling?
I set it behind a flag for now as I didn't want users to wonder why they're suddenly getting access denied. The scope is fairly easy: If you wanna run an unknown task or have a rouge unit test that gets executed you're protecting your rootfs from changes. It's a lot easier to quickly check the scope of the sandbox by looking into the sources/generates instead of trying to understand giant Taskfiles with shell voodoo or look into every test/programm that gets executed. So yes, it kinda protects the user from the user running unknown, likely 3rdparty scripts/code.
- We definitely can't enable cgo for functionality that effects such a small percentage of users. Task runs on a wide variety of systems and being able to run without system dependencies such as
glibc
ormusl
is important. I know that I've used Task inside an alpine image in CI before for example. I don't think we can change this. Is the lack of cgo a deal-breaker or would you be happy with this merged with cgo disabled?
No problem at all. When CGO is disabled it falls back to the (albeit slower) pure Go implementation of the kernel interfaces. So that got resolved already :)
- What happens when a task depends on or calls another task that accesses different files? At the moment you only appear to be passing the
sources
andgenerates
of the called Task, but not any "child" tasks.
Those "child" tasks would inherit the scope of the sandbox from the parent. But you're right, I think we would need more robust parsing of the allowed paths from included tasks, I'll tackle that next!
- If I check out your branch and try to run something like
task --sandbox mod
on this repo, I get apermission denied
. I assume this is because I'm accessing/usr/lib/go/bin/go
. What is your proposal for allowing the user to call commands like this? Thego
binary is neither a source or a generated file, so adding it to one of those lists feels wrong.
Yes I was also planning on finding a better way to declare the scope, any preference? I don't want the user to invest too much effort when writing the sandbox rules but I also don't want to overblock paths of course.
Thanks for giving it a try!
No problem, I think I'm on the same page now. I'm generally pretty positive towards this and don't have any issues with the code fundamentals. I think it just needs a bit of polish now.
We could also add a different global directive like allow-list to set the scope of the sandbox
Since the standard terminology for this is "scoped access-control", I would suggest that something like scopes
would make sense as a keyword to define which files Task should have access to. By no means do I have final say on naming, but that's just my suggestion 😛 Short, concise and doesn't conflict with anything else.
I set it behind a flag for now as I didn't want users to wonder why they're suddenly getting access denied
Yes, this makes a lot of sense, we definitely can't enable this by default as it would break many (probably most?) Taskfiles. This is something we could perhaps revisit on a major version change, but I think this would be much more compelling if we had x-platform support. We'd also have to consider if this makes writing basic/personal configurations annoying.
Maybe another approach would be creating a TASK_SANDBOX
environment variable (or similar) that users can set in their shell environment. This would mean they don't have to add --sandbox
every time. We could even go further and add something like --no-sandbox
or --unsafe
which disables landlock if they've verified/trust the task.
It's a lot easier to quickly check the scope of the sandbox by looking into the sources/generates instead of trying to understand giant Taskfiles with shell voodoo or look into every test/programm that gets executed
This is the selling point for me. It lowers the bar to understand the implications of running a complex task.
When CGO is disabled it falls back to the (albeit slower) pure Go implementation
👍
We would need more robust parsing of the allowed paths from included tasks, I'll tackle that next!
This is probably the biggest blocker. This feature has no value if someone can hide their malicious/dangerous code in a dependant task. Feel free to ping me if/when you get around to fixing this or if you need any help.
Yes I was also planning on finding a better way to declare the scope, any preference?
I mentioned scopes
earlier. I think something like this would be nice :)
tasks:
default:
deps:
- mydeptask
sources:
- ./my/sources/**/*
generates:
- ./my/generated/files/**/*
scopes:
- /path/to/binfoo
- path: binbar
cmds:
- task: mychildtask
- /path/to/binfoo
- binbar
It also occurred to me after my last comment that user A might have Go installed in /usr/lib/go/bin/go
and user B might have it installed in /usr/local/go/bin/go
depending on distro and/or installation method. This is why I added the subkey path
under scopes
. This would do the equivalent of a which binbar
. i.e. It does a lookup for the binary in your $PATH
and adds the first result to the scopes.
Also, note the subtask and dependant task. The scopes of which would need to be merged into the parent.
@andreynering It would be good to get your very quick thoughts on all of this too. It would be a shame to spend any more time on the PR if you're unsure about adding this functionality.
My two cents on this...
This PR has only two up votes (👍) today, while many other issues have a lot more than this.
This makes me think that few users are interested in this at the moment, compared to other priorities.
Because of this, I'm inclined to categorize this feature request as a "maybe in the future".
If in the future we see more upvotes here, and hopefully also comments explaining why they think this is useful, we can reconsider.
Also, this would give us time to think better about the open questions: "what to do for OSes that don't support this", etc.