containers-roadmap

[Fargate] [request]: Provide the ability to use ebpf on fargate instances.

Open KnoxAnderson opened this issue 4 years ago • 17 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request

Provide the ability to leverage eBPF for security and monitoring use cases on Fargate.

Which service(s) is this request for?

ECS or EKS running on Fargate.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

An eBPF program is "attached" to a designated code path in the kernel. When the code path is traversed, any attached eBPF programs are executed. Given its origin, eBPF is especially suited to writing network programs and it's possible to write programs that attach to a network socket to filter traffic, to classify traffic, and to run network classifier actions.

We'd want to attach eBPF programs to the following static tracepoints:

  • System call entry path
  • System call exit path
  • Process context switch
  • Process termination
  • Minor and major page faults
  • Process signal delivery

This allows the collection of:

  • Data associated to a network connection (e.g. TCP/UDP IPv4/IPv6 tuple, UNIX socket names, …).
  • Highly granular metrics about the process (memory counters, page faults, socket queue length, …).
  • Container-specific data, such as the cgroups the process issuing the system call belongs to, as well as the namespaces that process lives in.
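As a rough way to check whether the static tracepoints listed above are exposed in a given environment, the sketch below looks for them under tracefs. The mapping from each listed event to a tracepoint name is an assumption, not something stated in this issue. In particular, `exceptions/page_fault_user` and `exceptions/page_fault_kernel` are x86 tracepoints and do not distinguish minor from major faults. The two tracefs mount points are the usual defaults and may differ or be unreadable inside a container.

```python
import os

# Assumed mapping from the events listed above to kernel tracepoint paths
# (relative to <tracefs>/events). These names are illustrative assumptions.
WANTED = {
    "system call entry": "raw_syscalls/sys_enter",
    "system call exit": "raw_syscalls/sys_exit",
    "process context switch": "sched/sched_switch",
    "process termination": "sched/sched_process_exit",
    "page fault (user)": "exceptions/page_fault_user",      # x86 only
    "page fault (kernel)": "exceptions/page_fault_kernel",  # x86 only
    "signal delivery": "signal/signal_deliver",
}

# Usual tracefs mount points; either may be absent or unreadable.
TRACEFS_CANDIDATES = ("/sys/kernel/tracing", "/sys/kernel/debug/tracing")


def available_tracepoints(tracefs_root):
    """Return {event label: True/False} depending on whether the
    corresponding tracepoint directory exists under tracefs_root/events."""
    events = os.path.join(tracefs_root, "events")
    return {
        label: os.path.isdir(os.path.join(events, tp))
        for label, tp in WANTED.items()
    }


if __name__ == "__main__":
    for root in TRACEFS_CANDIDATES:
        if os.path.isdir(os.path.join(root, "events")):
            for label, present in available_tracepoints(root).items():
                print(f"{label}: {'present' in str(present) or present}")
            break
    else:
        print("tracefs not mounted or not readable")
```

A tracepoint being present only shows that the kernel exposes it. Attaching an eBPF program to it still requires the bpf and perf_event_open syscalls to be permitted.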

Are you currently working around this issue?

We are currently working around this issue by using ptrace, which was exposed in Fargate 1.4, but eBPF would be a more stable, cross-platform approach.

Additional context

Anything else we should know?

Attachments

If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

KnoxAnderson avatar Aug 11 '20 21:08 KnoxAnderson

On our end: we use eBPF to collect performance data, and also for general metrics, observability, and debugging mystery issues that arise. Proper support for eBPF in Fargate would be great!

thomasdullien avatar Aug 30 '20 14:08 thomasdullien

We'd really like to see uprobe attachment supported as well. We instrument processes using eBPF attached to uprobes and we'd love to see this work with Fargate!

bhiggins avatar Sep 16 '20 04:09 bhiggins

I've been asked a number of times to provide tests to show that eBPF observability works in a given environment, including for containers. Here is my smallest set of tests (each testing something different):

bpftrace -e 'BEGIN { printf("hello world\n"); }'
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; @bsize[arg2]++; } kretprobe:vfs_read /@start[tid]/ { @ns[comm] = hist(nsecs - @start[tid]); delete(@start[tid]); }'
bpftrace -e 'ur:/bin/bash:readline { printf("got: %s\n", str(retval)); }'
bpftrace -e 'profile:hz:9 { @[kstack, ustack] = count(); }'

For more tests, I'd try the tools in https://github.com/iovisor/bcc and https://github.com/iovisor/bpftrace. Note that the bcc tools are evolving into the versions in the libbpf-tools directory, which produce C binaries with no dependencies (no LLVM) as they contain embedded BPF bytecode. Also note that I'm discussing observability only here.

Making everything work in containers securely (and, once the bpf and perf_event_open syscalls are allowed, avoiding memory reads of other container info) will be quite some work.

brendangregg avatar Aug 26 '21 05:08 brendangregg

Hi folks, please pardon the extended delay in responding to this issue. This is something we're actively working on (we've updated the status to "Coming Soon" rather than "Proposed"). In an incoming release, Fargate will make it possible to add specific Linux capabilities to the task containers, including CAP_BPF and CAP_PERFMON. Fargate will exclusively support BPF CO-RE applications.
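For anyone wanting to verify from inside a running container which of these capabilities the process actually holds, here is a minimal sketch that decodes the effective capability mask from /proc/self/status. The bit numbers come from linux/capability.h (CAP_PERFMON and CAP_BPF exist since Linux 5.8). The helper names `parse_cap_eff` and `held_caps` are made up for illustration, and nothing here reflects how Fargate will expose the setting.

```python
from pathlib import Path

# Capability bit numbers as defined in linux/capability.h.
CAPS = {
    "CAP_SYS_PTRACE": 19,
    "CAP_SYS_ADMIN": 21,
    "CAP_PERFMON": 38,
    "CAP_BPF": 39,
}


def parse_cap_eff(status_text):
    """Extract the effective capability mask (CapEff) from the text of
    /proc/<pid>/status and return it as an integer."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def held_caps(mask):
    """Return {capability name: True/False} for the capabilities in CAPS."""
    return {name: bool((mask >> bit) & 1) for name, bit in CAPS.items()}


if __name__ == "__main__":
    status = Path("/proc/self/status")
    if status.exists():
        caps = held_caps(parse_cap_eff(status.read_text()))
        for name, held in caps.items():
            print(f"{name}: {'yes' if held else 'no'}")
    else:
        print("/proc/self/status not available on this system")
```

Holding CAP_BPF and CAP_PERFMON is necessary but not sufficient: a seccomp profile can still block the bpf and perf_event_open syscalls.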

soo-o avatar Feb 17 '23 00:02 soo-o

In an incoming release, Fargate will make it possible to add specific Linux capabilities to the task containers, including CAP_BPF and CAP_PERFMON.

Thanks for the update @soo-o 🙏🏼 Do you already know the full list of capabilities that will be allowed?

I believe that since https://github.com/aws/containers-roadmap/issues/1000 stays open, and given the security considerations on Fargate, CAP_SYS_ADMIN is off the table?

inge4pres avatar Mar 01 '23 09:03 inge4pres

I'd be super stoked to see this functionality landed

NyanHelsing avatar Mar 06 '23 11:03 NyanHelsing

@soo-o just saw that this is coming - can't wait! If y'all need someone to test it at some point I'm down for it.

fntlnz avatar Mar 28 '23 08:03 fntlnz

Fargate will exclusively support BPF CO-RE applications.

@soo-o could you explain what this means? CO-RE can be done in user space (libbpf, cilium/ebpf) and in kernel space. Does this imply that CO-RE will have to be done in the kernel?

lmb avatar Mar 29 '23 15:03 lmb

Is there an ETA for this feature? We are highly interested in it since we plan to use Cilium to support network policies and want to run some of our workloads on Fargate.

jgoeres avatar Jul 06 '23 15:07 jgoeres

Is there an ETA for enabling eBPF on Fargate? It is currently blocking us from moving to Graviton for the task integrated with our security tool, as the only way to read kernel-level events on Graviton is eBPF (please correct me if this understanding is wrong).

rajeshkundwani avatar Aug 30 '23 23:08 rajeshkundwani

Is there any update to this?

reskin89 avatar May 03 '24 00:05 reskin89

Checking in on the progress of this

imreACTmd avatar Jun 11 '24 19:06 imreACTmd

I see it has been 4 years since this issue was opened. With Cilium being the de facto CNI these days, this is essential for customers who want to use Cilium in EKS clusters that also utilize Fargate. @AWS any status update on this?

marcofranssen avatar Jul 02 '24 10:07 marcofranssen

I would say just forget about this being implemented and switch to EC2, so you can use Cilium (have it replace kube-proxy and the aws-vpc-cni). EC2 has more advantages like EFS CSI driver dynamic provisioning, faster container startup times (making HPA actually useful), ability to use DaemonSets, etc.

sjoukedv avatar Jul 02 '24 10:07 sjoukedv

Maybe only tangentially related, but if Windows supported something like eBPF, maybe this CrowdStrike thing wouldn't have happened. Just a thought: maybe it's worth prioritizing wider eBPF support, including in Fargate.

NyanHelsing avatar Jul 20 '24 19:07 NyanHelsing

Lack of this feature actually blocks my company's plans to use the Elastic Agent integration (https://www.elastic.co/docs/current/en/integrations/cloud_defend), and probably the use of other security tools for inspecting events on Kubernetes containers.

Together with the lack of working AWS GuardDuty Runtime Monitoring on Fargate-backed EKS, for our use case (and probably a lot of others) it leaves Fargate EKS clusters without any security-oriented runtime monitoring.

t0mbombadil avatar Jul 22 '24 12:07 t0mbombadil

If we look at the containers-roadmap this was moved into, it seems they are very focused on EKS, not Fargate.

My guess is EKS is more popular, and more comparable to competitors' offerings, so they must advance it more. (Might even retire Fargate if you ask me.)

So.. maybe don't hold your breath...

dror-g avatar Oct 15 '24 09:10 dror-g