
[Feedback] New `driver-loader`

Open incertum opened this issue 1 year ago • 34 comments

Motivation

Share early feedback and improvement suggestions for the new driver-loader ahead of the Falco 0.37.0 release.

incertum avatar Dec 19 '23 00:12 incertum

I suggested some updates to the Kubernetes deployment templates here: https://github.com/falcosecurity/deploy-kubernetes/issues/95

re https://github.com/falcosecurity/falco/blob/master/docker/driver-loader/docker-entrypoint.sh

I suggest adding some precondition checks, similar to this VM testing dependency check script.

  • For example, if --download is enabled and points to the internet, first check whether the URL can be reached at all. This is relevant for Kubernetes deployments where direct internet access can be restricted. The same check applies if you point to an internal URL: print a clear error message saying that the requirements for this option are not met. That way the loader's own diagnostics are clearly separated from native error messages.
  • If --compile is set, check that all required mounts are there (e.g. /usr/src/kernels/ and /lib/modules), and similarly print a clear error message that the driver cannot be compiled because xyz is missing. For kmod we need the DKMS package, right? Check this too. For example, when you try the driver-loader with minikube you need minikube start --mount --mount-string="/usr/src:/usr/src" --driver=docker, as some host mounts are otherwise not properly mounted (this limitation does not apply to real Kubernetes clusters). Give some tips, e.g. on Kubernetes the HOST_ROOT=/host env variable typically needs to be set.
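A rough sketch of what such precondition checks could look like in the entrypoint. The function names and messages are hypothetical (nothing like this exists in falcoctl or docker-entrypoint.sh as far as this thread shows), and the download check only tests that the URL is reachable, not that a matching driver exists there.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight checks for the driver loader (illustration only).

# Fail early when --download is requested but the repo URL cannot be reached.
# curl exits non-zero on connection/DNS/timeout errors; HTTP status is ignored.
check_download() {
    local repo_url="$1"
    if ! curl --silent --head --max-time 5 --output /dev/null "$repo_url"; then
        echo "ERROR: --download is enabled but $repo_url is not reachable from this container" >&2
        return 1
    fi
}

# Fail early when --compile is requested but the kernel build tree is missing.
# -e follows symlinks, so a dangling 'build' link into an unmounted /usr/src fails too.
check_compile() {
    local host_root="$1" kernel_release="$2"
    local build_dir="${host_root}/lib/modules/${kernel_release}/build"
    if [ ! -e "$build_dir" ]; then
        echo "ERROR: --compile is enabled but $build_dir is missing; check the /lib/modules and /usr/src host mounts" >&2
        return 1
    fi
}
```

It could be called as check_compile "${HOST_ROOT:-}" "$(uname -r)" before running falcoctl driver install, so that a missing mount is reported before make fails.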

@LucaGuerra @FedeDP @leogr @maxgio92

incertum avatar Dec 19 '23 00:12 incertum

cc @alacuku

leogr avatar Dec 19 '23 08:12 leogr

When I started looking at https://fluxcd.io/flux/get-started/ for the CNCF TAG testing I really loved the flux check --pre -- could be interesting not just for the driver-loader, but also the Falco binary itself, in a way complementing --dry-run?

incertum avatar Dec 21 '23 21:12 incertum

I really loved the flux check --pre

That's very nice actually, and might be a good idea for the driver-loader, for sure! Also, regarding https://github.com/falcosecurity/falco/issues/2978#issuecomment-1861902579, I agree it would be great to improve our checks and give better error messages!

FedeDP avatar Dec 22 '23 09:12 FedeDP

I see these suggestions as a good set of improvements @incertum. They totally make sense IMHO

maxgio92 avatar Dec 22 '23 19:12 maxgio92

See https://github.com/falcosecurity/cncf-green-review-testing/pull/6/files#r1441219090: it seems we have a small regression.

incertum avatar Jan 04 '24 02:01 incertum

On a different note, for very recent kernels the driver-loader container doesn't seem to cut it anymore. For example, a 6.6.7-200.fc39.x86_64 kernel that was compiled with gcc 13 complains about /lib/x86_64-linux-gnu/libc.so.6: version GLIBC_2.38 not found (required by scripts/mod/modpost). The kmod build errors out as well.
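A minimal sketch of how this mismatch could be detected up front, assuming the required version (2.38 here) is already known from the error message. Both function names are made up for illustration; getconf GNU_LIBC_VERSION and sort -V are standard on glibc/GNU systems.

```shell
#!/usr/bin/env bash
# Return 0 when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Compare the container's glibc with the version the host's kernel build
# tools (e.g. scripts/mod/modpost) were linked against.
check_glibc() {
    local required="$1" have
    have="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
    if [ -z "$have" ] || ! version_ge "$have" "$required"; then
        echo "ERROR: container glibc ${have:-unknown} is older than required $required" >&2
        return 1
    fi
}
```

Running check_glibc 2.38 inside the container would then fail with a readable message instead of the modpost loader error.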

Some of you may already be aware, but the gcc 13 cutover is causing other issues for userspace as well wrt grpc and such. I see there is already a PR up in libs, but we may have more incompatibility issues for older glibc versions with the newer grpc. Anyway not wanting to digress, but sooner or later we will need to face these issues.

incertum avatar Jan 04 '24 02:01 incertum

I've got a tracking issue on driverkit about the glibc issue: https://github.com/falcosecurity/driverkit/issues/303. In driverkit, we might need to add a new builder image. For the Falco driver-loader image, instead, there is not much we can do. Also cc @LucaGuerra for visibility, since he drove the new Falco driver-loader image initiative.

FedeDP avatar Jan 04 '24 09:01 FedeDP

Re the discussion here https://github.com/falcosecurity/cncf-green-review-testing/pull/6#discussion_r1447012757

When using minikube I get these error messages:

root@falco-driver-ebpf-5f79f:/# /usr/bin/falcoctl driver config --type ebpf && /usr/bin/falcoctl driver install --compile=true --download=false
2024-01-10 05:53:06 INFO  Running falcoctl driver config
                      ├ name: falco
                      ├ version: 7.0.0+driver
                      ├ type: ebpf
                      ├ host-root: /host
                      └ repos: https://download.falco.org/driver
2024-01-10 05:53:06 INFO  Running falcoctl driver install
                      ├ driver version: 7.0.0+driver
                      ├ driver type: ebpf
                      ├ driver name: falco
                      ├ compile: true
                      ├ download: false
                      ├ arch: x86_64
                      ├ kernel release: 6.2.15-100.fc36.x86_64
                      └ kernel version: #1 SMP PREEMPT_DYNAMIC Thu May 11 16:51:53 UTC 2023
2024-01-10 05:53:06 INFO  Found distro target: ubuntu-generic
2024-01-10 05:53:06 INFO  Removing eBPF probe symlink path: /root/.falco/falco-bpf.o                                                                                                                   
2024-01-10 05:53:06 INFO  Mounting debugfs for bpf driver.                                                                                                                                             
make: Entering directory '/usr/src/falco-7.0.0+driver/bpf'
expr: syntax error: unexpected argument '1'
make -C /lib/modules/6.2.15-100.fc36.x86_64/build M=$PWD
make[1]: Entering directory '/usr/src/falco-7.0.0+driver/bpf'
make[1]: *** /lib/modules/6.2.15-100.fc36.x86_64/build: No such file or directory.  Stop.
make[1]: Leaving directory '/usr/src/falco-7.0.0+driver/bpf'
make: *** [Makefile:39: all] Error 2
make: Leaving directory '/usr/src/falco-7.0.0+driver/bpf'
2024-01-10 05:53:06 ERROR failed: exit status 2 
root@falco-driver-ebpf-5f79f:/# 


root@falco-driver-ebpf-45swc:/# /usr/bin/falcoctl driver config --type ebpf && /usr/bin/falcoctl driver install --compile=true --download=false
2024-01-10 05:53:19 INFO  Running falcoctl driver config
                      ├ name: falco
                      ├ version: 7.0.0+driver
                      ├ type: ebpf
                      ├ host-root: /host
                      └ repos: https://download.falco.org/driver
2024-01-10 05:53:19 INFO  Running falcoctl driver install
                      ├ driver version: 7.0.0+driver
                      ├ driver type: ebpf
                      ├ driver name: falco
                      ├ compile: true
                      ├ download: false
                      ├ arch: x86_64
                      ├ kernel release: 5.15.0-83-generic
                      └ kernel version: #92-Ubuntu SMP Mon Aug 14 09:30:42 UTC 2023
2024-01-10 05:53:19 INFO  Found distro target: ubuntu-generic
2024-01-10 05:53:19 INFO  Removing eBPF probe symlink path: /root/.falco/falco-bpf.o                                                                                                                   
2024-01-10 05:53:19 INFO  Mounting debugfs for bpf driver.                                                                                                                                             
make: Entering directory '/usr/src/falco-7.0.0+driver/bpf'
expr: syntax error: unexpected argument '1'
make -C /lib/modules/5.15.0-83-generic/build M=$PWD
make[1]: Entering directory '/usr/src/falco-7.0.0+driver/bpf'
make[1]: *** /lib/modules/5.15.0-83-generic/build: No such file or directory.  Stop.
make[1]: Leaving directory '/usr/src/falco-7.0.0+driver/bpf'
make: *** [Makefile:39: all] Error 2
make: Leaving directory '/usr/src/falco-7.0.0+driver/bpf'
2024-01-10 05:53:19 ERROR failed: exit status 2 
root@falco-driver-ebpf-45swc:/# 

Maybe the start command needs to be even further adjusted?

minikube start --mount --mount-string="/usr/src:/usr/src" --mount --mount-string="/dev:/dev" --driver=docker

Perhaps it would be good just to confirm that we don't have these issues on regular Kubernetes when trying to build the driver from source?

incertum avatar Jan 10 '24 08:01 incertum

As far as I know, minikube does not ship kernel headers. That's the reason why we prebuild the kernel module and eBPF probe for it. Here are the official docs on how to use Falco on minikube: https://falco.org/docs/install-operate/third-party/learning/#minikube

It seems that the kernel headers are missing so it's not gonna work.

Another thing I see from the logs is that minikube was started using the docker driver and not a fully-fledged VM.

So, I would say that it's not a limitation of the falco-driver-loader but a misconfiguration in the testing environment.

alacuku avatar Jan 10 '24 08:01 alacuku

Maybe the start command needs to be even further adjusted?

minikube start --mount --mount-string="/usr/src:/usr/src" --mount --mount-string="/dev:/dev" --driver=docker

I don't believe this minikube approach can generally work. Let me explain. With minikube, we can have two possible approaches:

  1. fully-fledged VM (default minikube behavior): as Aldo said, minikube does not ship kernel headers, so there's no way to build the drivers on the fly. This is why we provide prebuilt drivers for it, but of course, it's a best-effort approach (and may not be immediately available when a new minikube version comes out).

  2. --driver=docker: in this case, minikube works similarly to docker, so building a driver in this environment is equivalent to building a driver using docker. This comes with several requirements (which are implicitly documented by the Running with docker documentation):

    • the driver loader needs access to /usr, but also /lib/modules, /etc, and (in some cases) /boot and /proc too
    • HOST_ROOT=/host must be used (this ensures host mounts won't overwrite those in the container image, avoiding any conflicts)
    • still, /usr/src in the container must be symlinked to the host one (this is what the docker entrypoint of the driver loader would do automatically)

One must satisfy all those requirements to make it work. Otherwise, it can work only in edge cases (for example, if the container image OS is precisely the one in the host, there are chances it can work even without mounting all those paths). Thus, I strongly recommend not using this latter approach (ie --driver=docker) because it is too fragile and not officially supported for the minikube use case.
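To make the list of requirements concrete, here is a sketch that prints a docker run command with those host paths mounted under /host. The helper name is hypothetical, and the --privileged flag and read-only mounts are assumptions modeled on the Running with docker documentation rather than something stated in this thread.

```shell
#!/usr/bin/env bash
# Print a `docker run` invocation for the driver loader with the host paths
# listed above mounted read-only under /host and HOST_ROOT pointing at it.
driver_loader_cmd() {
    local image="$1" host_path
    printf 'docker run --rm -it --privileged -e HOST_ROOT=/host'
    for host_path in /proc /boot /lib/modules /usr /etc; do
        printf ' -v %s:/host%s:ro' "$host_path" "$host_path"
    done
    printf ' %s\n' "$image"
}
```

For example, driver_loader_cmd falcosecurity/falco-driver-loader:0.37.0 prints the full command so it can be reviewed before being run.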

Perhaps it would be good just to confirm that we don't have these issues on regular Kubernetes when trying to build the driver from source?

AFAIK we don't have this issue on a regular Kubernetes installation. In any case, it wouldn't be comparable since regular K8s installation is an entirely different world (compared to Minikube) from a driver's point of view.

I wanted to provide this long explanation, in the hope of helping clarify these subtle technical details.

leogr avatar Jan 10 '24 14:01 leogr

Understood @alacuku and @leogr that we don't support minikube in that way.

Just for the record, it's still possible to build both eBPF and kmod when running minikube with the docker driver, assuming the host kernel sources are mounted. Linking to it here just in case someone is interested in seeing one possible way. A better option would be to build the driver by passing the directory containing the kernel sources or headers directly to the make command (bypassing the default search in /lib/modules/...). See for example the test VM setup and approach (https://github.com/falcosecurity/libs/blob/master/driver/bpf/Makefile#L16), where KERNELDIR is passed to the make command (https://github.com/falcosecurity/libs/blob/99aa7d2161a65f03b4aab16382af92fe7e067e81/test/vm/scripts/compile_drivers.sh#L75-L76). But again, this doesn't mean driver-loader plans to support such approaches.
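A minimal sketch of the KERNELDIR approach, assuming the driver sources are already unpacked and the kernel tree is mounted somewhere in the container. The wrapper function is hypothetical and the paths in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# Build the eBPF probe against an explicit kernel tree instead of relying on
# the default /lib/modules/$(uname -r)/build lookup.
build_probe() {
    local driver_src="$1" kernel_dir="$2"
    if [ ! -d "$kernel_dir" ]; then
        echo "ERROR: kernel sources/headers not found at $kernel_dir" >&2
        return 1
    fi
    # KERNELDIR on the command line overrides the Makefile's default.
    make -C "$driver_src" KERNELDIR="$kernel_dir"
}
```

Usage would be something like build_probe /usr/src/falco-7.0.0+driver/bpf /path/to/mounted/kernel/tree.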

Re the CNCF testbed configs, I'll check what type of minikube config we want to support at the end as that setup is regarded separate from the default example Kubernetes template anyways.

incertum avatar Jan 11 '24 00:01 incertum

Just for the record, it's still possible to build both eBPF and kmod when running minikube with the docker driver, assuming the host kernel sources are mounted.

It may be (I haven't tested), but I'm pretty sure that it comes with some caveats.

I guess that just mounting /lib/modules/.... is not enough. By chance, it may work if the OS in the host is compatible with that in the container image (it might be just a coincidence).

Otherwise, even passing the KERNELDIR wouldn't be enough. In any case, if all required mount points (/lib/modules, /usr, /etc, /boot) are configured properly in conjunction with HOST_ROOT=/host, passing KERNELDIR is not needed.

That being said, I agree that some workaround (even if not officially supported) is OK for the purpose of the CNCF testbed configs since, in the end, this is not a driver loader issue.

leogr avatar Jan 11 '24 09:01 leogr

More feedback -> Falco 0.38 dev cycle:

@Andreagit97 @alacuku if we want to consider modern_ebpf as default, also for the helm chart, I believe we need to add modern_ebpf as driver to the driver-loader and noop if all conditions are met, else fall back to other drivers. Wrt the helm chart: Even if we set modern_ebpf as driver in the Falco container the pod can still crash and never come up in cases where a pre-built kmod was not found etc. Effectively we may not make onboarding easier.

incertum avatar Jan 22 '24 18:01 incertum

@Andreagit97 @alacuku if we want to consider modern_ebpf as default, also for the helm chart, I believe we need to add modern_ebpf as driver to the driver-loader and noop if all conditions are met, else fall back to other drivers. Wrt the helm chart: Even if we set modern_ebpf as driver in the Falco container the pod can still crash and never come up in cases where a pre-built kmod was not found etc. Effectively we may not make onboarding easier.

I would like to consider modern_ebpf as the default, too. If we implement a simple fallback mechanism in falcoctl (i.e. the driver loader), we can reuse it across all installation methods (helm chart included). Ideally, it would be nice if we allowed users to choose the fallback order (it shouldn't be so difficult to implement).
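A sketch of what a user-ordered fallback could look like, written in shell only for illustration (the real logic would live in falcoctl). All function names are hypothetical, and the probes are rough heuristics: BTF availability for modern_ebpf, a kernel build tree for kmod and ebpf.

```shell
#!/usr/bin/env bash
# Hypothetical probes: each returns 0 when its driver looks usable.
can_use_modern_ebpf() { [ -r /sys/kernel/btf/vmlinux ]; }
can_use_kmod()        { [ -d "${HOST_ROOT:-}/lib/modules/$(uname -r)/build" ]; }
can_use_ebpf()        { can_use_kmod; }

# Print the first usable driver from a comma-separated, user-ordered list.
# Unknown driver names have no probe function and are simply skipped.
pick_driver() {
    local order="$1" driver
    local IFS=','
    for driver in $order; do
        if "can_use_${driver}" 2>/dev/null; then
            echo "$driver"
            return 0
        fi
    done
    echo "ERROR: no usable driver among: $order" >&2
    return 1
}
```

With pick_driver "modern_ebpf,kmod,ebpf" the loader would be a no-op when modern_ebpf is usable and would otherwise fall back in the order the user gave.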

leogr avatar Jan 23 '24 08:01 leogr

If we implement a simple fallback mechanism in falcoctl

falcoctl already supports the modern_ebpf driver: https://github.com/falcosecurity/falcoctl/blob/main/pkg/driver/type/modernbpf.go. Moreover, falcoctl already has some automatic driver selection logic (https://github.com/falcosecurity/falcoctl/blob/main/cmd/driver/driver_linux.go#L133 and https://github.com/falcosecurity/falcoctl/blob/main/pkg/driver/distro/generic.go#L63). Right now it is super simple, but it can be extended easily.

If we aim to be able to run falco-driver-loader images (legacy and new one) with the modern_ebpf driver, what we need is a simple change in the docker-entrypoint.sh to also support modern_ebpf. Same goes for the Falco image: https://github.com/falcosecurity/falco/blob/master/docker/falco/docker-entrypoint.sh#L65

FedeDP avatar Jan 23 '24 08:01 FedeDP

Oh nice agreed @leogr and @FedeDP! Also nice that falcoctl already has everything in place and we just need minor adjustments to the container entrypoint 🚀

Ideally, it would be nice if we allowed users to choose the fallback order (it shouldn't be so difficult to implement).

Big +1 @leogr

incertum avatar Jan 23 '24 17:01 incertum

Ideally, it would be nice if we allowed users to choose the fallback order (it shouldn't be so difficult to implement).

Big +1 @leogr

Let's focus on this for 0.38 :+1:

leogr avatar Jan 23 '24 17:01 leogr

I will move this to milestone 0.38.0 since it will now be addressed in the next release. /milestone 0.38.0

FedeDP avatar Jan 26 '24 09:01 FedeDP

Hi, one of the Flatcar maintainers here.

Driver loader 0.37.0 is broken for Flatcar; we have pinned 0.36.2 for now in our mantle tests (https://github.com/flatcar/mantle/pull/496). The breakage came up when I was trying to update glibc in Flatcar to 2.38. After a short investigation, I'm thinking that building kmod has no chance of succeeding, because it's using bash and gcc, which are missing in the Alpine Linux-based driver loader image. For Flatcar, patchelf is also missing. I'm going to file a PR that updates the bash script that does the relocations.

I'm probably missing something. Maybe we should rather use the driver-loader-legacy image, which is still based on Debian. Maybe we should update our mantle test to use a different type of driver. Any advice? Thanks!

krnowak avatar Feb 07 '24 11:02 krnowak

@krnowak thanks for the feedback!

After a short investigation, I'm thinking that building kmod has no chance of succeeding, because it's using bash and gcc, which are missing in the Alpine Linux-based driver loader image

Is this an image built by you? We don't provide any alpine driver-loader image.

Maybe we should rather use driver-loader-legacy image, which is still based on Debian.

The new driver-loader image is based on debian:bookworm: https://github.com/falcosecurity/falco/blob/master/docker/falco/Dockerfile#L1; see:

docker run -ti --rm --entrypoint bash falcosecurity/falco-driver-loader:0.37.0
root@b03d16763199:/# cat /etc/-os
cat: /etc/-os: No such file or directory
root@b03d16763199:/# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

For Flatcar, patchelf is also missing

Again, this is surprising given that the dep is present in our image: https://github.com/falcosecurity/falco/blob/master/docker/falco/Dockerfile#L40

FedeDP avatar Feb 07 '24 12:02 FedeDP

@krnowak thanks for the feedback!

After a short investigation, I'm thinking that building kmod has no chance of succeeding, because it's using bash and gcc, which are missing in the Alpine Linux-based driver loader image

Is this an image built by you? We don't provide any alpine driver-loader image.

Ah. Err… Yeah, I was looking at https://github.com/falcosecurity/falco/blob/master/docker/driver-loader/Dockerfile and it seemed to me that it makes a new driver-loader image based on the falcoctl image with a different entrypoint and some env vars predefined. But it looks like it's actually supposed to use the falco image. Thanks for clarifying, dunno where I got this idea from.

So the testing I did was basically running make docker in the falcoctl repo, transferring the image tarball to my Flatcar VM, running it with the entrypoint changed to /bin/sh, and running falcoctl driver --version 7.0.0 --host-root=/host install --compile=true --download=false inside it.

Looks like I got things wrong. Sorry about that.

But the failure I was trying to investigate was actually happening with falcosecurity/falco-driver-loader:master, which explains why it worked before but started breaking after the glibc version bump.

Which leads me to one question: how can I make changes in falcoctl and create a driver-loader image locally to test them?

krnowak avatar Feb 07 '24 12:02 krnowak

But the failure I was trying to investigate was actually happening with falcosecurity/falco-driver-loader:master which explains why it worked before but started breaking after glibc version bump.

Now I got it :) You can try the falco-driver-loader-legacy image! We bumped the falco-driver-loader image to a newer Debian version because the legacy driver loader is no longer able to build for the most recent kernels (i.e. 5.x and above). But the legacy image was kept for this exact reason.

Which leads me to one question: how can I make changes in falcoctl and create a driver-loader image locally to test them?

What I did during the dev phase was to compile a falcoctl binary with local changes and docker cp it into a running falco-driver-loader container under /usr/bin. Something along the lines of:

make falcoctl
docker cp falcoctl $(docker ps -lq):/usr/bin
docker commit --change='ENTRYPOINT ["/docker-entrypoint.sh"]' $(docker ps -lq) fededp/falco-driver-loader:0.37.0-mypatch
docker push fededp/falco-driver-loader:0.37.0-mypatch

FedeDP avatar Feb 07 '24 12:02 FedeDP

But the failure I was trying to investigate was actually happening with falcosecurity/falco-driver-loader:master which explains why it worked before but started breaking after glibc version bump.

Now I got it :) You can try the falco-driver-loader-legacy image! We bumped the falco-driver-loader image to a newer Debian version because the legacy driver loader is no longer able to build for the most recent kernels (i.e. 5.x and above). But the legacy image was kept for this exact reason.

Kernel versions in supported Flatcar releases are 5.15.x, 6.1.x and 6.6.x, so the legacy loader is probably a no-go anyway.

Which leads me to one question: how can I make changes in falcoctl and create a driver-loader image locally to test them?

What I did during the dev phase was to compile a falcoctl binary with local changes and docker cp it into a running falco-driver-loader container under /usr/bin. Something along the lines of:

make falcoctl
docker cp falcoctl $(docker ps -lq):/usr/bin
docker commit --change='ENTRYPOINT ["/docker-entrypoint.sh"]' $(docker ps -lq) fededp/falco-driver-loader:0.37.0-mypatch
docker push fededp/falco-driver-loader:0.37.0-mypatch

Aha, will try it out. Thanks!

krnowak avatar Feb 07 '24 12:02 krnowak

Kernel versions in supported Flatcar releases are 5.15.x, 6.1.x and 6.6.x, so the legacy loader is probably a no-go anyway.

@krnowak just out of curiosity, have you ever tried the modern_ebpf driver, since it seems that Flatcar ships recent kernel versions (>=5.8)?

Andreagit97 avatar Feb 07 '24 15:02 Andreagit97

@Andreagit97: Not really. Maybe it's something we could try out in our tests. From the code I assume it doesn't really need any special preparations other than just having a relatively recent kernel with probably some specific config items enabled.

krnowak avatar Feb 07 '24 17:02 krnowak

From the code I assume it doesn't really need any special preparations other than just having a relatively recent kernel with probably some specific config items enabled.

Yep, the modern eBPF probe is shipped directly inside the Falco binary, so you don't need the driver-loader/falcoctl; you can just run Falco without downloading anything. The system requirements are listed here: https://falco.org/docs/event-sources/kernel/#requirements, but a kernel like 5.15.x should satisfy all of them without further configuration :crossed_fingers:
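A small sketch for checking the kernel version part of those requirements, using the >=5.8 threshold mentioned earlier in this thread. The function name is made up for illustration, and it only looks at major.minor; the other requirements on the linked page are not covered.

```shell
#!/usr/bin/env bash
# Return 0 when a kernel release string (as printed by `uname -r`) is at
# least the given major.minor version.
kernel_at_least() {
    local release="$1" want_major="$2" want_minor="$3"
    local major minor
    major="${release%%.*}"        # text before the first dot
    minor="${release#*.}"         # text after the first dot
    minor="${minor%%[!0-9]*}"     # keep only the leading digits
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example, kernel_at_least "$(uname -r)" 5 8 succeeds on the 5.15.x, 6.1.x and 6.6.x kernels mentioned above.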

Andreagit97 avatar Feb 07 '24 17:02 Andreagit97

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar May 07 '24 21:05 poiana

/remove-lifecycle stale

leogr avatar May 08 '24 09:05 leogr

Moving to TBD! We currently lack the time to work on these specific driver-loader improvements (even if we have more in the pipeline for it!). Does anybody want to step up?

Also, there was little-to-no noise about the new falcoctl driver loader in, e.g., the Falco Slack channel, therefore I assume not many people got bitten by new issues. This does not mean that we cannot further improve the situation, but we can make it a relatively low priority task.

/milestone TBD

FedeDP avatar May 29 '24 12:05 FedeDP