Need to build a module for Anthos on-prem kernel (5.4.0-1014-gkeop)

Open ssurovich opened this issue 1 year ago • 10 comments

The list of supported kernels doesn't include anything that works with the appliances Google provides for Anthos on VMware. They always append gkeop to the kernel version, so it will never be found in the supported list.

Can I still use the manual build steps to make a compatible collector for my cluster?

ssurovich avatar Jun 06 '23 14:06 ssurovich

Hi @ssurovich, provided you have access to kernel headers for the nodes in your cluster, the manual build steps can be used to build drivers.

Alternatively, the latest versions of collector contain a built-in, BTF-enabled driver which may work on your cluster out of the box. It can be enabled by setting the environment variable COLLECTION_METHOD=core-bpf (collector version 3.15.0 or newer).
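
A BTF-based driver needs the running kernel to ship BTF type information, which BTF-enabled kernels expose at /sys/kernel/btf/vmlinux. The following is a minimal sketch for checking that on a node before switching the collection method. It is not part of collector, and the helper name `has_kernel_btf` is made up for illustration. A present BTF file does not guarantee that the probe will load; a missing one suggests it will not.

```python
import os

# Standard sysfs location where BTF-enabled kernels expose their type info.
BTF_PATH = "/sys/kernel/btf/vmlinux"


def has_kernel_btf(path: str = BTF_PATH) -> bool:
    """Return True if a BTF file exists at the given path (hypothetical helper)."""
    return os.path.isfile(path)


if __name__ == "__main__":
    release = os.uname().release  # e.g. the gkeop-suffixed kernel version
    if has_kernel_btf():
        print(f"{release}: BTF found, core-bpf is worth trying")
    else:
        print(f"{release}: no BTF at {BTF_PATH}, core-bpf is unlikely to load")
```

Run it directly on the node, or in a pod that has the host's /sys mounted.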

Stringy avatar Jun 07 '23 15:06 Stringy

Hi @Stringy - Thanks for the quick reply - I'll try setting that variable to see if it works.

ssurovich avatar Jun 07 '23 21:06 ssurovich

Ok, I'm using a collector release of 4.0.2 and I have the arg set to core-bpf and still no luck. In the collector logs, it verifies that I did pass the collection method (User configured collection-method=core_bpf).

It looks like it's still trying to pull a module for the kernel itself: "attempting to download collector-ebpf-5.4.0-1054-gkeop.o"

It will be a challenge to create a module: all of my nodes are in an air-gapped environment and the nodes are appliances, which makes adding any additional software or modules difficult.

I've created Falco modules in the past; I just needed to pass access to /usr/src to the deployment, since the linux-headers are in that directory on all nodes. Can I create a module using the included headers?
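
Whether the included headers are usable depends on them matching the running kernel exactly. Here is a rough sketch of that check, assuming the appliances follow the usual Debian/Ubuntu layout (/usr/src/linux-headers-<release> and the /lib/modules/<release>/build link). That layout is an assumption about these nodes, and the function names are illustrative only.

```python
import os
from pathlib import Path
from typing import List


def header_candidates(release: str, root: str = "/") -> List[Path]:
    """Conventional header locations for a kernel release (assumed layout)."""
    base = Path(root)
    return [
        base / "usr/src" / f"linux-headers-{release}",
        base / "lib/modules" / release / "build",
    ]


def find_headers(release: str, root: str = "/") -> List[Path]:
    """Return the candidate directories that actually exist under root."""
    return [p for p in header_candidates(release, root) if p.is_dir()]


if __name__ == "__main__":
    release = os.uname().release
    found = find_headers(release)
    if found:
        for path in found:
            print(f"headers for {release}: {path}")
    else:
        print(f"no headers found for {release}")
```

The `root` parameter lets the same check run inside a container where the host filesystem is mounted under a different prefix.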

ssurovich avatar Jun 08 '23 13:06 ssurovich

> Ok, I'm using a collector release of 4.0.2 and I have the arg set to core-bpf and still no luck. In the collector logs, it verifies that I did pass the collection method (User configured collection-method=core_bpf).
>
> It looks like it's still trying to pull a module for the kernel itself: "attempting to download collector-ebpf-5.4.0-1054-gkeop.o"

What exactly doesn't work? Do you get any error message with core_bpf? The log messages can be a bit inconsistent, so it's better to verify the final result.

erthalion avatar Jun 08 '23 14:06 erthalion

Ah sorry, should have added more details.

It errors out when attempting to download the kernel object from https://sensor.stackrox.svc:443/kernel-objects/2.4.0/collector-ebpf-5.4.0-1054-gkeops.o.gz

Right after that line, there are a bunch of "Unexpected HTTP request failure (HTTP 500)" messages.

After 90 failures with the 500 errors:

No suitable kernel object downloaded for collector-ebpf-5.4.0-1054-gkeops.o.gz

In the diagnostics portion, it says the sensor is connected, and under kernel driver candidates it lists: CO.RE eBPF probe (available) and collector-ebpf-5.4.0-1054-gkeops.o.gz (unavailable)

I'm not sure why it knows I have core selected, and that seems to be passing, but it still tries to get the kernel module that doesn't exist?

ssurovich avatar Jun 08 '23 16:06 ssurovich

This sounds strange indeed. Could you post the whole log output, ideally with the logLevel:debug in the Collector configuration?

erthalion avatar Jun 09 '23 07:06 erthalion

Many thanks, @Stringy, for the hint COLLECTION_METHOD=core-bpf! It saved my day, because Oracle Linux 9 arm64 'unbreakable enterprise' kernels aren't included in KERNEL_VERSIONS either.

@ssurovich, collector release 4.0.2 isn't a thing. Try the 3.15.0 branch. It worked for me:

Collector logs with collection-method=core-bpf
Collector Version: 3.15.0
OS: Oracle Linux Server 9.2
Kernel Version: 5.15.0-102.110.5.1.el9uek.aarch64
Starting StackRox Collector...
[INFO    2023/07/02 12:06:40] Hostname: 'dex'
[INFO    2023/07/02 12:06:40] User configured collection-method=core_bpf
[INFO    2023/07/02 12:06:40] Afterglow is enabled
[INFO    2023/07/02 12:06:40] Sensor configured at address: sensor.stackrox.svc:443
[INFO    2023/07/02 12:06:40] Attempting to connect to Sensor
[INFO    2023/07/02 12:06:49] Successfully connected to Sensor.
[INFO    2023/07/02 12:06:49] Module version: 2.5.0
[INFO    2023/07/02 12:06:49] Config: collection_method:core_bpf, useChiselCache:1, scrape_interval:30, turn_off_scrape:0, hostname:dex, processesListeningOnPorts:1, logLevel:INFO, set_import_users:0
[INFO    2023/07/02 12:06:49] Attempting to find eBPF probe - Candidate versions: 
[INFO    2023/07/02 12:06:49] CO.RE eBPF probe
[INFO    2023/07/02 12:06:49] collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o
[INFO    2023/07/02 12:06:50] 
[INFO    2023/07/02 12:06:50] This product uses ebpf subcomponents licensed under the GNU
[INFO    2023/07/02 12:06:50] GENERAL PURPOSE LICENSE Version 2 outlined in the /kernel-modules/LICENSE file.
[INFO    2023/07/02 12:06:50] Source code for the ebpf subcomponents is available at
[INFO    2023/07/02 12:06:50] https://github.com/stackrox/falcosecurity-libs/
[INFO    2023/07/02 12:06:50] 
[INFO    2023/07/02 12:06:50] 
[INFO    2023/07/02 12:06:50] == Collector Startup Diagnostics: ==
[INFO    2023/07/02 12:06:50]  Connected to Sensor?       true
[INFO    2023/07/02 12:06:50]  Kernel driver candidates:
[INFO    2023/07/02 12:06:50]    CO.RE eBPF probe (available)
[INFO    2023/07/02 12:06:50]  Driver loaded into kernel: CO.RE eBPF probe
[INFO    2023/07/02 12:06:50] ====================================
[INFO    2023/07/02 12:06:50] 
[INFO    2023/07/02 12:06:50] Network scrape interval set to 30 seconds
[INFO    2023/07/02 12:06:50] Waiting for Sensor to become ready ...
[INFO    2023/07/02 12:06:50] Sensor connectivity is successful
[INFO    2023/07/02 12:06:50] Started network status notifier.
[INFO    2023/07/02 12:06:50] Established network connection info stream.
[INFO    2023/07/02 12:06:50] Trying to establish GRPC stream for signals ...
[INFO    2023/07/02 12:06:50] Successfully established GRPC stream for signals.
[INFO    2023/07/02 12:06:50] Found self-check process event.
[INFO    2023/07/02 12:06:51] Found self-check connection event.
[INFO    2023/07/02 12:07:00] self-check (pid=77) exitted with status: 0

versus

Collector logs with collection-method=ebpf
Collector Version: 3.15.0
OS: Oracle Linux Server 9.2
Kernel Version: 5.15.0-102.110.5.1.el9uek.aarch64
Starting StackRox Collector...
[INFO    2023/07/01 21:52:16] Hostname: 'dex'
[INFO    2023/07/01 21:52:16] User configured collection-method=ebpf
[INFO    2023/07/01 21:52:16] Afterglow is enabled
[INFO    2023/07/01 21:52:16] Sensor configured at address: sensor.stackrox.svc:443
[INFO    2023/07/01 21:52:16] Attempting to connect to Sensor
[INFO    2023/07/01 21:52:16] Successfully connected to Sensor.
[INFO    2023/07/01 21:52:16] Module version: 2.5.0
[INFO    2023/07/01 21:52:16] Config: collection_method:ebpf, useChiselCache:1, scrape_interval:30, turn_off_scrape:0, hostname:dex, processesListeningOnPorts:1, logLevel:INFO, set_import_users:0
[INFO    2023/07/01 21:52:16] Attempting to find eBPF probe - Candidate versions: 
[INFO    2023/07/01 21:52:16] collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o
[INFO    2023/07/01 21:52:16] Attempting to download collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o
[INFO    2023/07/01 21:52:16] Attempting to download kernel object from https://sensor.stackrox.svc:443/kernel-objects/2.5.0/collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o.gz
[INFO    2023/07/01 21:52:16] HTTP Request failed with error code 404
[WARNING 2023/07/01 21:55:12] Attempted to download collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o.gz 36 time(s)
[WARNING 2023/07/01 21:55:12] Failed to download from collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o.gz
[WARNING 2023/07/01 21:55:12] Unable to download kernel object collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o to /module/collector-ebpf.o.gz
[WARNING 2023/07/01 21:55:12] No suitable kernel object downloaded for collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o
[ERROR   2023/07/01 21:55:12] Failed to initialize collector kernel components.
[INFO    2023/07/01 21:55:12] 
[INFO    2023/07/01 21:55:12] == Collector Startup Diagnostics: ==
[INFO    2023/07/01 21:55:12]  Connected to Sensor?       true
[INFO    2023/07/01 21:55:12]  Kernel driver candidates:
[INFO    2023/07/01 21:55:12]    collector-ebpf-5.15.0-102.110.5.1.el9uek.aarch64.o (unavailable)
[INFO    2023/07/01 21:55:12] ====================================
[INFO    2023/07/01 21:55:12] 
[FATAL   2023/07/01 21:55:12] Failed to initialize collector kernel components.
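
When full logs cannot be shared, the startup diagnostics block is the part that matters. Here is a small sketch that pulls the driver candidates and the loaded driver out of a log. The line format is inferred only from the two excerpts above, and the function name `summarize_diagnostics` is made up for illustration.

```python
import re
from typing import Dict, Optional, Tuple

# Log prefix as seen in the excerpts, e.g. "[INFO    2023/07/02 12:06:50]".
PREFIX = re.compile(r"^\[[A-Z]+\s+\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\]")
CANDIDATE = re.compile(r"^(.+) \((available|unavailable)\)$")
LOADED = re.compile(r"^Driver loaded into kernel: (.+)$")


def summarize_diagnostics(log_text: str) -> Tuple[Dict[str, bool], Optional[str]]:
    """Return ({candidate: is_available}, loaded_driver or None)."""
    candidates: Dict[str, bool] = {}
    loaded: Optional[str] = None
    in_candidates = False
    for raw in log_text.splitlines():
        msg = PREFIX.sub("", raw).strip()
        if msg == "Kernel driver candidates:":
            in_candidates = True
            continue
        match = LOADED.match(msg)
        if match:
            loaded = match.group(1)
            in_candidates = False
            continue
        if in_candidates:
            match = CANDIDATE.match(msg)
            if match:
                candidates[match.group(1)] = match.group(2) == "available"
            else:
                in_candidates = False  # separator line ends the list
    return candidates, loaded
```

A loaded driver of `None` together with only unavailable candidates corresponds to the failing ebpf run.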

02fa avatar Jul 02 '23 15:07 02fa

Sounds like a plan - Willing to try whatever may allow scanning to complete :) Thanks for the heads up - I'll give 3.15.0 a try.

ssurovich avatar Jul 02 '23 17:07 ssurovich

> This sounds strange indeed. Could you post the whole log output, ideally with the logLevel:debug in the Collector configuration?

It's a challenge to show any logs: all of my environments are air-gapped, so there's no easy way to get any log information out to post. Wish I could...

ssurovich avatar Jul 02 '23 17:07 ssurovich

Any updates from your side @ssurovich ?

porridge avatar Mar 28 '24 10:03 porridge

Hi @ssurovich, it's been a while and perhaps this issue has lost relevance. OK if we close it, or do you need some assistance with troubleshooting the setup?

msugakov avatar Oct 08 '24 16:10 msugakov