integration tests seem incompatible with systemd based init systems

Open 06kellyjac opened this issue 2 years ago • 5 comments

Prerequisites

  • [X] I checked the documentation and found no answer.
  • [X] There isn't an issue describing the bug.

Bug description

The integration tests run init q which is incompatible with systemd as an init system.

It'd be great if the test were compatible with systemd init, or at least checked for a compatible init and skipped the test otherwise.

Steps to reproduce

Please provide detailed steps for reproducing the bug.

  1. run integration tests on a system with systemd init

(Or if you want to use nix in a container)

$ docker run --pid=host --cgroupns=host --privileged -it nixos/nix

echo "system-features = nixos-test benchmark big-parallel kvm" >> /etc/nix/nix.conf
git clone https://github.com/06kellyjac/nixpkgs --branch tracee-integration --depth 1
nix-build nixpkgs/pkgs/tools/security/tracee/test.nix
  2. you get the following error...
machine: output: === RUN   Test_Events
=== RUN   Test_Events/do_a_file_write
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace event=magic_write --output gotemplate=/tmp/do a file write-425935319
executing:  /run/current-system/sw/bin/cp /tmp/Test_MagicWrite-dir-2050388253/Test_MagicWrite-file-1669936023 /tmp/Test_MagicWrite-dir-2050388253Test_MagicWrite-file-1669936023-new
=== RUN   Test_Events/execute_a_command
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace comm=ls --output gotemplate=/tmp/execute a command-1216458051
=== RUN   Test_Events/trace_new_pids
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace pid=new --output gotemplate=/tmp/trace new pids-1598232219
=== RUN   Test_Events/trace_uid_0_with_comm_ls
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace uid=0 --trace comm=ls --output gotemplate=/tmp/trace uid 0 with comm ls-2701955723
=== RUN   Test_Events/trace_pid_1
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace pid=1 --output gotemplate=/tmp/trace pid 1-1855265802
    integration_test.go:157:
                Error Trace:    integration_test.go:157
                                                        integration_test.go:315
                Error:          Should NOT be empty, but was
                Test:           Test_Events/trace_pid_1
=== RUN   Test_Events/trace_only_execve_events_from_comm_ls
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace event=execve --output gotemplate=/tmp/trace only execve events from comm ls-381753538
=== RUN   Test_Events/trace_filesystem_events_from_comm_ls
    integration_test.go:46: running command: /run/current-system/sw/bin/tracee-ebpf --trace s=fs --trace comm=ls --output gotemplate=/tmp/trace filesystem events from comm ls-540033629
--- FAIL: Test_Events (9.44s)
    --- PASS: Test_Events/do_a_file_write (1.02s)
    --- PASS: Test_Events/execute_a_command (1.04s)
    --- PASS: Test_Events/trace_new_pids (1.07s)
    --- PASS: Test_Events/trace_uid_0_with_comm_ls (1.07s)
    --- FAIL: Test_Events/trace_pid_1 (3.11s)
    --- PASS: Test_Events/trace_only_execve_events_from_comm_ls (1.06s)
    --- PASS: Test_Events/trace_filesystem_events_from_comm_ls (1.07s)
FAIL

This is the result of running init q on a systemd-based distro:

λ init q
Excess arguments.

Context

Please provide any relevant information about your setup. This is important in case the issue is not reproducible except for under certain conditions.

  • Linux version: NixOS
  • Linux kernel version: 5.15.32
  • Tracee version (or commit id of your tree): v0.7.0
  • LLVM version: clang 13.0.1 ?
  • Golang version: go1.17.7

Additional Information

Part of packaging for nixpkgs: https://github.com/NixOS/nixpkgs/pull/163477

The hope is for the integration tests to help identify issues, especially after libbpf 1.0 if we move to use our copy of libbpf.a (context: https://github.com/NixOS/nixpkgs/pull/163477#discussion_r851618769)

06kellyjac avatar Apr 18 '22 20:04 06kellyjac

@06kellyjac I'm facing way more problems than just this one. Many of the integration tests currently fail when I attempt your reproducer. I'd ask you to either suggest a PR or just skip the tests on your distro side.

image

rafaeldtinoco avatar Apr 19 '22 19:04 rafaeldtinoco

Hmm, you might need to cd into the cloned nixpkgs directory.

I'll try it again soon.

This is a silly question, but just to check: were you running this on a Linux host with /sys/kernel/btf/vmlinux? Or possibly on a Mac via Docker Desktop?

06kellyjac avatar Apr 19 '22 20:04 06kellyjac

Ok, I ran the container method and it failed in both cases.

I just realized the container probably needs --pid=host --cgroupns=host --privileged (or at least --privileged) so I'm trying that now 🤦

Otherwise any systemd based init distro VM should work

Edit:

yep that did it

image

It'd probably be good to get some better debug logging in these tests, so you can tell whether tracee-ebpf is having trouble with BPF CO-RE or with capabilities (CAP_SYS_RESOURCE/CAP_BPF+CAP_PERFMON/CAP_SYS_ADMIN/CAP_IPC_LOCK), or whether the problem is with the commands being traced, like ls, init q, etc.
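As an illustration of the capability part of that logging, here is a sketch that reads the CapEff mask out of the text of /proc/self/status and lists which of the capabilities mentioned above are absent. The bit numbers come from linux/capability.h. The helper names are hypothetical, and the list of capabilities is just the one from this comment, not a statement of what tracee-ebpf actually requires.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Capability bit numbers as defined in linux/capability.h.
const (
	capIPCLock     = 14
	capSysAdmin    = 21
	capSysResource = 24
	capPerfmon     = 38
	capBPF         = 39
)

// parseCapEff extracts the effective capability mask from the contents
// of a /proc/<pid>/status file.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "CapEff:" {
			return strconv.ParseUint(fields[1], 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

// hasCap reports whether the given capability bit is set in mask.
func hasCap(mask uint64, bit uint) bool {
	return mask&(1<<bit) != 0
}

// missingCaps returns the names of the capabilities listed in the issue
// that are not present in mask, in a fixed order.
func missingCaps(mask uint64) []string {
	wanted := []struct {
		name string
		bit  uint
	}{
		{"CAP_SYS_RESOURCE", capSysResource},
		{"CAP_BPF", capBPF},
		{"CAP_PERFMON", capPerfmon},
		{"CAP_SYS_ADMIN", capSysAdmin},
		{"CAP_IPC_LOCK", capIPCLock},
	}
	var missing []string
	for _, w := range wanted {
		if !hasCap(mask, w.bit) {
			missing = append(missing, w.name)
		}
	}
	return missing
}
```

A test setup could read /proc/self/status, pass its contents to parseCapEff, and log the result of missingCaps before starting tracee-ebpf, so a capability problem is visible separately from a problem with the traced command.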

06kellyjac avatar Apr 19 '22 21:04 06kellyjac

> I just realized the container probably needs --pid=host --cgroupns=host --privileged (or at least --privileged) so I'm trying that now 🤦

Oh right =D haha, I didn't think of it (so used to our building/execution envs). Anyway, what do you think about suggesting a PR to address these issues?

rafaeldtinoco avatar Apr 20 '22 12:04 rafaeldtinoco

I can probably write a PR that checks whether init q would even work and skips the test if not?

If I knew exactly what init q did, I could probably find a systemd equivalent, which would be the better fix. Do you know which init the test was written for?

06kellyjac avatar Apr 20 '22 19:04 06kellyjac

@06kellyjac is this something that is still affecting you?

rafaeldtinoco avatar Apr 03 '23 21:04 rafaeldtinoco

It's been a while. I'll update to the latest version and check back in on this.

06kellyjac avatar Apr 03 '23 21:04 06kellyjac

Yep, the backlog is big and we've mostly been focusing on new features. Thanks a lot for checking this.

rafaeldtinoco avatar Apr 04 '23 02:04 rafaeldtinoco

Yeah, there are no pid 1 tests anymore, but it's marked as a TODO.

I'd recommend that any test relying on pid 1 or on a specific pid number use a container as the target. In a container you get full control over which processes run under which pid, and I don't really want integration tests prodding around with that stuff on my host system lol (although the integration tests in nixpkgs are run in a VM 🥳).
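To make the container suggestion a bit more concrete, here is a sketch that only builds the docker command lines involved: one to start a container whose command becomes pid 1 inside it, and one to look up that process's pid as seen from the host. The function names, container name and image are placeholders. Whether tracee's pid filter should be given the host-side or the in-container pid isn't something this sketch answers.

```go
package main

import "strings"

// startTargetArgs builds arguments for `docker run` so that cmd runs
// detached as pid 1 inside a container with the given name.
func startTargetArgs(name, image string, cmd ...string) []string {
	args := []string{"run", "--rm", "-d", "--name", name, image}
	return append(args, cmd...)
}

// hostPidArgs builds arguments for `docker inspect` that print the
// host-side pid of the container's pid 1 process.
func hostPidArgs(name string) []string {
	return []string{"inspect", "--format", "{{.State.Pid}}", name}
}

// commandLine renders the arguments as a docker command line for logging.
func commandLine(args []string) string {
	return "docker " + strings.Join(args, " ")
}
```

The argument slices can be passed straight to exec.Command("docker", args...) from a test, and logged with commandLine so a failing run shows exactly what was started.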

The only outstanding work for this issue might be to add a note next to those TODOs about what I've written above, but other than that I'm happy for you to close this.

06kellyjac avatar Apr 04 '23 08:04 06kellyjac

I have added a comment in another issue that will take care of the integration tests (https://github.com/aquasecurity/tracee/issues/2681), so we can close this. Thanks for re-checking this after so long, @06kellyjac.

rafaeldtinoco avatar Apr 06 '23 16:04 rafaeldtinoco