[FEAT] redesign network events and capture features

Open rafaeldtinoco opened this issue 3 years ago • 0 comments

Prerequisites

  • [x] This issue is an EPIC issue (add label: EPIC).

Select one OR another:

  • [x] I'll create a PR to implement this feature (assign to yourself).

Feature description

Currently tracee uses different hook points and mechanisms for its network features.

Network events:

  • security_socket_create (perf submit: family, type, protocol, kern)
  • security_socket_listen (perf submit: sockfd, sockaddr, backlog)
  • security_socket_connect (perf submit: sockfd, sockaddr)
  • security_socket_accept (perf submit: sockfd, sockaddr); also saves args for the accept syscall raw tracepoint
  • security_socket_bind (perf submit: sockfd, sockaddr)
  • security_socket_setsockopt (perf submit: sockfd, sockaddr)

Hooks meant to update/delete entries from network map:

  • kprobe security_socket_bind (UPDATE network map)

  • kprobe udp_sendmsg (UPDATE network map)

  • kprobe tcp_connect (UPDATE network map)

  • kprobe pingv4_sendmsg (UPDATE network map)

  • kprobe pingv6_sendmsg (UPDATE network map)

  • raw tp inet_sock_set_state (UPDATE or DELETE network map)

  • kprobe udp_disconnect (DELETE network_map)

  • kprobe udp_destroy_sock (DELETE network_map)

  • kprobe udpv6_destroy_sock (DELETE network map)

  • kprobe icmp_send (DELETE network_map)

  • kprobe icmp6_send (DELETE network_map)

  • kprobe icmp_recv (DELETE network_map)

  • kprobe icmpv6_recv (DELETE network_map)
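
The update/delete split above can be modeled in user space as a small dispatch table. This is only an illustrative sketch in Go, not tracee's eBPF code: the `Action` type, the `hookActions` table, the `Apply` helper and the string flow keys are all hypothetical names. `inet_sock_set_state` is left out because, per the list, it either updates or deletes depending on the new socket state.

```go
package main

import "fmt"

// Action is what a hook does to the network map (hypothetical model).
type Action int

const (
	Update Action = iota
	Delete
)

// hookActions mirrors the hook list in this issue. inet_sock_set_state is
// omitted on purpose: it can UPDATE or DELETE depending on the new state.
var hookActions = map[string]Action{
	"security_socket_bind": Update,
	"udp_sendmsg":          Update,
	"tcp_connect":          Update,
	"pingv4_sendmsg":       Update,
	"pingv6_sendmsg":       Update,
	"udp_disconnect":       Delete,
	"udp_destroy_sock":     Delete,
	"udpv6_destroy_sock":   Delete,
	"icmp_send":            Delete,
	"icmp6_send":           Delete,
	"icmp_recv":            Delete,
	"icmpv6_recv":          Delete,
}

// Apply runs the action associated with hook against a toy network map,
// where a flow is just an opaque string key (an assumption of this sketch).
func Apply(networkMap map[string]bool, hook, flow string) error {
	action, ok := hookActions[hook]
	if !ok {
		return fmt.Errorf("unknown hook %q", hook)
	}
	switch action {
	case Update:
		networkMap[flow] = true
	case Delete:
		delete(networkMap, flow)
	}
	return nil
}

func main() {
	networkMap := map[string]bool{}
	_ = Apply(networkMap, "udp_sendmsg", "10.0.0.1:5353/udp")
	fmt.Println(len(networkMap)) // one tracked flow
	_ = Apply(networkMap, "udp_destroy_sock", "10.0.0.1:5353/udp")
	fmt.Println(len(networkMap)) // flow removed again
}
```

Having many unrelated hook points feed the same map is what makes the entry lifecycle hard to reason about, which is part of the motivation for the redesign.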

The current mechanism doesn't keep the cgroup and/or network namespace for each mapped flow, which makes conflicts between different namespaces difficult or impossible to avoid. Example: if one socket is bound to 0.0.0.0:8080 in one netns and another socket is bound to 0.0.0.0:8080 in another netns, it is impossible to tell which socket an ingress packet belongs to.

There is also another problem. Currently, all network events and captures are made with 2 TC programs attached to a specific network interface. Unfortunately, this is suboptimal, since the interface being captured might not have the correct addresses (network address translation might be in place for the workload of the task being traced).

The same applies to network-derived events such as:

  • net_packet
  • dns_request
  • dns_response

There are many open issues that might be related to this change, such as:

  1. [BUG] Entire clsact qdisc is purged when tc hook is destroyed https://github.com/aquasecurity/tracee/issues/1828
  2. [FEAT] network events should also be enriched https://github.com/aquasecurity/tracee/issues/1922
  3. [FEAT] Add MPTCP support https://github.com/aquasecurity/tracee/issues/2068
  4. [RFE] pcap capturing options https://github.com/aquasecurity/tracee/issues/2096
  5. [BUG] container folder is not getting created under /tmp/tracee https://github.com/aquasecurity/tracee/issues/2126

With that, the perfect solution for network events and capturing features should include, but not be limited to, the following:

  • be capable of relating egress/ingress packets to specific host tasks
  • not rely on specific network interfaces for attachments
  • be close to processes (after mangling and translations)
  • allow near-future flow enforcement through eBPF
  • work on older kernels (not only on recent ones)

rafaeldtinoco, Sep 06 '22 17:09