tetragon icon indicating copy to clipboard operation
tetragon copied to clipboard

kprobes: support having the same argument twice

Open kkourt opened this issue 7 months ago • 15 comments

If we write a policy with the same argument twice,

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "sys-lseek"
spec:
  kprobes:
  - call: "sys_lseek"
    syscall: true
    args:
    - index: 0
      type: "int"
      label: "index 0"
    - index: 0
      type: "int"
      label: "index 0 (again)"

The agent will accept it, and then produce a wrong event. For example, if we execute:

echo "100 100 100" | ./contrib/tester-progs/lseek-pipe

We will get:

[
  {
    "int_arg": 100,
    "label": "index 0"
  },
  {
    "int_arg": 0,
    "label": "index 0 (again)"
  }
]

And a warning in the logs:

time="2025-05-07T13:46:20+02:00" level=warning msg="Int type error" arg.usertype= error=EOF

Note that accessing the same argument twice is useful when used with "resolve:". For example:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "lsm"
spec:
  kprobes:
  - call: "security_bprm_check"
    syscall: false
    args:
      - index: 0
        type: "string"
        resolve: "mm.owner.comm"
        label: "proc"
      - index: 0
        type: "string"
        resolve: "mm.owner.real_parent.comm"
        label: "parent"
      - index: 0
        type: "string"
        resolve: "mm.owner.real_parent.real_parent.comm"
        label: "grand-parent"
    selectors:
      - matchActions:
        - action: Post

I'm marking this as a bug because we provide the wrong data to the user and we should (at minimum) reject the policy. The best solution, however, is to support multiple arguments on the same index (which, arguably, is a new feature).

See also: https://github.com/cilium/tetragon/issues/3710

kkourt avatar May 07 '25 11:05 kkourt

yes, with the resolve it makes sense now to allow to retrieve multiple argument values for single argument, which is new feature

but current code should fail for such spec now, will fix, thanks

olsajiri avatar May 07 '25 11:05 olsajiri

Hello ! Super glad this is part of the roadmap. I've been working on this feature few months ago, but it stayed as a WIP and was not ready for upstream. The final policy look like this. I acknowledge that "duplicateIndex" is not very user-friendly, but it allows an easier reuse in the matchArgs.

@olsajiri If you have not started development on this yet, I'll be happy to continue pushing it.

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "deny-flag-access"
spec:
  lsmhooks:
  - hook: "file_open"
    args:
    - index: 0
      type: "file"
    - index: 1
      type: "int"
      duplicateIndex: 0
      resolve: "f_mode"
    selectors:
    - matchActions:
      - action: Post
      - action: Override
        argError: -1
      matchArgs:
      - index: 0
        operator: "Postfix"
        values:
        - "flag.txt"
      - index: 1
        operator: "Mask"
        values:
          - "1" #FMODE_READ
          - "2" #FMODE_WRITE

tdaudi avatar May 07 '25 13:05 tdaudi

Hello ! Super glad this is part of the roadmap. I've been working on this feature few months ago, but it stayed as a WIP and was not ready for upstream. The final policy look like this. I acknowledge that "duplicateIndex" is not very user-friendly, but it allows an easier reuse in the matchArgs.

Cool!

I think the first thing we need to figure out is what the "index: " means. Does it mean the index in the function or does it mean the index in the arguments we attach to the events. See also https://github.com/cilium/tetragon/issues/3710.

I would argue that index (in the args:) should mean the index in the function, and the index in the arguments we attach should just be implicit by the order of the arguments we have. In that case, it should be possible to have the same argument twice.

kkourt avatar May 07 '25 14:05 kkourt

I would argue that index (in the args:) should mean the index in the function, and the index in the arguments we attach should just be implicit by the order of the arguments we have. In that case, it should be possible to have the same argument twice.

Yes, I like this idea. To do so, we can use the BTF file to build the first index, and use label to build an matchArg index implicitly.

The downside is that the resolve change is breaking change

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "lsm"
spec:
  kprobes:
  - call: "security_bprm_check"
    syscall: false
    args:
    - resolve: "bprm.mm.owner.comm"
      type: "string"
      label: "proc"
    - resolve: "bprm.mm.owner.real_parent.comm"
      type: "string"
      label: "parent"
    selectors:
      - matchActions:
        - action: Post
        matchArgs:
         - label: "proc"
           operator: "Equal"
           values:
           - "some process"
         - label: "parent"
           operator: "Equal"
           values:
           - "some parent"

To avoid multiple changes, we can use the policy args array index (proc label would be 0, parent would be 1, ...) as the actual index (from 1 to 5) to reference the label and "hide" this complexity to the user. And on bpf side, we add a new param to the event_config like this __u8 arg_idx[EVENT_CONFIG_MAX_ARG];. So Tetragon workflow will not change and iterate from 1 to 5, and we can easily compare things afterward.

tdaudi avatar May 07 '25 16:05 tdaudi

@olsajiri If you have not started development on this yet, I'll be happy to continue pushing it.

no problem with that, go ahead ;-)

I took the issue because I'd like to push some speed up changes I have in the queue for the argument processing and wanted to avoid conflicts.. would be great if you could wait for next week and base on my changes, but if not we'll figure it out anyway

thanks

olsajiri avatar May 07 '25 19:05 olsajiri

    - resolve: "bprm.mm.owner.real_parent.comm"
      type: "string"
      label: "parent"

I believe we still need the index to specify which of the function arguments we are referring to. Or am I missing something?

kkourt avatar May 08 '25 06:05 kkourt

We can get the argument index from the btf.FuncParam type. In the code below, we specify the index to get the BTF type, but we simply need to change this a bit and loop over all params that matches bprm and get the index.

https://github.com/cilium/tetragon/blob/0921cdf27bda30e19b1df94df339b80a2948be94/pkg/btf/btf.go#L133

The only thing I did not figure out yet is how I identify the label in args and bind this implicit index to use it next on the bpf side for matchArg

tdaudi avatar May 08 '25 08:05 tdaudi

We can get the argument index from the btf.FuncParam type. In the code below, we specify the index to get the BTF type, but we simply need to change this a bit and loop over all params that matches bprm and get the index.

tetragon/pkg/btf/btf.go

Line 133 in 0921cdf return &btfHookProto.Params[argIndex], nil

The only thing I did not figure out yet is how I identify the label in args and bind this implicit index to use it next on the bpf side for matchArg

I'm not sure if the type is enough to identify the argument. There could be two arguments with the same type no?

kkourt avatar May 08 '25 10:05 kkourt

I was more expecting to use the parameter name to identify the type because there can only have a single parameter with the same name.

int security_bprm_check(struct linux_binprm *bprm)

So in this case, the user will identify struct linux_binprm type by using bprm.

FuncParam have an attribute Name which is equal to bprm and an attribute Type which points to the btf type of struct linux_binprm https://pkg.go.dev/github.com/cilium/ebpf/internal/btf#FuncParam


But maybe the policy should look like this.

    - param: "bprm"
      resolve: "mm.owner.real_parent.comm"
      type: "string"
      label: "parent"

And the major downside would be that we should deprecate LSM usage for kernel < 5.4 because the hooks does not exist in the BTF file at this point. Or maybe we should ask the user to specify the index explicitly, in this case

tdaudi avatar May 08 '25 10:05 tdaudi

I was more expecting to use the parameter name to identify the type because there can only have a single parameter with the same name.

Ah I see. A potential problem here is that the parameter name can change across different kernel versions, and this btf files. That being said, I think it would be defintely useful to have a way to refer to function parameters by name.

I also see this somewhat orthogonal in supporting argument entries that refer to the same argument.

I think there are two different questions:

  • how do we refer to an argument in the elements of the args: list?
  • how do we refer to an argument in the matchArgs: section

In other words, in the args: section we need to refer to the function arguments, and in the selector we need to refer to the selector section arguments.

For example, we could refer to function arguments by index, and to selector section arguments by label:

    args:
      - index: 0             # first argument from the function
        type: "string"
        resolve: "mm.owner.comm"
        label: "proc"
      - index: 0             # first argument from the function
        type: "string"
        resolve: "mm.owner.real_parent.comm"
        label: "parent"
    selectors:
      - matchActions:
        - action: Post
        matchArgs:
         - label: "proc"         # match the argument from args: with label "proc"
           operator: "Equal"
           values:
           - "some process"
         - label: "parent"     # match the argument from args: with label "parent'
           operator: "Equal"
           values:
           - "some parent"

We could also refer to function arguments by index, and to selector section arguments by index:

    args:
      - index: 0             # first argument from the function
        type: "string"
        resolve: "mm.owner.comm"
        label: "proc"
      - index: 0             # first argument from the function
        type: "string"
        resolve: "mm.owner.real_parent.comm"
        label: "parent"
    selectors:
      - matchActions:
        - action: Post
        matchArgs:
         - index: 0         # first argument in the args section (proc)
           operator: "Equal"
           values:
           - "some process"
         - index: 1          # second argument in the args section (parent)
           operator: "Equal"
           values:
           - "some parent"

Since we already have args[].index and matchArgs.index, the above seems like a sane way to interpret the values that we have now in the spec. I'm not sure it breaks current behaviour, but I think it's worth considering as an option. Note, that this option does not exclude adding a way to refer to function parameters by parameter name in args:, and a way to refer to args: arguments by label in the matchArgs section.

kkourt avatar May 08 '25 11:05 kkourt

I was more expecting to use the parameter name to identify the type because there can only have a single parameter with the same name.

Ah I see. A potential problem here is that the parameter name can change across different kernel versions, and this btf files. That being said, I think it would be defintely useful to have a way to refer to function parameters by name.

I agree, but the resolve flag already comes with this kind of issue. For example, in struct dentry the attribute d_iname was removed between 6.13.13 and 6.14-rc1. That said, users who are familiar with kernel code and write policies based on it would probably expect to revisit those policies when updating the kernel. For instance, if a kprobe is renamed or removed, or if parameter indexes change. Maybe it could still be helpful to keep the args: index and let users choose whether they prefer referring to parameters by name or by position?

I also see this somewhat orthogonal in supporting argument entries that refer to the same argument.

I think there are two different questions:

* how do we refer to an argument in the elements of the `args:` list?

* how do we refer to an argument in the `matchArgs:` section

In other words, in the args: section we need to refer to the function arguments, and in the selector we need to refer to the selector section arguments.

[...] Since we already have args[].index and matchArgs.index, the above seems like a sane way to interpret the values that we have now in the spec. I'm not sure it breaks current behaviour, but I think it's worth considering as an option. Note, that this option does not exclude adding a way to refer to function parameters by parameter name in args:, and a way to refer to args: arguments by label in the matchArgs section.

The second option feels a bit confusing to me, as it requires users to distinguish between the parameter index and the position of the variable they defined in the args: list. Using a label seems more intuitive and makes the policy easier to read and reason about.

tdaudi avatar May 08 '25 12:05 tdaudi

Maybe it could still be helpful to keep the args: index and let users choose whether they prefer referring to parameters by name or by position?

Yap, I agree (and I guess this was where I was getting). Especially since we already have the index field there. There is no reason why we cannot support both.

Since we already have args[].index and matchArgs.index, the above seems like a sane way to interpret the values that we have now in the spec. I'm not sure it breaks current behaviour, but I think it's worth considering as an option. Note, that this option does not exclude adding a way to refer to function parameters by parameter name in args:, and a way to refer to args: arguments by label in the matchArgs section.

The second option feels a bit confusing to me, as it requires users to distinguish between the parameter index and the position of the variable they defined in the args: list. Using a label seems more intuitive and makes the policy easier to read and reason about.

The labels are easier to understand -- no arguments there. (Although, I'm not a huge fan the label name, but that's a different discussion).

But we already have the index field and we still need to have old policies keep working. Given that, having a way to reference arguments by index (in addition to referencing them by labels) seems worthwhile to me.

kkourt avatar May 08 '25 13:05 kkourt

We discussed this topic in yesterday's community call (May 11th). The general plan is as follows:

  1. start with a PR from @olsajiri that refactors the way arguments are accessed in the generic tracing code (e.g., generic kprobe arguments) to make the necessary changes easier.
  2. next, implement sane semantics using the index: fields. In args the index will refer to the function arguments, while in matchArgs the index will refer to what is specified in args.
  3. add support for referring to function arguments via parameter names, and adding labels to args so that we can refer them to matchArgs section without an index.

Here's an example to illustrate what we want:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "example"
spec:
  kprobes:
    # int security_capable(
    #   const struct cred *cred,   # fnarg0 
    #	struct user_namespace *ns, # fnarg1
    #	int cap,                   # fnarg2
    #	unsigned int opts)         # fnarg3
  - call: "security_capable"
    syscall: false
    args:
      # arg0 (cap)
      - index: 2 # cap
        type:  uint32
      # arg1 (cred->uid)
      - index: 0 # cred
        type: uint32
	resolve: "uid"
      # arg2 (cred->gid)
      - index: 0 # cred
        type: uint32
	resolve: "gid"
    selectors:
      - matchArgs:
        - index: 2  # match against arg2, cred->gid
	  opertor: "Equal"
	  values:
	  - "100"
        - index: 1  # match against arg1, cred->uid
	  opertor: "Equal"
	  values:
	  - "200"
        - index: 0  # match against arg0, cap
	  opertor: "Equal"
	  values:
	  - "39"

The example is intentionally contrived for clarity. From the bpf side, there are two parts:

  1. extracting the arguments:
     event_config->arg0[0] /* ->arg0 */  = cap
     event_config->arg0[1] /* ->arg1 */  = cred->uid
     event_config->arg0[2] /* ->arg2 */  = cred->gid

Above also defines the order of the arguments in the event.

  1. filter based on the arguments.
    1st filter: index=2: event_config->arg0[2] == 100
    2nd filter: index=1: event_config->arg0[1] == 200
    3nd filter: index=0: event_config->arg0[0] == 39

@olsajiri @tdaudi, I hope above captures the idea sufficiently. Let me know if you have any comments.

kkourt avatar May 13 '25 08:05 kkourt

@olsajiri @tdaudi I feel like we are discussing this in multiple places now (https://github.com/cilium/tetragon/pull/3725#discussion_r2087311836, https://github.com/cilium/tetragon/issues/3734#issuecomment-2878952880), so I think it makes sense to take a step back and see what the options are.

What the current kprobe code seems to do is set ->aX to the x argument of the function. That is, ->a0 points to the first argument of the function, e.g., https://github.com/cilium/tetragon/blob/c691fd53b8f932dec9e496b6191604445c9f1128/bpf/process/generic_calls.h#L232

Then, after we've done process filtering (e.g., matchPID, matchBinary) for every argument (max 5), we read the argument and then apply filtering in function order. As pointed out in https://github.com/cilium/tetragon/issues/3710, one of the issues of the current code is that because arguments are processed in function order, but we apply the arguments labels in spec order, the labels and the arguments are mismatched.

Assuming my understanding is correct, I think we need to answer two questions:

  • How do we order the arguments in the events? That is, are the arguments in ProcessKprobe ordered by spec order (order of arguments as defined in the spec) or function order (order of arguments as defined in the function).
  • How do we process the arguments in bpf? Do we process them in function order as we do now, or process them in spec order (e.g., @tdaudi suggests that ->a0 points to the first argument in the spec and not the first argument in the function: https://github.com/cilium/tetragon/issues/3734#issuecomment-2878952880)

I think either option is fine as long as:

  • Our results can be correctly decoded (https://github.com/cilium/tetragon/issues/3710)
  • We can support arguments with the same index (as discussed in this issue)
  • We can optimize things in the future where we do not do any unnecessary processing

FWIW, my opinion is to use the spec order in how we iterate arguments in bpf (and in the events). This is more intuitive to me, and it can potentially allow users to define a spec order that optimizes their use-case. I also think that's how I implemented the generic tracepoints.

kkourt avatar May 14 '25 08:05 kkourt

How do we order the arguments in the events? That is, are the arguments in ProcessKprobe ordered by spec order (order of arguments as defined in the spec) or function order (order of arguments as defined in the function).

my plan was to have spec order, so you'll have them in the event the way you specify them in the spec

How do we process the arguments in bpf? Do we process them in function order as we do now, or process them in spec order (e.g., @tdaudi suggests that ->a0 points to the first argument in the spec and not the first argument in the function: https://github.com/cilium/tetragon/issues/3734#issuecomment-2878952880)

that's what I intended to fix.. at the moment bpf code does tail call to process each function argument starting from 0 to 5, what I planned was to add 'index' to tracingapi.EventConfig to denote argument, and read only configured arguments, this way we omit the 'nop' tail calls that just spend cycles

olsajiri avatar May 14 '25 11:05 olsajiri

I believe this is fixed so closing.

Relevant PRs:

  • https://github.com/cilium/tetragon/pull/3725
  • https://github.com/cilium/tetragon/pull/3730

kkourt avatar Jul 30 '25 15:07 kkourt