Data filter in kernel

Open rscampos opened this issue 1 year ago • 4 comments

1. Explain what the PR does

e49145383 Tracee kernel data filter test
9aa597298 Tracee data filter equalities
ddb456d51 eBPF data filter (user-space)
606e25148 Enable data filter in eBPF program
a5d4e619c eBPF data filter (kernel-space)
f2928bffd Enable BPF_F_NO_PREALLOC for LPM TRIE

e49145383 Tracee kernel data filter test

- Add MatchTypes{} in cmp.AllowUnexported

9aa597298 Tracee data filter equalities

- method equalities created for data filter;
- handle corner case when one policy uses a substring (path) of another
  policy;
- disable data filter (only pathname) for selected events.

ddb456d51 eBPF data filter (user-space)

- eBPF map definitions for exact, prefix, and suffix match;
- create updateDataFilterLPMBPF and updateDataFilterBPF to populate the eBPF
  maps;
- config map fields for exact, prefix, and suffix match.

606e25148 Enable data filter in eBPF program

- enable the data filter in the eBPF program through the function
  evaluate_data_filters.

a5d4e619c eBPF data filter (kernel-space)

- add function load_str_from_buf to retrieve a string value based on its index;
- add function reverse_string to reverse the pathname so that suffix matching can be done;
- add functions evaluate_data_filters/match_data_filters to apply exact, prefix, and suffix match;
- add eBPF maps for exact, prefix, and suffix match, plus an eBPF map to hold the temporary LPM trie key;
- add fields in config_map for exact, prefix, and suffix match;
- save the offset at the specified index in the function save_str_to_buf.
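
The role of reverse_string can be illustrated with a small user-space sketch. It is written in Python purely for illustration (the PR itself is eBPF C and Go), and the helper names `reverse_path` and `has_suffix_via_reversed_prefix` are hypothetical, not Tracee functions. The idea is that reversing both the configured suffix and the looked-up pathname turns a suffix comparison into a prefix comparison, which is the kind of question an LPM trie can answer.

```python
def reverse_path(path: str) -> str:
    """Return the path with its characters in reverse order."""
    return path[::-1]


def has_suffix_via_reversed_prefix(path: str, suffix: str) -> bool:
    """Check a suffix by testing a prefix on the reversed strings."""
    return reverse_path(path).startswith(reverse_path(suffix))


if __name__ == "__main__":
    # A "netconfig" suffix filter would be stored in reversed form.
    print(reverse_path("netconfig"))
    print(has_suffix_via_reversed_prefix("/etc/netconfig", "netconfig"))
    print(has_suffix_via_reversed_prefix("/etc/networks", "netconfig"))
```

In the kernel the comparison is done by the LPM trie lookup rather than by `startswith`; the sketch only shows why the reversal is needed.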

2. Explain how to test it

The method for defining data filters in Tracee remains the same. However, for the security_file_open and magic_write events, if the pathname is used as a filter, the event is now filtered at the eBPF data plane, preventing it from being sent to user space for filtering.

Notes for the reviewer: The following sections contain commands I used to test with policies. The results for each test group are also included. Both the policies and results are located in the zip file provided in each section.

  • Only exactly match
  • Only prefix match
  • Only suffix match
  • Mixed (exactly/prefix/suffix) match
  • Ensuring Multiple Policy Matches when LPM Trie is used - Prefix
  • Ensuring Multiple Policy Matches when LPM Trie is used - Suffix

Only exactly match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-exactly-1.yaml -p examples/policies/sfo-exactly-2.yaml -p examples/policies/sfo-exactly-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 7,
"prefix_enabled_data_filters": 0,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 4,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 0,
"enabled_data_filters": 7,
...
## dump data_filter_exactly
% sudo bpftool map dump id 24016
[{
        "key": {
            "event_id": 732,
            "path": "/etc/networks"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 6
        }
    },{
        "key": {
            "event_id": 732,
            "path": "/etc/netconfig"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    }
]

Cmds

The results of each of the following lines are in the JSON file (results_exactly.json):

% more /etc/netconfig # json line 1 (sfo-exactly-match-1)
% more /etc/networks # json line 2 (sfo-exactly-match-2)
% cat /etc/networks # json line 3,4,5 (sfo-exactly-match-3)

exactly_policies_results.zip

Only prefix match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-prefix-1.yaml -p examples/policies/sfo-prefix-2.yaml -p examples/policies/sfo-prefix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 7,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 4,
"suffix_out_data_filters": 0,
"enabled_data_filters": 7,
...
## dump data_filter_prefix
% sudo bpftool map dump id 24583
[{
        "key": {
            "prefix_len": 128,
            "event_id": 732,
            "path": "/etc/network"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 5
        }
    },{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "/etc/pass"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]
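
The `prefix_len` values in the dump above are consistent with a prefix length counted in bits that covers a 4-byte `event_id` followed by the path bytes. The sketch below is only a sanity check of that arithmetic, in Python; the 4-byte width of `event_id` and the helper name `lpm_prefix_len_bits` are assumptions inferred from the numbers, not taken from the Tracee source.

```python
EVENT_ID_BYTES = 4  # assumed width of the event_id field in the key


def lpm_prefix_len_bits(path: str) -> int:
    """Prefix length in bits: the event_id bytes plus the path bytes."""
    return 8 * (EVENT_ID_BYTES + len(path.encode("utf-8")))


if __name__ == "__main__":
    for path in ("/etc/network", "/etc/pass"):
        print(path, lpm_prefix_len_bits(path))
```

With these assumptions, "/etc/network" (12 bytes) gives 128 and "/etc/pass" (9 bytes) gives 104, matching the two keys in the dump.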

Cmds

The results of each of the following lines are in the JSON file (results_prefix.json):

% more /etc/networks # json line 1 (sfo-prefix-match-1)
% sudo cp /etc/networks /etc/networks.bkp; more /etc/networks.bkp # json line 2 (sfo-prefix-match-1)
% more /etc/passwd # json line 3 (sfo-prefix-match-2)
% cat /etc/networks # json line 4,5,6 (sfo-prefix-match-3)

prefix_policies_results.zip

Only suffix match

Tracee

sudo ./dist/tracee -p examples/policies/sfo-suffix-1.yaml -p examples/policies/sfo-suffix-2.yaml -p examples/policies/sfo-suffix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 0,
"suffix_enabled_data_filters": 7,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 4,
"enabled_data_filters": 7,
...
## dump data_filter_suffix
% sudo bpftool map dump id 24583
[{
        "key": {
            "prefix_len": 80,
            "event_id": 732,
            "path": "dwssap"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    },{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 5
        }
    }
]

Cmds

The results of each of the following lines are in the JSON file (results_suffix.json):

% more /etc/netconfig # json line 1 (sfo-suffix-match-1)
% cp /etc/netconfig /tmp/netconfig; more /tmp/netconfig # json line 2 (sfo-suffix-match-1)
% more /etc/passwd # json line 3 (sfo-suffix-match-2)
% cat /etc/netconfig # json line 4,5,6 (sfo-suffix-match-3)

suffix_policies_results.zip

Mixed (exactly/prefix/suffix) match

In this section, you can see all string matches working together. The command cat /etc/netconfig triggers three policies simultaneously, while the command cat /etc/host.conf triggers two policies.

Tracee

sudo ./dist/tracee -p examples/policies/sfo-exactly-5.yaml -p examples/policies/sfo-prefix-4.yaml -p examples/policies/sfo-suffix-4.yaml -p examples/policies/sfo-suffix-5.yaml -p examples/policies/sfo-prefix-5.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 1,
"prefix_enabled_data_filters": 18,
"suffix_enabled_data_filters": 12,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 0,
"enabled_data_filters": 31,
...
## dump exactly
[{
        "key": {
            "event_id": 732,
            "path": "/etc/netconfig"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    }
]

## dump prefix
[{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "/etc/host"
        },
        "value": {
            "equal_in_scopes": 16,
            "equality_set_in_scopes": 16
        }
    },{
        "key": {
            "prefix_len": 96,
            "event_id": 732,
            "path": "/etc/net"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]

## dump suffix
[{
        "key": {
            "prefix_len": 72,
            "event_id": 732,
            "path": "fnoc."
        },
        "value": {
            "equal_in_scopes": 8,
            "equality_set_in_scopes": 8
        }
    },{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 4,
            "equality_set_in_scopes": 4
        }
    }
]

Cmds

The results of each of the following lines are in the JSON file (results_mixed.json):

% cat /etc/network/fan # json line 1 (sfo-prefix-match-4)
% cp /etc/netconfig /tmp/netconfig; cat /tmp/netconfig # json line 2 (sfo-suffix-match-4)
% cat /etc/netconfig # json line 3 (sfo-exactly-match-5,sfo-prefix-match-4,sfo-suffix-match-4)
% cat /etc/host.conf # json line 4 (sfo-suffix-match-5,sfo-prefix-match-5)
% cat /etc/sysctl.conf # json line 5 (sfo-suffix-match-5)

mixed_policies_results.zip
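
As a rough cross-check of the mixed case, the sketch below replays the lookups in Python using the `equal_in_scopes` values from the three dumps above. It assumes that the results of the exact, prefix, and suffix lookups are OR-ed together, and that bit i corresponds to the (i+1)-th `-p` argument; both are inferred from the dumps and the listed results. The function names are hypothetical, and the suffix keys are written un-reversed for readability. Since every `*_out_data_filters` field is 0 in this run, "not equal" handling is left out.

```python
# equal_in_scopes values copied from the dumps above.
EXACT = {"/etc/netconfig": 0b00001}
PREFIX = {"/etc/host": 0b10000, "/etc/net": 0b00010}
SUFFIX = {".conf": 0b01000, "netconfig": 0b00100}  # shown un-reversed


def longest_match(table, path, is_match):
    """Return the bitmap of the longest key that matches, or 0 if none does."""
    best = max((k for k in table if is_match(path, k)), key=len, default=None)
    return table[best] if best is not None else 0


def matched_policies(path):
    """OR together the exact, longest-prefix, and longest-suffix results."""
    exact = EXACT.get(path, 0)
    prefix = longest_match(PREFIX, path, str.startswith)
    suffix = longest_match(SUFFIX, path, str.endswith)
    return exact | prefix | suffix


if __name__ == "__main__":
    for p in ("/etc/netconfig", "/etc/host.conf", "/etc/sysctl.conf"):
        print(p, bin(matched_policies(p)))
```

Under these assumptions /etc/netconfig yields three set bits and /etc/host.conf yields two, in line with the three and two policies reported for those commands.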

Ensuring Multiple Policy Matches when LPM Trie is used - Prefix

Corner case description: When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2.

A potential solution (implemented): If one suffix or prefix overlaps with another, we can simply combine their bitmaps in user space. No additional logic is required in kernel space to handle this corner case.
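
A minimal sketch of that user-space combination, in Python for illustration only: the function `combine_overlapping` and its tuple layout are hypothetical, not the code in this PR. Each key keeps its own bits and additionally inherits the bits of every shorter key that is a prefix of it, so the longest match already carries the result of the shorter ones.

```python
def combine_overlapping(filters):
    """filters: iterable of (policy_bit, prefix, is_equal) tuples.

    Returns {prefix: (equal_in_scopes, equality_set_in_scopes)}, where each
    key also carries the bits of every other key that is a prefix of it.
    """
    own = {}
    for bit, prefix, is_equal in filters:
        equal, used = own.get(prefix, (0, 0))
        own[prefix] = (equal | (bit if is_equal else 0), used | bit)

    combined = {}
    for key, (equal, used) in own.items():
        for other, (o_equal, o_used) in own.items():
            if other != key and key.startswith(other):
                equal |= o_equal  # inherit "equal" bits from the shorter key
                used |= o_used    # inherit "filter is set" bits as well
        combined[key] = (equal, used)
    return combined


if __name__ == "__main__":
    # policy1 covers /etc/net*, policy2 covers /etc/netconf*
    table = combine_overlapping([(1, "/etc/net", True), (2, "/etc/netconf", True)])
    print(table)
```

With the two policies from the description, the key /etc/netconf ends up with both bits set, so a longest-prefix lookup for /etc/netconfig reports policy1 and policy2 without extra kernel logic. For suffixes the same idea would apply to the reversed strings.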

Tracee

sudo ./dist/tracee -p examples/policies/cc-sfo-prefix-1.yaml -p examples/policies/cc-sfo-prefix-2.yaml -p examples/policies/cc-sfo-prefix-3.yaml -p examples/policies/cc-sfo-prefix-4.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 15,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 8,
"suffix_out_data_filters": 0,
"enabled_data_filters": 15,
...
## dump prefix
[{
        "key": {
            "prefix_len": 128,
            "event_id": 732,
            "path": "/etc/netconf"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 11
        }
    },{
        "key": {
            "prefix_len": 128,
            "event_id": 732,
            "path": "/etc/network"
        },
        "value": {
            "equal_in_scopes": 7,
            "equality_set_in_scopes": 7
        }
    },{
        "key": {
            "prefix_len": 96,
            "event_id": 732,
            "path": "/etc/net"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 3
        }
    },{
        "key": {
            "prefix_len": 80,
            "event_id": 732,
            "path": "/etc/n"
        },
        "value": {
            "equal_in_scopes": 1,
            "equality_set_in_scopes": 1
        }
    },{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "/usr/lib/"
        },
        "value": {
            "equal_in_scopes": 0,
            "equality_set_in_scopes": 8
        }
    }
]

Note: Policy 4 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.

Policy 2 (prefix /etc/net) overlaps with Policy 1 (prefix /etc/n), as /etc/n is a substring of /etc/net. This is why the key with the path "/etc/net" has equality_set_in_scopes=3, indicating that both Policy 1 and Policy 2 are part of the same equality set. Additionally, equal_in_scopes=3 shows that Policy 1 and Policy 2 are considered equal in their scopes.

Policy 3 (prefix /etc/network) encompasses both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Consequently, the key with the path "/etc/network" has equality_set_in_scopes=7, which signifies that all three policies are present within the same scope. Similarly, equal_in_scopes=7 indicates that Policy 1, Policy 2, and Policy 3 are equal in scopes.

Policy 4 (prefix /etc/netconf) also includes both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Therefore, the key with the path "/etc/netconf" has equality_set_in_scopes=11, which means that Policy 1, Policy 2, and Policy 4 are all part of the same scope. However, because Policy 4 was defined with the condition data.pathname!=/etc/netconf, equal_in_scopes=3, meaning that only Policy 1 and Policy 2 are considered equal in scopes, while Policy 4 is excluded from that equality.

In summary, Policy 2, Policy 3, and Policy 4 derive bits from other policies, reflecting their interdependencies and overlaps in scope.
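
To make the bitmaps above easier to read, here is a hedged Python sketch that replays a longest-prefix lookup over the dumped keys and decodes the result into policy numbers. The resolution rule used, `equal_in_scopes | (prefix_out_data_filters & ~equality_set_in_scopes)`, is an assumption: it reproduces the results listed for this section, but it has not been checked against the eBPF code. All function names are hypothetical.

```python
PREFIX_OUT = 8  # prefix_out_data_filters from the config_map dump

# key -> (equal_in_scopes, equality_set_in_scopes), copied from the dump
PREFIX_MAP = {
    "/etc/netconf": (3, 11),
    "/etc/network": (7, 7),
    "/etc/net": (3, 3),
    "/etc/n": (1, 1),
    "/usr/lib/": (0, 8),
}


def lookup_longest_prefix(path):
    """Emulate the LPM trie: return the value of the longest matching key."""
    candidates = [k for k in PREFIX_MAP if path.startswith(k)]
    if not candidates:
        return (0, 0)
    return PREFIX_MAP[max(candidates, key=len)]


def matched_policies(path):
    """Return the 1-based policy numbers that match under the assumed rule."""
    equal, used = lookup_longest_prefix(path)
    # Policies with "!=" filters match when the found key does not mention them.
    bitmap = equal | (PREFIX_OUT & ~used)
    return [i + 1 for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


if __name__ == "__main__":
    for p in ("/etc/netconfig", "/etc/networks", "/etc/na"):
        print(p, matched_policies(p))
```

Under this rule /etc/netconfig resolves to Policies 1 and 2, /etc/networks to all four, and /etc/na to Policies 1 and 4.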

Cmds

The results of each of the following lines are in the JSON file (results_corner_case_prefix.json):

% more /etc/netconfig # json line 1 (cc-sfo-prefix-match-1; cc-sfo-prefix-match-2)
% more /etc/networks # json line 2 (cc-sfo-prefix-match-2; cc-sfo-prefix-match-3; cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)
% sudo cp /etc/netconfig /etc/na; more /etc/na # json line 3 (cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)

cc_prefix_policies_results.zip

Ensuring Multiple Policy Matches when LPM Trie is used - Suffix

Tracee

sudo ./dist/tracee -p examples/policies/cc-sfo-suffix-1.yaml -p examples/policies/cc-sfo-suffix-2.yaml -p examples/policies/cc-sfo-suffix-3.yaml -o json

Maps

% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 4,
"prefix_enabled_data_filters": 4,
"suffix_enabled_data_filters": 7,
"exactly_out_data_filters": 4,
"prefix_out_data_filters": 4,
"suffix_out_data_filters": 4,
"enabled_data_filters": 7,
...
## dump suffix
[{
        "key": {
            "prefix_len": 136,
            "event_id": 732,
            "path": "gifnocten/cte"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 7
        }
    },{
        "key": {
            "prefix_len": 104,
            "event_id": 732,
            "path": "gifnocten"
        },
        "value": {
            "equal_in_scopes": 3,
            "equality_set_in_scopes": 3
        }
    },{
        "key": {
            "prefix_len": 80,
            "event_id": 732,
            "path": "gifnoc"
        },
        "value": {
            "equal_in_scopes": 2,
            "equality_set_in_scopes": 2
        }
    }
]

Note: Policy 3 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.

Policy 1 (suffix netconfig) overlaps with Policy 2 (suffix config), as config is a substring of netconfig. This is why the key with the path "gifnocten" (the reversed form of netconfig) has equality_set_in_scopes=3, indicating that both Policy 1 and Policy 2 are contained within the same scope. Likewise, equal_in_scopes=3 shows that Policy 1 and Policy 2 are equal in scopes.

Policy 2 (suffix config) only contains the bitmap of Policy 2 (equality_set_in_scopes=2 and equal_in_scopes=2).

Policy 3 (suffix etc/netconfig) contains both Policy 1 (suffix netconfig) and Policy 2 (suffix config). Therefore, the key with the path "gifnocten/cte" (representing the reverse of etc/netconfig) has equality_set_in_scopes=7. This value indicates that all three policies are present within the scope. However, equal_in_scopes=3 shows that only Policy 1 and Policy 2 are equal in scopes, whereas Policy 3 is disabled in this scope because it was defined using data.pathname!=etc/netconfig.

Cmds

The results of each of the following lines are in the JSON file (results_corner_case_suffix.json):

% more /etc/netconfig # json line 1 (cc-sfo-suffix-match-1; cc-sfo-suffix-match-2)
% more /etc/ssh/ssh_config # json line 2 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)
% cp /etc/netconfig /tmp/netaconfig; more /tmp/netaconfig # json line 3 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)

cc_suffix_policies_results.zip

3. Other comments

TODO

First Phase:

  • [x] Evaluate and integrate all three types of string-based filters simultaneously. Currently, they operate independently and need to be combined.
  • [x] Complete the implementation and testing of filters for exact, prefix, and suffix matches, including both equal and not equal for each of these three filters.
  • [x] In the function save_str_to_buf(), add the argument offset based on its index to facilitate direct access for the load_str_from_buf() function.
  • [x] Document the testing steps for exact match, prefix, and suffix filters. Include policies and expected results for the review process.
  • [x] When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.
  • [x] Document the testing steps for corner case (prefix and suffix) filters. Include policies and expected results for the review process.
  • [ ] Improve the method for defining (in user-space) which events should have an in-kernel filter enabled. The current logic was added as a proof of concept and requires rework.
  • [ ] Measure performance between filter in user-space (older version) and filter in kernel-space (new version).

Second Phase:

  • Currently, the index for retrieving the pathname in evaluate_data_filters is explicitly defined. While this works, it would be better to dynamically retrieve the index based on the event ID for greater flexibility.

rscampos avatar Sep 24 '24 02:09 rscampos

@geyslan I've pushed some changes to how we retrieve the string from args in args_buffer_t. To make it work, I added a field to args_buffer_t and modified the save_str_to_buf function.

rscampos avatar Oct 07 '24 21:10 rscampos

When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.

This is important; it could have been a security vulnerability. If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.

itaysk avatar Oct 12 '24 08:10 itaysk

This is important; it could have been a security vulnerability. If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.

Thank you for commenting on this, @itaysk. If filtering in the kernel isn't possible, I'll definitely try this solution.

rscampos avatar Oct 16 '24 18:10 rscampos

This is important; it could have been a security vulnerability. If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.

Thank you for commenting on this, @itaysk. If filtering in the kernel isn't possible, I'll definitely try this solution.

There is no need to filter in userspace for such a corner case. It is possible to define the result of the longest prefix match to contain the results of the policies with the substrings as well, since the return value of the map is the matched policies (or matched rules in the future).

yanivagman avatar Oct 18 '24 17:10 yanivagman

@yanivagman @geyslan Thank you for all the reviews. I've just pushed some updates based on the discussions in this PR:

  • Create createNewDataFilterMapsVersion in order to create the inner maps based on version and event id. This uses the new approach as discussed with @yanivagman;
    • createNewDataFilterMapsVersion is only used for data filters; in the future there should be a common way to create any version + version id (used by match rules);
  • Because the event id was removed from the key struct, it is now possible to use up to 255 characters for exact, prefix, and suffix;
  • Combine the bitmaps (using an OR operation) when multiple filters are used in the same policy (match_data_filters function);
  • Some fields of some event ids are filtered in the kernel. In those scenarios, defining such a filter in the CLI/policy comes with two restrictions: 1) a maximum of 255 characters for the pathname and 2) the contains option is not allowed (e.g.: security_file_open.data.pathname=*net*);
  • Add unit tests and integration tests.
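
The two restrictions on kernel-filtered fields can be sketched as a small validation step. This is a hypothetical Python illustration, not the Go code added in this PR: the function name, the error messages, and the choice to measure the 255-character limit on the literal part (without the wildcard) are all assumptions.

```python
from typing import Tuple

MAX_KERNEL_FILTER_LEN = 255  # limit stated in the list above


def classify_kernel_path_filter(value: str) -> Tuple[str, str]:
    """Return (match_type, literal) or raise ValueError if unsupported."""
    starts = value.startswith("*")
    ends = value.endswith("*")
    if starts and ends and len(value) > 1:
        # "*net*" style filters (contains) are rejected for in-kernel fields.
        raise ValueError("contains filter is not allowed in kernel: " + value)
    if ends:
        match_type, literal = "prefix", value[:-1]
    elif starts:
        match_type, literal = "suffix", value[1:]
    else:
        match_type, literal = "exact", value
    if len(literal) > MAX_KERNEL_FILTER_LEN:
        raise ValueError("pathname filter longer than 255 characters")
    return match_type, literal


if __name__ == "__main__":
    for v in ("/etc/netconfig", "/etc/net*", "*netconfig"):
        print(v, classify_kernel_path_filter(v))
```

A filter such as `*net*` raises an error in this sketch, mirroring the second restriction.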

rscampos avatar Nov 29 '24 22:11 rscampos

/fast-forward

rscampos avatar Dec 13 '24 20:12 rscampos

Folks @yanivagman @geyslan,

Thank you for all the feedback! I learned a lot of good things during this work!

rscampos avatar Dec 13 '24 20:12 rscampos

Congrats for this amazing new feature! 🚀🥳

geyslan avatar Dec 13 '24 23:12 geyslan