Data filter in kernel
1. Explain what the PR does
e49145383 Tracee kernel data filter test
9aa597298 Tracee data filter equalities
ddb456d51 eBPF data filter (user-space)
606e25148 Enable data filter in eBPF program
a5d4e619c eBPF data filter (kernel-space)
f2928bffd Enable BPF_F_NO_PREALLOC for LPM TRIE
e49145383 Tracee kernel data filter test
- Add MatchTypes{} in cmp.AllowUnexported
9aa597298 Tracee data filter equalities
- method equalities created for data filter;
- handle corner case when one policy uses a substring (path) of another
policy;
- disable data filter (only pathname) for selected events.
ddb456d51 eBPF data filter (user-space)
- eBPF map definition for exactly, prefix, suffix match;
- create updateDataFilterLPMBPF and updateDataFilterBPF to populate eBPF
maps;
- config map fields for exactly, prefix and suffix.
606e25148 Enable data filter in eBPF program
- how to enable data filter in the eBPF program using the function
evaluate_data_filters.
a5d4e619c eBPF data filter (kernel-space)
- function load_str_from_buf created to retrieve str value based on index;
- function reverse_string created to reverse the pathname in order to enable suffix matching;
- function evaluate_data_filters/match_data_filters created to apply: exactly, prefix and suffix match;
- eBPF maps for exactly, prefix and suffix. eBPF map to hold the temporary LPM trie key;
- add fields in config_map for exactly, prefix and suffix match;
- save offset at the specified index in the function save_str_to_buf.
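To illustrate why reversing the pathname enables suffix matching, here is a minimal Go sketch (the kernel code itself is C, and the helper names `reversePath` and `hasSuffixViaPrefix` are hypothetical, not part of this PR). It assumes paths are handled as raw bytes: once both the path and the filter are reversed, a suffix test becomes a prefix test, which is what an LPM trie can answer.

```go
package main

import "fmt"

// reversePath returns the bytes of s in reverse order.
// Treating the path as raw bytes is an assumption of this sketch.
func reversePath(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// hasSuffixViaPrefix reports whether path ends with suffix by comparing
// the reversed strings as prefixes.
func hasSuffixViaPrefix(path, suffix string) bool {
	rp, rs := reversePath(path), reversePath(suffix)
	return len(rp) >= len(rs) && rp[:len(rs)] == rs
}

func main() {
	fmt.Println(reversePath("netconfig"))
	fmt.Println(hasSuffixViaPrefix("/etc/netconfig", "netconfig"))
}
```

The reversed form is also what shows up as the `path` of suffix keys in the map dumps in the testing section.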
2. Explain how to test it
The method for defining data filters in Tracee remains the same. However, for the security_file_open and magic_write events, if the pathname is used as a filter, the event is now filtered at the eBPF data plane, so events that do not match are no longer sent to user space for filtering.
Notes for the reviewer: The following sections contain commands I used to test with policies. The results for each test group are also included. Both the policies and results are located in the zip file provided in each section.
- Only exactly match
- Only prefix match
- Only suffix match
- Mixed (exactly/prefix/suffix) match
- Ensuring Multiple Policy Matches when LPM Trie is used - Prefix
- Ensuring Multiple Policy Matches when LPM Trie is used - Suffix
Only exactly match
Tracee
sudo ./dist/tracee -p examples/policies/sfo-exactly-1.yaml -p examples/policies/sfo-exactly-2.yaml -p examples/policies/sfo-exactly-3.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 7,
"prefix_enabled_data_filters": 0,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 4,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 0,
"enabled_data_filters": 7,
...
## dump data_filter_exactly
% sudo bpftool map dump id 24016
[{
"key": {
"event_id": 732,
"path": "/etc/networks"
},
"value": {
"equal_in_scopes": 2,
"equality_set_in_scopes": 6
}
},{
"key": {
"event_id": 732,
"path": "/etc/netconfig"
},
"value": {
"equal_in_scopes": 1,
"equality_set_in_scopes": 1
}
}
]
Cmds
The results of each of the following lines are in the JSON file (results_exactly.json):
% more /etc/netconfig # json line 1 (sfo-exactly-match-1)
% more /etc/networks # json line 2 (sfo-exactly-match-2)
% cat /etc/networks # json line 3,4,5 (sfo-exactly-match-3)
Only prefix match
Tracee
sudo ./dist/tracee -p examples/policies/sfo-prefix-1.yaml -p examples/policies/sfo-prefix-2.yaml -p examples/policies/sfo-prefix-3.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 7,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 4,
"suffix_out_data_filters": 0,
"enabled_data_filters": 7,
...
## dump data_filter_prefix
% sudo bpftool map dump id 24583
[{
"key": {
"prefix_len": 128,
"event_id": 732,
"path": "/etc/network"
},
"value": {
"equal_in_scopes": 1,
"equality_set_in_scopes": 5
}
},{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "/etc/pass"
},
"value": {
"equal_in_scopes": 2,
"equality_set_in_scopes": 2
}
}
]
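The `prefix_len` values in these dumps are consistent with counting the bits of a 32-bit `event_id` followed by the bytes of the path. The following Go sketch reproduces that arithmetic; the 32-bit width and the helper name `prefixLenBits` are assumptions inferred from the dumped values, not taken from the code.

```go
package main

import "fmt"

// eventIDBits is the assumed width of the event_id field that precedes
// the path inside the trie key.
const eventIDBits = 32

// prefixLenBits returns the number of significant key bits for a path:
// the event_id bits plus 8 bits per path byte.
func prefixLenBits(path string) uint32 {
	return eventIDBits + 8*uint32(len(path))
}

func main() {
	for _, p := range []string{"/etc/network", "/etc/pass"} {
		fmt.Printf("%q prefix_len=%d\n", p, prefixLenBits(p))
	}
}
```

Under that assumption, "/etc/network" (12 bytes) gives 128 and "/etc/pass" (9 bytes) gives 104, matching the dump.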
Cmds
The results of each of the following lines are in the JSON file (results_prefix.json):
% more /etc/networks # json line 1 (sfo-prefix-match-1)
% sudo cp /etc/networks /etc/networks.bkp; more /etc/networks.bkp # json line 2 (sfo-prefix-match-1)
% more /etc/passwd # json line 3 (sfo-prefix-match-2)
% cat /etc/networks # json line 4,5,6 (sfo-prefix-match-3)
Only suffix match
Tracee
sudo ./dist/tracee -p examples/policies/sfo-suffix-1.yaml -p examples/policies/sfo-suffix-2.yaml -p examples/policies/sfo-suffix-3.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 0,
"suffix_enabled_data_filters": 7,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 4,
"enabled_data_filters": 7,
...
## dump data_filter_suffix
% sudo bpftool map dump id 24583
[{
"key": {
"prefix_len": 80,
"event_id": 732,
"path": "dwssap"
},
"value": {
"equal_in_scopes": 2,
"equality_set_in_scopes": 2
}
},{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "gifnocten"
},
"value": {
"equal_in_scopes": 1,
"equality_set_in_scopes": 5
}
}
]
Cmds
The results of each of the following lines are in the JSON file (results_suffix.json):
% more /etc/netconfig # json line 1 (sfo-suffix-match-1)
% cp /etc/netconfig /tmp/netconfig; more /tmp/netconfig # json line 2 (sfo-suffix-match-1)
% more /etc/passwd # json line 3 (sfo-suffix-match-2)
% cat /etc/netconfig # json line 4,5,6 (sfo-suffix-match-3)
Mixed (exactly/prefix/suffix) match
In this section, you can see all string matches working together. The command cat /etc/netconfig triggers three policies simultaneously, while the command cat /etc/host.conf triggers two policies.
Tracee
sudo ./dist/tracee -p examples/policies/sfo-exactly-5.yaml -p examples/policies/sfo-prefix-4.yaml -p examples/policies/sfo-suffix-4.yaml -p examples/policies/sfo-suffix-5.yaml -p examples/policies/sfo-prefix-5.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 1,
"prefix_enabled_data_filters": 18,
"suffix_enabled_data_filters": 12,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 0,
"suffix_out_data_filters": 0,
"enabled_data_filters": 31,
...
## dump exactly
[{
"key": {
"event_id": 732,
"path": "/etc/netconfig"
},
"value": {
"equal_in_scopes": 1,
"equality_set_in_scopes": 1
}
}
]
## dump prefix
[{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "/etc/host"
},
"value": {
"equal_in_scopes": 16,
"equality_set_in_scopes": 16
}
},{
"key": {
"prefix_len": 96,
"event_id": 732,
"path": "/etc/net"
},
"value": {
"equal_in_scopes": 2,
"equality_set_in_scopes": 2
}
}
]
## dump suffix
[{
"key": {
"prefix_len": 72,
"event_id": 732,
"path": "fnoc."
},
"value": {
"equal_in_scopes": 8,
"equality_set_in_scopes": 8
}
},{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "gifnocten"
},
"value": {
"equal_in_scopes": 4,
"equality_set_in_scopes": 4
}
}
]
Cmds
The results of each of the following lines are in the JSON file (results_mixed.json):
% cat /etc/network/fan # json line 1 (sfo-prefix-match-4)
% cp /etc/netconfig /tmp/netconfig; cat /tmp/netconfig # json line 2 (sfo-suffix-match-4)
% cat /etc/netconfig # json line 3 (sfo-exactly-match-5,sfo-prefix-match-4,sfo-suffix-match-4)
% cat /etc/host.conf # json line 4 (sfo-suffix-match-5,sfo-prefix-match-5)
% cat /etc/sysctl.conf # json line 5 (sfo-suffix-match-5)
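As a rough illustration of how the three match types combine in this test, the Go sketch below ORs the `equal_in_scopes` values copied from the exactly, prefix and suffix dumps of this section. The helpers `combine` and `policyPositions` are hypothetical, and reading each bit as the position of a policy on the command line is an assumption of the sketch.

```go
package main

import "fmt"

// combine ORs the policy bitmaps returned by the exact, prefix and
// suffix lookups.
func combine(bitmaps ...uint64) uint64 {
	var out uint64
	for _, b := range bitmaps {
		out |= b
	}
	return out
}

// policyPositions lists the 1-based positions of the bits set in bitmap.
func policyPositions(bitmap uint64) []int {
	var pos []int
	for i := 0; i < 64; i++ {
		if bitmap&(1<<uint(i)) != 0 {
			pos = append(pos, i+1)
		}
	}
	return pos
}

func main() {
	// Values copied from the three dumps of this section.
	netconfig := combine(1, 2, 4) // exact /etc/netconfig, prefix /etc/net, suffix netconfig
	hostConf := combine(0, 16, 8) // no exact entry, prefix /etc/host, suffix .conf
	fmt.Println(policyPositions(netconfig), policyPositions(hostConf))
}
```

With these inputs, /etc/netconfig resolves to three policies and /etc/host.conf to two, in line with JSON lines 3 and 4 of the results.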
Ensuring Multiple Policy Matches when LPM Trie is used - Prefix
Corner case description: When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2.
A potential solution (implemented): If one suffix or prefix overlaps with another, we can simply combine their bitmaps in user space. No additional logic is required in kernel space to handle this corner case.
Tracee
sudo ./dist/tracee -p examples/policies/cc-sfo-prefix-1.yaml -p examples/policies/cc-sfo-prefix-2.yaml -p examples/policies/cc-sfo-prefix-3.yaml -p examples/policies/cc-sfo-prefix-4.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 0,
"prefix_enabled_data_filters": 15,
"suffix_enabled_data_filters": 0,
"exactly_out_data_filters": 0,
"prefix_out_data_filters": 8,
"suffix_out_data_filters": 0,
"enabled_data_filters": 15,
...
## dump prefix
[{
"key": {
"prefix_len": 128,
"event_id": 732,
"path": "/etc/netconf"
},
"value": {
"equal_in_scopes": 3,
"equality_set_in_scopes": 11
}
},{
"key": {
"prefix_len": 128,
"event_id": 732,
"path": "/etc/network"
},
"value": {
"equal_in_scopes": 7,
"equality_set_in_scopes": 7
}
},{
"key": {
"prefix_len": 96,
"event_id": 732,
"path": "/etc/net"
},
"value": {
"equal_in_scopes": 3,
"equality_set_in_scopes": 3
}
},{
"key": {
"prefix_len": 80,
"event_id": 732,
"path": "/etc/n"
},
"value": {
"equal_in_scopes": 1,
"equality_set_in_scopes": 1
}
},{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "/usr/lib/"
},
"value": {
"equal_in_scopes": 0,
"equality_set_in_scopes": 8
}
}
]
Note: Policy 4 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.
Policy 2 (prefix /etc/net) overlaps with Policy 1 (prefix /etc/n), as /etc/n is a substring of /etc/net. This is why the key with the path "/etc/net" has equality_set_in_scopes=3, indicating that both Policy 1 and Policy 2 are part of the same equality set. Additionally, equal_in_scopes=3 shows that Policy 1 and Policy 2 are considered equal in their scopes.
Policy 3 (prefix /etc/network) encompasses both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Consequently, the key with the path "/etc/network" has equality_set_in_scopes=7, which signifies that all three policies are present within the same scope. Similarly, equal_in_scopes=7 indicates that Policy 1, Policy 2, and Policy 3 are equal in scopes.
Policy 4 (prefix /etc/netconf) also includes both Policy 1 (prefix /etc/n) and Policy 2 (prefix /etc/net). Therefore, the key with the path "/etc/netconf" has equality_set_in_scopes=11, which means that Policy 1, Policy 2, and Policy 4 are all part of the same scope. However, because Policy 4 was defined with the condition data.pathname!=/etc/netconf, equal_in_scopes=3, meaning that only Policy 1 and Policy 2 are considered equal in scopes, while Policy 4 is excluded from that equality.
In summary, Policy 2, Policy 3, and Policy 4 derive bits from other policies, reflecting their interdependencies and overlaps in scope.
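The user-space merge described above can be sketched in Go as follows. This is a simplified illustration, not the code in this PR: the type `equality` and the function `mergeOverlaps` are hypothetical names, and the input bitmaps are the per-policy values implied by the explanation (Policy 4 uses `!=`, so it only sets its bit in the equality set).

```go
package main

import (
	"fmt"
	"strings"
)

// equality mirrors the two bitmaps shown in the map dumps.
type equality struct {
	equalInScopes       uint64
	equalitySetInScopes uint64
}

// mergeOverlaps ORs into every key the bitmaps of each other key that is
// a proper prefix of it, so that the single longest match returned by the
// LPM trie already carries the shorter matches.
func mergeOverlaps(filters map[string]equality) map[string]equality {
	merged := make(map[string]equality, len(filters))
	for key, eq := range filters {
		for other, otherEq := range filters {
			if other != key && strings.HasPrefix(key, other) {
				eq.equalInScopes |= otherEq.equalInScopes
				eq.equalitySetInScopes |= otherEq.equalitySetInScopes
			}
		}
		merged[key] = eq
	}
	return merged
}

func main() {
	filters := map[string]equality{
		"/etc/n":       {1, 1}, // Policy 1
		"/etc/net":     {2, 2}, // Policy 2
		"/etc/network": {4, 4}, // Policy 3
		"/etc/netconf": {0, 8}, // Policy 4, defined with !=
	}
	merged := mergeOverlaps(filters)
	for _, key := range []string{"/etc/netconf", "/etc/network", "/etc/net", "/etc/n"} {
		eq := merged[key]
		fmt.Printf("%s equal_in_scopes=%d equality_set_in_scopes=%d\n",
			key, eq.equalInScopes, eq.equalitySetInScopes)
	}
}
```

Under these assumptions the merged values line up with the prefix dump: 3/11 for /etc/netconf, 7/7 for /etc/network, 3/3 for /etc/net and 1/1 for /etc/n.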
Cmds
The results of each of the following lines are in the JSON file (results_corner_case_prefix.json):
% more /etc/netconfig; json line 1 (cc-sfo-prefix-match-1; cc-sfo-prefix-match-2)
% more /etc/networks; json line 2 (cc-sfo-prefix-match-2; cc-sfo-prefix-match-3; cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)
% sudo cp /etc/netconfig /etc/na; more /etc/na; json line 3 (cc-sfo-prefix-match-4; cc-sfo-prefix-match-1)
cc_prefix_policies_results.zip
Ensuring Multiple Policy Matches when LPM Trie is used - Suffix
Tracee
sudo ./dist/tracee -p examples/policies/cc-sfo-suffix-1.yaml -p examples/policies/cc-sfo-suffix-2.yaml -p examples/policies/cc-sfo-suffix-3.yaml -o json
Maps
% sudo bpftool map dump name config_map
...
"exactly_enabled_data_filters": 4,
"prefix_enabled_data_filters": 4,
"suffix_enabled_data_filters": 7,
"exactly_out_data_filters": 4,
"prefix_out_data_filters": 4,
"suffix_out_data_filters": 4,
"enabled_data_filters": 7,
...
## dump suffix
[{
"key": {
"prefix_len": 136,
"event_id": 732,
"path": "gifnocten/cte"
},
"value": {
"equal_in_scopes": 3,
"equality_set_in_scopes": 7
}
},{
"key": {
"prefix_len": 104,
"event_id": 732,
"path": "gifnocten"
},
"value": {
"equal_in_scopes": 3,
"equality_set_in_scopes": 3
}
},{
"key": {
"prefix_len": 80,
"event_id": 732,
"path": "gifnoc"
},
"value": {
"equal_in_scopes": 2,
"equality_set_in_scopes": 2
}
}
]
Note: Policy 3 includes lines to exclude library entries from the output. These lines are solely for cleaning up the output to simplify the testing in this section.
Policy 1 (suffix netconfig) overlaps with Policy 2 (suffix config), as config is a substring of netconfig. This is why the key with the path "gifnocten" (which is a reversed representation of netconfig) has equality_set_in_scopes=3, indicating that both Policy 1 and Policy 2 are contained within the same scope. Likewise, equal_in_scopes=3 shows that Policy 1 and Policy 2 are equal in scopes.
Policy 2 (suffix config) only contains the bitmap of Policy 2 (equality_set_in_scopes=2 and equal_in_scopes=2).
Policy 3 (suffix etc/netconfig) contains both Policy 1 (suffix netconfig) and Policy 2 (suffix config). Therefore, the key with the path "gifnocten/cte" (representing the reverse of etc/netconfig) has equality_set_in_scopes=7. This value indicates that all three policies are present within the scope. However, equal_in_scopes=3 shows that only Policy 1 and Policy 2 are equal in scopes, whereas Policy 3 is disabled in this scope because it was defined using data.pathname!=etc/netconfig.
Cmds
The results of each of the following lines are in the JSON file (results_corner_case_suffix.json):
% more /etc/netconfig; json line 1 (cc-sfo-suffix-match-1; cc-sfo-suffix-match-2)
% more /etc/ssh/ssh_config; json line 2 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)
% cp /etc/netconfig /tmp/netaconfig; more /tmp/netaconfig; json line 3 (cc-sfo-suffix-match-2; cc-sfo-suffix-match-3)
cc_suffix_policies_results.zip
3. Other comments
TODO
First Phase:
- [x] Evaluate and integrate all three types of string-based filters simultaneously. Currently, they operate independently and need to be combined.
- [x] Complete the implementation and testing of filters for exact matches, prefix, and suffix, including both equal and not equal for each one of these three filters.
- [x] In the function save_str_to_buf(), add the argument offset based on its index to facilitate direct access for the load_str_from_buf() function.
- [x] Document the testing steps for exact match, prefix, and suffix filters. Include policies and expected results for the review process.
- [x] When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.
- [x] Document the testing steps for corner case (prefix and suffix) filters. Include policies and expected results for the review process.
- [ ] Improve the method for defining (in user-space) which events should have an in-kernel filter enabled. The current logic was added as a proof of concept and requires rework.
- [ ] Measure performance between filter in user-space (older version) and filter in kernel-space (new version).
Second Phase:
- Currently, the index for retrieving the pathname in evaluate_data_filters is explicitly defined. While this works, it would be better to dynamically retrieve the index based on the event ID for greater flexibility.
@geyslan I've pushed some changes to how we retrieve the string from args in args_buffer_t. To make it work, I added a field to args_buffer_t and modified the save_str_to_buf function.
When using the LPM Trie, it always returns the longest matching string. A corner case arises when one policy (e.g., policy1) uses a substring of another policy (e.g., policy2). For instance, if policy1 covers /etc/net* and policy2 covers /etc/netconf*, a lookup for /etc/netconfig currently only returns policy2 because it is the longest match. However, it should return both policy1 and policy2. A potential solution (work in progress) is to combine equality when such a corner case is detected.
This is important, it could have been a security vulnerability. If you can't filter it correctly in kernel, perhaps an acceptable solution is to filter it again in userspace. We would still gain the performance improvement from reducing event submission attempts in the bpf program, and add a second pass in userspace to ensure correctness.
Thank you for commenting on this, @itaysk. If filtering in the kernel isn't possible, I'll definitely try this solution.
There is no need to filter in userspace for such a corner case. It is possible to define the result of the longest prefix match to contain the results of the policies with the substrings as well, since the return value of the map is the matched policies (or matched rules in the future).
@yanivagman @geyslan Thank you for all the reviews. I've just pushed some updates based on the discussions in this PR:
- Create createNewDataFilterMapsVersion in order to create the inner maps based on version and event id. This uses the new approach discussed with @yanivagman; createNewDataFilterMapsVersion is only used for data filters. In the future there should be a common way to create any version + version id (used by match rules);
- Because event id was removed from the key struct, it is now possible to use 255 characters for exact, prefix and suffix;
- Combine the bitmaps (using an OR operation) when multiple filters are used in the same policy (match_data_filters function);
- Some fields from some event ids are filtered in kernel. For those scenarios, the definition of such a filter in CLI/policy comes with two restrictions: 1) max 255 characters for pathname and 2) the contains option is not allowed (e.g. security_file_open.data.pathname=*net*);
- Add unit tests and integration tests.
/fast-forward
Folks @yanivagman @geyslan,
Thank you for all the feedback! I learned a lot of good things during this work!
Congrats on this amazing new feature! 🚀🥳