tetragon
WIP: IMA hashes in LSM events
Adding support for collecting file hashes calculated by the Linux integrity subsystem (IMA). IMA hashes will be contained in LSM events. To make it work you need to:
There are limitations on the implementation due to the requirement to use the bpf_ima_file_hash/bpf_ima_inode_hash helpers. More information can be found in #2409.
Some design insights:
- File hash collection will be triggered by Post Action
- Only a limited set of LSM hooks is supported: mmap_file, bprm_check_security, file_open
- Hash collection starts after the LSM event is posted (tail call from generic_output). Hashes are stored in a separate BPF map, and the event is enriched in user space the same way stack trace extraction works (see the sketch below).
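To make the design above more concrete, here is a minimal, hypothetical sketch of the BPF side, assuming illustrative names and a thread-id key (not the actual Tetragon identifiers): a sleepable LSM program fills a dedicated map with the result of bpf_ima_file_hash(), and user space later reads that map to enrich the already-posted event.

```c
// Illustrative sketch only; names, the map layout, and the thread-id key
// are assumptions, not the actual Tetragon implementation.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct ima_hash {
	long algo;               /* bpf_ima_file_hash() return value */
	__u8 digest[64];         /* large enough for any IMA digest */
};

/* Separate map holding the hashes; user space reads it to enrich the
 * posted LSM event, similar to how stack traces are extracted. */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 4096);
	__type(key, __u32);                    /* thread id (illustrative) */
	__type(value, struct ima_hash);
} ima_hash_map SEC(".maps");

/* Only sleepable (lsm.s) programs may call bpf_ima_file_hash(). */
SEC("lsm.s/file_open")
int BPF_PROG(collect_ima_hash, struct file *file)
{
	__u32 tid = (__u32)bpf_get_current_pid_tgid();
	struct ima_hash hash = {};

	hash.algo = bpf_ima_file_hash(file, hash.digest, sizeof(hash.digest));
	if (hash.algo < 0)
		return 0;        /* no measurement available; event stays unhashed */

	bpf_map_update_elem(&ima_hash_map, &tid, &hash, BPF_ANY);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

The choice of key and the exact enrichment path are implementation details that the review discussion below works through.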
TODO list:
- [x] modification of bpf_generic_lsm and do_actions
- [x] CRD and lsm sensor updates
- [ ] tests
- [ ] docs
@kkourt, hi! May I ask you for a code review? Next Monday, at the Tetragon community meeting, I'll show a demo. I want to talk about some design decisions I made. I'm also going to talk about the problems I overcame and the ones that still exist.
Hello :)
Thanks for the heads up! Will try and have a look before the community meeting. Looking forward to the demo!
I refreshed on LSM programs and I think we can do this differently
so LSM programs are attached through trampolines on a specific bpf_lsm_XXX function, which is invoked as part of the lsm modules call chain for a particular lsm callback
and the trampoline allows attaching and invoking multiple bpf programs on a specific function (or lsm hook)
so I think we could have the sleepable lsm hook invoked first to get the hash and update the map, and then have the generic lsm invoked to run the hash action, which would get the hash from the map and send the event
bpf_lsm_XXX
  -> trampoline
       run lsm.s/file_open
           map_update_elem(&ima_hash_map, &e->current, &hash, BPF_ANY);
       run lsm/generic_lsm
           map_lookup_elem(&ima_hash_map, &e->current, ...)
to ensure the generic lsm is invoked AFTER lsm.s/XXX hook we can rely on kernel using standard lists when queuing programs for trampoline, so you should just attach lsm.s hook first and then generic lsm
I guess there might be some hiccups when you'll implement that (there always are) but so far it seems straightforward
WDYT?
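For illustration, a minimal C rendering of the sketch above, under a few assumptions of its own: hypothetical program and map names, the current pid/tgid standing in for the e->current key, and a plain perf event as the output path. The attach order on the trampoline is arranged from user space.

```c
// Hypothetical rendering of the two-program idea sketched above; not Tetragon code.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct ima_digest {
	__u8 data[64];           /* large enough for any IMA digest */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 4096);
	__type(key, __u64);      /* pid/tgid, standing in for the e->current key */
	__type(value, struct ima_digest);
} ima_hash_map SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} events SEC(".maps");

/* Runs first on the trampoline: sleepable, so it may call bpf_ima_file_hash(). */
SEC("lsm.s/file_open")
int BPF_PROG(ima_file_open, struct file *file)
{
	__u64 key = bpf_get_current_pid_tgid();
	struct ima_digest digest = {};

	if (bpf_ima_file_hash(file, digest.data, sizeof(digest.data)) >= 0)
		bpf_map_update_elem(&ima_hash_map, &key, &digest, BPF_ANY);
	return 0;
}

/* Runs second on the same hook: the "generic" program picks the hash up
 * and sends the event out. */
SEC("lsm/file_open")
int BPF_PROG(generic_file_open, struct file *file)
{
	__u64 key = bpf_get_current_pid_tgid();
	struct ima_digest out = {};
	struct ima_digest *found = bpf_map_lookup_elem(&ima_hash_map, &key);

	if (found)
		out = *found;
	bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &out, sizeof(out));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

Both programs are attached to the same hook from user space, and the ordering concern is exactly what the rest of this comment discusses.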
Thanks for making it clear!
IIUC, we will calculate the hash on every triggered hook, which might cause performance issues. Right now our calls to bpf_ima_file_hash are filtered by the tracing policy. If we move the lsm.s/ima_* program to the top, we will call the helper regardless of whether the tracing policy is satisfied. I think it will be OK if the ima_policy is satisfied. Let's consider the following example:
We have a default ima_policy=tcb:
measure func=MMAP_CHECK mask=MAY_EXEC
measure func=BPRM_CHECK mask=MAY_EXEC # binary executed
measure func=FILE_CHECK mask=^MAY_READ euid=0
measure func=FILE_CHECK mask=^MAY_READ uid=0 # root opened r/o, r/w
measure func=MODULE_CHECK
measure func=FIRMWARE_CHECK
measure func=POLICY_CHECK
In particular, this ima_policy measures hashes when files owned by root are opened for reading. The solution you proposed would call bpf_ima_file_hash on every hook. The question is what the helper returns if the ima_policy is not satisfied. If it returns an error, that's OK, I think. But if it calculates the hash despite the ima_policy, that's not OK. I think I can check this.
To sum up, I think your solution is a nice compromise if the ima_policy is taken into account during the bpf_ima_file_hash call. What do you think?
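One way to make the question concrete: whatever the helper decides for out-of-policy files, the caller can inspect its return value, since bpf_ima_file_hash() returns the hash algorithm id on success and a negative errno on failure. A hypothetical check (illustrative names):

```c
// Hypothetical snippet showing how the helper's return value can be checked.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("lsm.s/file_open")
int BPF_PROG(ima_check_return, struct file *file)
{
	__u8 digest[64];
	long algo = bpf_ima_file_hash(file, digest, sizeof(digest));

	if (algo < 0) {
		/* e.g. -EOPNOTSUPP: no measurement available for this file.
		 * Whether this also covers "file not matched by ima_policy"
		 * is exactly the open question above. */
		return 0;
	}
	/* algo identifies the digest algorithm; digest[] now holds the hash. */
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```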
> If we move the lsm.s/ima_* program to the top, we will call the helper regardless of whether the tracing policy is satisfied.
ah ok, I did not realize you might want to filter :-\
could we split the lsm sensor into a filtering program, then run the program with the ima helper, and then run a program that sends it out? bit of a stretch, but trampolines seem to disable cpu migration, so we could use the percpu heap map to store the half-done event and send it out in the last program
IIUC trampolines with LSM programs will run registered lsm programs until one of them returns != 0 .. but will need to check on that
also I still need to check all the options for lsm sleepable programs
Hmm, maybe we can split the sensor into three parts. If we can use the same percpu heap map, this scheme might work. It seems to me that your first solution is much easier to implement :).
I tried to get hashes for files that are out of the scope of ima_policy=tcb. The helper returned -EOPNOTSUPP. I think that's exactly what we want! Hash calculation is blocked. I'm looking at __ima_inode_hash to make sure that hash calculation is blocked due to the policy, but I haven't succeeded yet.
I found out that bpf_ima_file_hash should get the hash regardless of the ima policy. There must be another reason why I can't get the hash for a file that doesn't satisfy the ima policy. Maybe I'll find time to debug the helper and find the reason. But putting the ima helper at the top is not good... Maybe we should split it into a three-part lsm sensor.
> Hmm, maybe we can split the sensor into three parts. If we can use the same percpu heap map, this scheme might work. It seems to me that your first solution is much easier to implement :).
yea, it might be better to do the initial solution first as a base and split into a 3-part lsm sensor as a follow-up.. we might learn other things that could shape up the final solution when doing that ;-)
I thought I could implement an example of our idea using ebpf-go before coding it in Tetragon. Here is the repo: https://github.com/anfedotoff/ima-ebpfgo-example, please have a look.
I have 3 BPF programs: the three parts of the LSM sensor. In the first program I emulate filtering and enforcing. The second program is used to get the hash, and the last one sends the event via the perf buffer. What I noticed:
- we need to save whether the event is filtered or not in the test_msg struct (msg_generic_kprobe in the case of Tetragon). Don't forget about the NoPost action.
- enforcing and filtering work fine.
- it seems to me that the send can happen before the hash calculation finishes (a race condition resides here).
Also, I'm not sure that attaching the programs this way is enough to preserve the execution order when the hook is triggered.
So far it seems that this approach is not working for us, but I may be wrong.
> So far it seems that this approach is not working for us, but I may be wrong.
nice, but I'm not sure I understand what's wrong, could you please elaborate? I'll try it anyway
Yes, sure, I'll try to explain. I attach three BPF programs using the standard attach interface for LSM programs in the following order:
- init_bprm_check (creates the event, stores the filename of the executed binary in the map, does filtering + blocking);
- ima_bprm_check (the lsm.s program that calculates the hash and stores it in the map);
- send_bprm_check (sends the event stored in the map using the perf buffer; a compressed sketch of these three programs follows after the logs below).
The question is: can we rely on the order of execution when bprm_check_security is triggered?
Here are some logs:
2024/09/16 10:01:22 Waiting for events..
2024/09/16 10:01:44 Error file: ./tetragon. Code:0
2024/09/16 10:01:51 Error file: ./tetragon. Code:0
2024/09/16 10:01:57 Error file: ./tetragon. Code:0
2024/09/16 10:01:57 File: ./tetragon
2024/09/16 10:01:57 SHA256: 9c20c4a712cd95766b64bcddc1094b619092944b592be43d75c3f4b3fb1434d40000000000000000000000000000000000000000000000000000000000000000
2024/09/16 10:01:57 <=============>
2024/09/16 10:01:58 File: ./tetragon
2024/09/16 10:01:58 SHA256: 9c20c4a712cd95766b64bcddc1094b619092944b592be43d75c3f4b3fb1434d40000000000000000000000000000000000000000000000000000000000000000
2024/09/16 10:01:58 <=============>
2024/09/16 10:01:59 File: ./tetragon
2024/09/16 10:01:59 SHA256: 29aa689f38158d2e8941fa54e436f0260890af31cecad1e8799e5c2df7bc1ecc0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 10:01:59 <=============>
2024/09/16 10:02:20 Error file: ./tetragon. Code:0
2024/09/16 10:02:22 File: ./tetragon
2024/09/16 10:02:22 SHA256: 4f291296e89b784cd35479fca606f228126e3641f5bcaee68dee36583d7c94830000000000000000000000000000000000000000000000000000000000000000
2024/09/16 10:02:22 <=============>
^C2024/09/16 10:02:31 Received signal, exiting..
According to these logs, the 1st program always starts and finishes first. Sometimes the third program starts before the second one has finished. The line 2024/09/16 10:02:20 Error file: ./tetragon. Code:0 means that there is no hash in the event. It seems to me this is because the sleepable program sleeps and another BPF program gets executed (the send program in our case).
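For readers following along, here is a compressed, hypothetical C skeleton of the three programs described above (names modeled on the list; this is not the code from the example repo). It shares the in-flight event through a per-CPU array, as the example did at this point, and that sharing mechanism is exactly what the rest of the thread goes on to question.

```c
// Hypothetical skeleton of the three-part LSM sensor described above.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

struct test_msg {
	__u8 filtered;           /* set by init; a NoPost-style action would use this */
	__u8 digest[64];         /* filled in by the ima program */
	char filename[256];
};

/* Per-CPU scratch slot for the in-flight event; whether this is safe across
 * the three programs is what the rest of the thread investigates. */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, struct test_msg);
} heap SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} events SEC(".maps");

/* 1) create the event, store the filename, do filtering + blocking */
SEC("lsm/bprm_check_security")
int BPF_PROG(init_bprm_check, struct linux_binprm *bprm)
{
	__u32 zero = 0;
	struct test_msg *msg = bpf_map_lookup_elem(&heap, &zero);

	if (!msg)
		return 0;
	bpf_probe_read_kernel_str(msg->filename, sizeof(msg->filename),
				  BPF_CORE_READ(bprm, filename));
	msg->filtered = 0;       /* emulated filter decision */
	return 0;                /* a negative errno here would block the exec */
}

/* 2) sleepable program: calculate the hash */
SEC("lsm.s/bprm_check_security")
int BPF_PROG(ima_bprm_check, struct linux_binprm *bprm)
{
	__u32 zero = 0;
	struct test_msg *msg = bpf_map_lookup_elem(&heap, &zero);

	if (msg && !msg->filtered)
		bpf_ima_file_hash(bprm->file, msg->digest, sizeof(msg->digest));
	return 0;
}

/* 3) send the stored event through the perf buffer */
SEC("lsm/bprm_check_security")
int BPF_PROG(send_bprm_check, struct linux_binprm *bprm)
{
	__u32 zero = 0;
	struct test_msg *msg = bpf_map_lookup_elem(&heap, &zero);

	if (msg && !msg->filtered)
		bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
				      msg, sizeof(*msg));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```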
> The question is: can we rely on the order of execution when bprm_check_security is triggered?
I think we can, it's hardcoded in the kernel.. and if it changes we could actually detect the way it's added and use the proper attach order
however it's the other way round (check https://pastebin.com/Y8EGQ0fg for the trampoline dump), so when I reverse the attach order in your example I get the proper order of executed programs
could you plz try the change below?
thanks, jirka
diff --git a/main.go b/main.go
index 1d1fa138660d..0be26120a55f 100644
--- a/main.go
+++ b/main.go
@@ -40,11 +40,11 @@ func main() {
}
defer objs.Close()
- initBprmHook, err := link.AttachLSM(link.LSMOptions{Program: objs.imaPrograms.InitBprmCheck})
+ sendBprmHook, err := link.AttachLSM(link.LSMOptions{Program: objs.imaPrograms.SendBprmCheck})
if err != nil {
log.Fatalf("linking LSM failed: %s", err)
}
- defer initBprmHook.Close()
+ defer sendBprmHook.Close()
imaBprmHook, err := link.AttachLSM(link.LSMOptions{Program: objs.imaPrograms.ImaBprmCheck})
if err != nil {
@@ -52,11 +52,11 @@ func main() {
}
defer imaBprmHook.Close()
- sendBprmHook, err := link.AttachLSM(link.LSMOptions{Program: objs.imaPrograms.SendBprmCheck})
+ initBprmHook, err := link.AttachLSM(link.LSMOptions{Program: objs.imaPrograms.InitBprmCheck})
if err != nil {
log.Fatalf("linking LSM failed: %s", err)
}
- defer sendBprmHook.Close()
+ defer initBprmHook.Close()
rd, err := perf.NewReader(objs.TcpmonMap, 65535)
if err != nil {
Hmm, I changed the order (swapped the init and send programs) as you suggested. I got no events collected, but execution is blocked. Are you sure that the order is really proper? It seems to me that the send program is now executed before the init program (send -> ima -> init). Anyway, the good news is that the order is preserved. But another question is that we sometimes miss the hash in the event. Does that mean the send program is started before the ima program has finished?
also I think we need to use a hash map for the data shared by the programs; it looks like cpu migration is not actually disabled, so we can't use the per-cpu heap map here
hum, when I remove the 'filter' from the init function, I get output for all executed files (I'm still using the per-cpu heap, but I don't think we should rely on that):
root@amd:/home/jolsa/ima-ebpfgo-example# ./ima-test
2024/09/16 12:04:42 Waiting for events..
2024/09/16 12:04:43 File: /usr/bin/ls
2024/09/16 12:04:43 SHA256: 12a6d908a68ccf6f9f3d799705577c28763f5deef6eddcff7643d6d8a6de543d0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:04:43 <=============>
2024/09/16 12:04:45 File: /usr/bin/ls
2024/09/16 12:04:45 SHA256: 12a6d908a68ccf6f9f3d799705577c28763f5deef6eddcff7643d6d8a6de543d0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:04:45 <=============>
2024/09/16 12:04:47 File: /usr/bin/ls
2024/09/16 12:04:47 SHA256: 12a6d908a68ccf6f9f3d799705577c28763f5deef6eddcff7643d6d8a6de543d0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:04:47 <=============>
2024/09/16 12:04:49 File: /usr/bin/ls
2024/09/16 12:04:49 SHA256: 12a6d908a68ccf6f9f3d799705577c28763f5deef6eddcff7643d6d8a6de543d0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:04:49 <=============>
2024/09/16 12:04:50 File: /usr/bin/w
2024/09/16 12:04:50 SHA256: ae49096b42cebab2f3a2557e32b0f3e3e5139fd7e22201c9dc5bcfab9701946c0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:04:50 <=============>
2024/09/16 12:05:33 File: /usr/bin/git
2024/09/16 12:05:33 SHA256: 29aa689f38158d2e8941fa54e436f0260890af31cecad1e8799e5c2df7bc1ecc0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:05:33 <=============>
2024/09/16 12:05:33 File: /usr/bin/pager
2024/09/16 12:05:33 SHA256: f95851f468afe5c6e287d741153bde008ef26844916f5dfcb916e849ef3ae8bb0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:05:33 <=============>
2024/09/16 12:05:47 File: /usr/bin/vim
2024/09/16 12:05:47 SHA256: dfc4b48f9b7976f0b64f6e99702ad8b1e4fb1929958fe07da98dfeb88f0a23fb0000000000000000000000000000000000000000000000000000000000000000
2024/09/16 12:05:47 <=============>
Yes, that works for me too, and no hashes are missed. When I removed the filter without changing the order of execution, some hashes were still missed.
Now I'm looking at __bpf_prog_enter_sleepable and __bpf_prog_enter.
> I think we need to use a hash map for the data shared by the programs; it looks like cpu migration is not actually disabled, so we can't use the per-cpu heap map here
It looks so... Using a hash map is a good idea, but for now I don't know how to share the key for this map if we can't use the per-cpu heap.
> It looks so... Using a hash map is a good idea, but for now I don't know how to share the key for this map if we can't use the per-cpu heap.
I think just bpf_get_current_pid_tgid() should be ok? it's always under same task (but maybe not same cpu)
This is the first thing that came to my mind. But what if we have the file_open hook and the same process opens several files? Will it be OK? I think I can check this, nevertheless.
> This is the first thing that came to my mind. But what if we have the file_open hook and the same process opens several files? Will it be OK? I think I can check this, nevertheless.
hum, so if the key is thread id then single thread could have just one active file_open call and we should be fine, right?
Ahh, yes, we have the tid. You are right! It seems to be fine then. I'll try to fix my example. If everything is fine, I'll start refactoring the code in the PR. Thank you for helping me find the solution!
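For completeness, a minimal hypothetical sketch of the scheme the thread converges on: a regular (not per-CPU) hash map keyed by the thread id, filled by the sleepable ima program and consumed, then deleted, by the send program. Since all programs on the hook run under the same task, and a single thread has at most one such LSM call in flight, the tid key survives CPU migration without collisions.

```c
// Hypothetical sketch of the tid-keyed sharing scheme; names are illustrative.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct ima_digest {
	__u8 data[64];           /* large enough for any IMA digest */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 4096);
	__type(key, __u32);      /* tid: lower 32 bits of pid_tgid */
	__type(value, struct ima_digest);
} ima_hash_map SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} events SEC(".maps");

/* Sleepable program: compute the hash and store it under the tid. */
SEC("lsm.s/bprm_check_security")
int BPF_PROG(ima_bprm_check, struct linux_binprm *bprm)
{
	__u32 tid = (__u32)bpf_get_current_pid_tgid();
	struct ima_digest digest = {};

	if (bpf_ima_file_hash(bprm->file, digest.data, sizeof(digest.data)) >= 0)
		bpf_map_update_elem(&ima_hash_map, &tid, &digest, BPF_ANY);
	return 0;
}

/* Runs after the sleepable program on the same hook: pick the hash up
 * under the same tid, delete the entry, and send the event. */
SEC("lsm/bprm_check_security")
int BPF_PROG(send_bprm_check, struct linux_binprm *bprm)
{
	__u32 tid = (__u32)bpf_get_current_pid_tgid();
	struct ima_digest out = {};
	struct ima_digest *found = bpf_map_lookup_elem(&ima_hash_map, &tid);

	if (found) {
		out = *found;
		bpf_map_delete_elem(&ima_hash_map, &tid);
	}
	bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &out, sizeof(out));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

Deleting the entry after the lookup keeps the map from growing and makes each tid key single use.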
Everything is fine! I can start refactoring :fireworks:
It works! No race condition! :sparkles:
Thanks, this looks good to me overall!
I have some requests for changes. Please let me know if they do not make sense to you. I have not followed the discussion with Jiri. Can you please add a summary of the discussion in the PR header? Having a discussion of what the previous approach was, why it did not work, and what this PR does that works would be very helpful!
Thanks!
Sure! I updated the PR description. I hope it is now clearer what we've done to avoid the race condition.
Great, thanks! Can you please split the change that changes the return value into a separate commit with a short explanation of why it is OK to do, as discussed in another comment? Other than that, LGTM!
@kkourt, can you take one more look, please?
All tests ✅, merging! Thanks!