Add args for `sendmmsg` events
Motivation
Currently, the sendmmsg syscall is supported, but both its enter and exit events lack arguments. I believe that adding parameters would significantly enhance the visibility of outbound connection events.
Feature
Add params to the sendmmsg events
Thank you for this! We will try to figure out when to schedule it!
So basically you would like sendmmsg to support an array of arguments, where every entry carries the same information that sendmsg does, right?
I can take it if no one has already started to add this support
Yes, we would like to have something similar to the sendmsg syscall:
[PPME_SOCKET_SENDMSG_E] = {"sendmsg", EC_IO_WRITE | EC_SYSCALL, EF_USES_FD | EF_WRITES_TO_FD | EF_MODIFIES_STATE, 3, {{"fd", PT_FD, PF_DEC}, {"size", PT_UINT32, PF_DEC}, {"tuple", PT_SOCKTUPLE, PF_NA} } },
[PPME_SOCKET_SENDMSG_X] = {"sendmsg", EC_IO_WRITE | EC_SYSCALL, EF_USES_FD | EF_WRITES_TO_FD | EF_MODIFIES_STATE, 2, {{"res", PT_ERRNO, PF_DEC}, {"data", PT_BYTEBUF, PF_NA} } },
But since there is no need to split the info between the enter and the exit event, I would put all of it in the exit event, so something like:
[PPME_SOCKET_SENDMMSG_E] = {"sendmmsg", EC_IO_WRITE | EC_SYSCALL, EF_USES_FD, 0},
[PPME_SOCKET_SENDMMSG_X] = {"sendmmsg", EC_IO_WRITE | EC_SYSCALL, EF_USES_FD, 5, {{"res", PT_ERRNO, PF_DEC}, {"fd", PT_FD, PF_DEC}, {"data", PT_BYTEBUF, PF_NA}, {"size", PT_UINT32, PF_DEC}, {"tuple", PT_SOCKTUPLE, PF_NA} } },
I can take it if no one has already started to add this support
Sure! thank you!
@Andreagit97 regarding your post above, do you think there is no need to push any data to the ring buffer in the enter event, since the only change between enter and exit is that on exit msg_len is updated (which may be important to the user), and all the other data can be pushed on exit?
Yes, exactly. In the exit event we have all the syscall parameters plus the return value, so collecting everything there should be enough, since the enter event doesn't bring any additional value.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
@oheifetz are you actively working on this? Otherwise I'd like to tackle it :)
Hey @FedeDP and @oheifetz, I don't know if there has been any progress on this, but I find myself in need of getting data from both sendmmsg and recvmmsg, so I'll likely have to implement them myself for testing unless a PR is opened by either of you in the near future.
Just letting you know in case you want me to take over the issue.
No problem from me; indeed, thank you Mauro! Waiting for @oheifetz to check his status on this!
Hey @Molter73, since @oheifetz is not answering anymore, I think you can take this one :) Are you willing to? Thanks!
Sorry for the delay in the response; my priorities have shifted a bit for the next week or so, but I should be able to pick this up afterwards if that's ok.
In the meantime, I've had moderate success on a very naive implementation for the modern probe by basically translating every sendmmsg/recvmmsg into multiple sendmsg/recvmsg events. I know it is not the approach discussed earlier, but it made the userspace implementation really straightforward (almost no changes needed). I would like to hear thoughts on this approach from more knowledgeable people: would this be appropriate, or should I go back to issuing the proper events? What do folks think?
No problem for the delay! I would love to get this in Falco 0.39, so we have plenty of time :)
In the meantime, I've had moderate success on a very naive implementation for the modern probe by basically translating every sendmmsg/recvmmsg into multiple sendmsg/recvmsg events
Mmmh, this means we pay the push cost (+ the event header size) multiple times for a single syscall? I don't think that is good from a perf pov. Moreover, given that libs are completely syscall-driven as of now, we should implement the syscall under its own event so that one can filter for it. At the same time, I don't know how hard it will be to implement these syscalls, therefore... I summon @Andreagit97 to ask for his opinions :)
Mmmh, this means we pay the push cost (+ the event header size) multiple times for a single syscall? I don't think that is good from a perf pov.
This behavior is the same that you would get on an older system that doesn't have these syscalls, but it's a fair concern. I thought you would be more worried about the cost of pushing the received/sent data rather than the header: the implementation I propose would apply snaplen to each individual message rather than to the whole syscall, but I also don't know which is the correct/preferred implementation. When running with snaplen=0, we would only be pushing the header, fd, socket tuple and a handful of other things, so I would not assume it'd be a huge deal.
Either way, I'd like to hear from Andrea as well.
Hey folks! I have to admit this is a tough call...
Let's say that from the performance point of view having just one event would be better for 2 reasons:
- on the kernel side, we just need to push one event instead of many.
- in userspace we need to evaluate the rules on just one event (this is the expensive part).
BTW, what is not completely clear to me is how we would handle this unique event in userspace. If I recall correctly, in the UDP case sendmmsg can send many messages to different destinations. If this is the case, we will end up with not only an array of messages but also an array of tuples. So when we call fd.name, which tuple should we return? Probably we would need to invent new logic to handle this case.
Please note that I'm not 100% sure that the above case is possible; we should verify that sendmmsg can send messages to different destinations.
So to summarize:
- Unique event -> better performance, but difficult to manage on both the userspace and kernel side
- Multiple events -> worse performance, but easier to implement and handle on both the userspace and kernel side
If the above point about multiple tuples is true, probably I would go with the multiple sendmsg events implementation.
@Biagio-Dipalma from the security point of view, do you think it is important to distinguish a sendmmsg from a sendmsg syscall? Are there attacks based only on the sendmmsg syscall? I don't think so, but it's better to ask because, with the multiple-events approach, we will lose the distinction between a sendmmsg and a sendmsg.
This behavior is the same that you would get on an older system that doesn't have these syscalls, but it's a fair concern.
You are right indeed.
- Unique event -> better performance, but difficult to manage on both the userspace and kernel side
- Multiple events -> worse performance, but easier to implement and handle on both the userspace and kernel side
I agree; thanks for the clarification!
If the above point about multiple tuples is true, probably I would go with the multiple sendmsg events implementation.
+1 then, I agree the perf penalty is worth the solution's simplicity.
Thanks both for your inputs!
Please note that I'm not 100% sure that the above case is possible; we should verify that sendmmsg can send messages to different destinations.
I'll try testing ASAP whether sending messages to multiple destinations in a single syscall is possible; I have a simple C program that I can adapt for it, I just haven't had the time these last few days.
@Biagio-Dipalma from the security point of view, do you think it is important to distinguish a sendmmsg from a sendmsg syscall? Are there attacks based only on the sendmmsg syscall? I don't think so, but it's better to ask because, with the multiple-events approach, we will lose the distinction between a sendmmsg and a sendmsg.
I would also like to know this, but worst-case scenario, I can still send multiple messages so we can have distinct socktuples: just use the sendmmsg header, look everywhere in userspace for places where sendmsg is used, and add the new one next to it. I just don't want to mess with a completely new implementation for a syscall that is basically already implemented and just needs some tweaking. 😅
It makes sense! Moreover, it is never too late to switch and implement it as a completely new syscall. For example, if we see cases in which a system throws 10 million sendmmsg/s and splitting them would yield a throughput of 30 Mevt/s (supposing 3 events per syscall), we could consider other possible approaches, but at the beginning I would probably start with something simple.
Well, I finally made some time to try sending messages to multiple destinations with sendmmsg and it totally works, so some level of demultiplexing will be needed in order to have a single socktuple per event. I haven't checked if recvmmsg is able to receive from multiple sources, but I have to assume it does (I'll verify it before opening a PR).
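For reference, a minimal standalone check along these lines could look like the sketch below. This is not the program referenced in the thread; the addresses, ports, and payloads are arbitrary, it only shows that a single sendmmsg() call can target two different UDP destinations.

```c
/* Minimal sketch of a sendmmsg() multi-destination check; not the program
 * referenced above. Addresses, ports, and payloads are arbitrary. */
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		perror("socket");
		return 1;
	}

	/* Two different UDP destinations on localhost. */
	struct sockaddr_in dst[2];
	memset(dst, 0, sizeof(dst));
	for (int i = 0; i < 2; i++) {
		dst[i].sin_family = AF_INET;
		dst[i].sin_port = htons(9000 + i);
		inet_pton(AF_INET, "127.0.0.1", &dst[i].sin_addr);
	}

	char payload0[] = "first";
	char payload1[] = "second";
	struct iovec iov[2] = {
		{ .iov_base = payload0, .iov_len = sizeof(payload0) - 1 },
		{ .iov_base = payload1, .iov_len = sizeof(payload1) - 1 },
	};

	/* One mmsghdr entry per destination, sent with a single syscall. */
	struct mmsghdr msgs[2];
	memset(msgs, 0, sizeof(msgs));
	for (int i = 0; i < 2; i++) {
		msgs[i].msg_hdr.msg_name = &dst[i];
		msgs[i].msg_hdr.msg_namelen = sizeof(dst[i]);
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	int sent = sendmmsg(fd, msgs, 2, 0);
	printf("sendmmsg returned %d\n", sent);
	return sent == 2 ? 0 : 1;
}
```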
I'll take a shot at implementing the syscall issuing multiple driver events per call, which would be the simplest. A second iteration could probably send a single event from the driver and create multiple events in userspace, but I would need to figure out what that looks like across libscap and libsinsp.
Thank you for the research Mauro!
I'll take a shot at implementing the syscall issuing multiple driver events per call, which would be the simplest. A second iteration could probably send a single event from the driver and create multiple events in userspace, but I would need to figure out what that looks like across libscap and libsinsp.
I agree with your plan!
This is embarrassing: I managed to send multiple events from the modern probe to userspace without much trouble, but now that I've started trying to implement the same thing on the legacy probe and kernel module, I see it's not as straightforward, mainly because a lot of the logic for adding events to the ring buffers is abstracted away from the fillers. I'm considering jumping straight to generating a single event from the driver and demultiplexing in userspace; I figure I would need to declare the ppm_event_info for sendmmsg_x to look something like this:
[PPME_SOCKET_SENDMMSG_X] = {"sendmmsg", EC_IO_WRITE | EC_SYSCALL, EF_USES_FD | EF_WRITES_TO_FD | EF_MODIFIES_STATE, 4, {{"res", PT_ERRNO, PF_DEC}, {"fd", PT_FD, PF_DEC}, {"msgvec", PT_MMSGHDR_ARRAY, PF_NA}, {"vlen", PT_UINT32, PF_DEC} } },
Where PT_MMSGHDR_ARRAY will be a special type that will hold an array of the data extracted for each message sent, basically:
- socket tuple
- data extracted from the IO vectors
- size of the data extracted
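To make the idea concrete, one hypothetical layout for a single array entry could look like the sketch below. The struct name, field names, and encoding are purely illustrative and are not an actual libs definition.

```c
/* Hypothetical layout of one PT_MMSGHDR_ARRAY entry; names and encoding are
 * illustrative only, not the actual libs definition. */
#include <stdint.h>

struct mmsg_entry {
	uint32_t size;      /* number of bytes sent for this message */
	uint32_t tuple_len; /* length of the encoded socket tuple that follows */
	uint32_t data_len;  /* snaplen-capped length of the captured payload */
	uint8_t payload[];  /* tuple_len bytes of tuple + data_len bytes of data */
};
```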
Now the problem I have is that neither sinsp nor scap is really designed to turn a single kernel event into multiple userspace events...
So, I could either implement quite a bit of custom logic in userspace for this syscall specifically, do a bunch of nasty changes to the legacy probe and kernel module for them to support sending multiple events, do just the modern probe implementation, or... yeah, I'm out of ideas. @FedeDP, @Andreagit97, do you guys have any thoughts on how I could move this change forward?
Hey Mauro, my 2c:
do a bunch of nasty changes to the legacy probe and kernel module for them to support sending multiple events, do just the modern probe implementation
These seem the most reasonable options; let's go through each of them:
do a bunch of nasty changes to the legacy probe and kernel module for them to support sending multiple events
I would be ok with this; if we can move stuff away from using the (bpf_)val_to_ring helpers, it is a win. Indeed, for the old bpf probe we already stopped pushing many data types (the most basic ones, like integer types) through bpf_val_to_ring, since it causes many headaches and verifier issues.
What I mean is, if you are blocked by the fact that val_to_ring is a bit complex, we can work around that by avoiding it entirely.
If that's not the problem, would you mind sharing it a bit more in depth?
do just the modern probe implementation
While we always strive for feature parity between drivers, I think that if making it work on the kmod and old bpf probe gets too hard, we might consider making it a modern-ebpf-only feature. Again, this should be a last resort if everything else fails miserably :laughing:
I'd completely avoid the userspace magic to demultiplex a single event into multiple if possible.
Yep, I agree with @FedeDP: the ideal solution would be to try to support multiple events in the kernel in all 3 drivers. I also agree with the idea of just supporting it in the modern probe and in the KMOD if we see that it is really complex in the legacy ebpf; in the end, at a certain point we should put it in maintenance mode.
Thanks to both of you; I'll do a best-effort implementation on all drivers and open a PR so we can move the discussion to something with more context.
I also agree with the idea of just supporting it in the modern probe and in the KMOD if we see that it is really complex in the legacy ebpf
I can tell you right now that the implementation on the legacy probe will be incomplete due to verifier limitations, but more on that once I open the PR.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
/milestone 0.20.0
Like the linked PR.
To recap the ongoing discussion in the linked PR (thanks @Molter73 for the big effort!):
- old bpf driver only supports sending the first message for both recvmmsg and sendmmsg (technological reasons)
- modern_ebpf driver supports sending up to the first 8 messages for both recvmmsg and sendmmsg (again, technological reasons, see https://github.com/falcosecurity/libs/pull/2027#issuecomment-2568997393 and https://github.com/falcosecurity/libs/pull/2231/commits/600fefb36d2b0d70007bf6665103e06356d317af)
- kmod driver supports sending all messages
EDIT: after https://github.com/falcosecurity/libs/pull/2255, modern_ebpf driver supports:
- either up to 1024 messages through bpf_loop(), where available
- otherwise, it just sends the first one, like the old eBPF probe
Unfortunately, capping at 8 messages as proposed before was not enough to keep the verifier from getting angry in some cases.
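For context, the sketch below shows the general bpf_loop() pattern being referred to. It is not the actual probe code: the program section, the context struct, and the callback name are made up for illustration, and the real fillers do considerably more work per message.

```c
/* Rough sketch of bounding per-message work with bpf_loop(); not the actual
 * modern probe code. `msg_ctx` and `handle_one_msg` are made-up names. */
#include <linux/bpf.h>
#include <linux/types.h>
#include <bpf/bpf_helpers.h>

#define MAX_SENDMMSG_MSGS 1024

struct msg_ctx {
	__u64 msgvec; /* userspace pointer to struct mmsghdr[] */
	__u32 vlen;   /* number of messages returned by the syscall */
};

static long handle_one_msg(__u32 idx, void *data)
{
	struct msg_ctx *ctx = data;

	if (idx >= ctx->vlen)
		return 1; /* returning non-zero stops the loop */

	/* A real filler would bpf_probe_read_user() msgvec[idx] here and
	 * push one event (tuple, size, payload) to the ring buffer. */
	return 0;
}

SEC("raw_tp/sys_exit")
int sendmmsg_exit(struct bpf_raw_tracepoint_args *args)
{
	struct msg_ctx ctx = { .msgvec = 0, .vlen = 0 };

	/* bpf_loop() requires kernel >= 5.17; otherwise fall back to pushing
	 * only the first message, as the old eBPF probe does. */
	bpf_loop(MAX_SENDMMSG_MSGS, handle_one_msg, &ctx, 0);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```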