libs
libs copied to clipboard
wip: fix: PT_FD, PT_ERRNO, PT_PID, PT_FDLIST with int32
What type of PR is this?
/kind cleanup
Any specific area of the project related to this PR?
Uncomment one (or more)
/area <>lines:
/area API-version
/area build
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap-engine-bpf
/area libscap-engine-gvisor
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap-engine-nodriver
/area libscap-engine-noop
/area libscap-engine-source-plugin
/area libscap-engine-udig
/area libscap
/area libpman
/area libsinsp
/area tests
/area proposals
Does this PR require a change in the driver versions?
This will probably need a maj schema bump.
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
cleanup: send fd, errno, pid and fdlist as int32 from drivers to userspace
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: FedeDP
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [FedeDP]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Missing:
- tests
- gvisor stuff (ask @LucaGuerra )
This should fix first 2 points of #515 !
/retest
Another possible solution, discussed with @Andreagit97 , is to add a new event flag, like EF_USE_SMALL_SIZE (ok, better naming perhaps ahha).
When an event is flagged with it, libs should expect that driver pushes a 32b integers for errno, pid,... but stores them in a 64bit (to retain backward compatibility with current behavior); moreover, when reading an old scap file, the flag won't be present, so the old behavior would still be enforced and everything should work fine.
I will rebase this one (or better say rewrite from scratch basically :rofl: ) once #596 , #649 and #582 are (hopefully)merged, to avoid huge conflicts.
I took another look at this, I have some remarks that could be useful when you refactor/rewrite/rebase this PR:
I'm fine with FDs but keep in mind that if we want to create versions of events with a 32-bit PID/TID (for clone, fork, execve for example) we'll have to still maintain 64 bit versions since we can now rely on them (we do in gVisor). So, you can't deprecate or mark as "old" or "backwards compatibility" the 64 bit event but they'll be two different current events which could make the whole thing a bit more complicated.
Since the events that use PIDs are not a lot and not so frequent, except maybe for switch, I would advise to change them only if there is a real big impact on performance. I understand that there is such improvement with FDs but I'm not that convinced about PIDs.
/milestone 0.10.0
@LucaGuerra i think your idea might be ok, starting with errno and fd (+ fdlist) is fine, they are the most impactful ones for sure.
/milestone 0.11.0
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
I keep this open for reference, but most probably the patch will need to be entirely rewritten ;)
/milestone 0.12.0
@FedeDP it turns out for loginuid the casting to s32 is causing trouble. Real-world production systems do have audit uids larger than that. Dilemma arises from official definitions being somehow both u32 or s32 and folks typically like -1 for invalid values.
Therefore wanted to ask you if we could revive this PR and create the new events throughout and then for the new execve* events I can push onto this branch with changes to support loginuid changes, see staging commit https://github.com/incertum/libs/commit/b171a2486bb0093ee6b2e41d609322cf73307171
we can go with u32 and re-convert -1 in userspace or we just throughout go with s64, either way fine by me as long as it fixes wrongly casted uids. We definitely need to keep -1 ...
FWIW, in the kernel uids are unsigned 32 bit (on x86-64 at least): https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/posix_types.h#L49 so using s64 is IMHO wrong precisely because it introduces a difference between (2 << 32) - 1 and -1
FWIW, in the kernel uids are unsigned 32 bit (on x86-64 at least): https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/posix_types.h#L49 so using s64 is IMHO wrong precisely because it introduces a difference between
(2 << 32) - 1and-1
oh thanks @gnosek 🙏 meaning we can only go the route u32, how would you still ensure we expose -1 in userspace as we currently do?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
maybe we could also close this, not sure we will tackle this in the short term
I will leave this open just to remind me that sooner or later this should be tackled :D /remove-lifecycle stale
I will leave this open just to remind me that sooner or later this should be tackled :D
@FedeDP more concrete plans? 🙃 ty!
In my opinion we can close this one :thinking:
Yep this needs to be re-done from scratch. /close
@FedeDP: Closed this PR.
In response to this:
Yep this needs to be re-done from scratch. /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.