libs icon indicating copy to clipboard operation
libs copied to clipboard

wip: fix: PT_FD, PT_ERRNO, PT_PID, PT_FDLIST with int32

Open FedeDP opened this issue 2 years ago • 9 comments

What type of PR is this?

/kind cleanup

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-udig

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

This will probably need a maj schema bump.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

cleanup: send fd, errno, pid and fdlist as int32 from drivers to userspace

FedeDP avatar Aug 02 '22 08:08 FedeDP

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

poiana avatar Aug 02 '22 08:08 poiana

Missing:

  • tests
  • gvisor stuff (ask @LucaGuerra )

FedeDP avatar Aug 02 '22 08:08 FedeDP

This should fix first 2 points of #515 !

FedeDP avatar Aug 02 '22 08:08 FedeDP

/retest

Andreagit97 avatar Aug 09 '22 08:08 Andreagit97

Another possible solution, discussed with @Andreagit97 , is to add a new event flag, like EF_USE_SMALL_SIZE (ok, better naming perhaps ahha). When an event is flagged with it, libs should expect that driver pushes a 32b integers for errno, pid,... but stores them in a 64bit (to retain backward compatibility with current behavior); moreover, when reading an old scap file, the flag won't be present, so the old behavior would still be enforced and everything should work fine.

FedeDP avatar Sep 27 '22 09:09 FedeDP

I will rebase this one (or better say rewrite from scratch basically :rofl: ) once #596 , #649 and #582 are (hopefully)merged, to avoid huge conflicts.

FedeDP avatar Oct 04 '22 07:10 FedeDP

I took another look at this, I have some remarks that could be useful when you refactor/rewrite/rebase this PR:

I'm fine with FDs but keep in mind that if we want to create versions of events with a 32-bit PID/TID (for clone, fork, execve for example) we'll have to still maintain 64 bit versions since we can now rely on them (we do in gVisor). So, you can't deprecate or mark as "old" or "backwards compatibility" the 64 bit event but they'll be two different current events which could make the whole thing a bit more complicated.

Since the events that use PIDs are not a lot and not so frequent, except maybe for switch, I would advise to change them only if there is a real big impact on performance. I understand that there is such improvement with FDs but I'm not that convinced about PIDs.

LucaGuerra avatar Oct 04 '22 08:10 LucaGuerra

/milestone 0.10.0

FedeDP avatar Oct 05 '22 14:10 FedeDP

@LucaGuerra i think your idea might be ok, starting with errno and fd (+ fdlist) is fine, they are the most impactful ones for sure.

FedeDP avatar Oct 05 '22 15:10 FedeDP

/milestone 0.11.0

FedeDP avatar Dec 01 '22 10:12 FedeDP

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Mar 02 '23 15:03 poiana

/remove-lifecycle stale

I keep this open for reference, but most probably the patch will need to be entirely rewritten ;)

FedeDP avatar Mar 02 '23 16:03 FedeDP

/milestone 0.12.0

FedeDP avatar Apr 05 '23 07:04 FedeDP

@FedeDP it turns out for loginuid the casting to s32 is causing trouble. Real-world production systems do have audit uids larger than that. Dilemma arises from official definitions being somehow both u32 or s32 and folks typically like -1 for invalid values.

Therefore wanted to ask you if we could revive this PR and create the new events throughout and then for the new execve* events I can push onto this branch with changes to support loginuid changes, see staging commit https://github.com/incertum/libs/commit/b171a2486bb0093ee6b2e41d609322cf73307171

we can go with u32 and re-convert -1 in userspace or we just throughout go with s64, either way fine by me as long as it fixes wrongly casted uids. We definitely need to keep -1 ...

incertum avatar Jun 28 '23 19:06 incertum

FWIW, in the kernel uids are unsigned 32 bit (on x86-64 at least): https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/posix_types.h#L49 so using s64 is IMHO wrong precisely because it introduces a difference between (2 << 32) - 1 and -1

gnosek avatar Jun 28 '23 20:06 gnosek

FWIW, in the kernel uids are unsigned 32 bit (on x86-64 at least): https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/posix_types.h#L49 so using s64 is IMHO wrong precisely because it introduces a difference between (2 << 32) - 1 and -1

oh thanks @gnosek 🙏 meaning we can only go the route u32, how would you still ensure we expose -1 in userspace as we currently do?

incertum avatar Jun 28 '23 20:06 incertum

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Sep 26 '23 21:09 poiana

maybe we could also close this, not sure we will tackle this in the short term

Andreagit97 avatar Sep 27 '23 08:09 Andreagit97

I will leave this open just to remind me that sooner or later this should be tackled :D /remove-lifecycle stale

FedeDP avatar Sep 27 '23 09:09 FedeDP

I will leave this open just to remind me that sooner or later this should be tackled :D

@FedeDP more concrete plans? 🙃 ty!

incertum avatar Nov 13 '23 04:11 incertum

In my opinion we can close this one :thinking:

Andreagit97 avatar Nov 13 '23 07:11 Andreagit97

Yep this needs to be re-done from scratch. /close

FedeDP avatar Nov 13 '23 08:11 FedeDP

@FedeDP: Closed this PR.

In response to this:

Yep this needs to be re-done from scratch. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

poiana avatar Nov 13 '23 08:11 poiana