tracee
tracee copied to clipboard
tracee event context has type overflows
Description
While aware of https://github.com/aquasecurity/tracee/issues/2870, we still need to support/fix current tracee event context, as it has data types overflows that are likely causing context to be lost (all uint32 values, from the eBPF context, might be overflowed when decoded into Event struct, such as the PIDNS value, declared as int, which defaults to very high numbers in my env).
The new
event.prototakes care of correct types (usinggoogle.protobuf.UInt32Valuein the right places, kudos to @josedonizetti).
More details
While playing with some marshalling and unmarshalling ideas (for other output printers, for example), I got the following error after a event -> json -> parquet conversion:
{
"level": "error",
"ts": 1700017058.446953,
"msg": "Error encoding event to parquet",
"error": "integer overflow on token 4026531841",
"event": {
"timestamp": 1700017058317424171,
"threadStartTime": 1699918738371868249,
"processorId": 7,
"processId": 463085,
"cgroupId": 4794,
"threadId": 463085,
"parentProcessId": 462937,
"hostProcessId": 463085,
"hostThreadId": 463085,
"hostParentProcessId": 462937,
"userId": 1000,
"mountNamespace": 4026531841,
"pidNamespace": 4026531836,
"processName": "code",
"executable": {
"path": ""
},
"hostName": "rugged",
"containerId": "",
"container": {},
"kubernetes": {},
"eventId": "257",
"eventName": "openat",
"matchedPolicies": [
""
],
"argsNum": 4,
"returnValue": 45,
"syscall": "openat",
"stackAddresses": null,
"contextFlags": {
"containerStarted": false,
"isCompat": false
},
"threadEntityId": 1656711223,
"processEntityId": 1656711223,
"parentEntityId": 723639701,
"args": [
{
"name": "dirfd",
"type": "int",
"value": -100
},
{
"name": "pathname",
"type": "const char*",
"value": "/proc/463548/cmdline"
},
{
"name": "flags",
"type": "string",
"value": ""
},
{
"name": "mode",
"type": "mode_t",
"value": 0
}
]
}
}
and:
$ sudo lsns | grep 4026531836
4026531836 pid 316 1 root /sbin/init
and
11101111111111111111111111111100
isn't quite an unsigned 32 bit overflow but it is a signed 32 bit overflow.
Meaning that a bunch of Event fields that are int should actually be uint32:
type Context struct {
Ts uint64
StartTime uint64
CgroupID uint64
Pid uint32
Tid uint32
Ppid uint32
HostPid uint32
HostTid uint32
HostPpid uint32
Uid uint32
MntID uint32
PidID uint32
Comm [16]byte
UtsName [16]byte
Flags uint32
LeaderStartTime uint64
ParentStartTime uint64
EventID events.ID // int32
Syscall int32
MatchedPolicies uint64
Retval int64
StackID uint32
ProcessorId uint16
_ [2]byte // padding
}
type Event struct {
Timestamp int `json:"timestamp"`
ThreadStartTime int `json:"threadStartTime"`
ProcessorID int `json:"processorId"`
ProcessID int `json:"processId"`
CgroupID uint `json:"cgroupId"`
ThreadID int `json:"threadId"`
ParentProcessID int `json:"parentProcessId"`
HostProcessID int `json:"hostProcessId"`
HostThreadID int `json:"hostThreadId"`
HostParentProcessID int `json:"hostParentProcessId"`
UserID int `json:"userId"`
MountNS int `json:"mountNamespace"`
PIDNS int `json:"pidNamespace"`
ProcessName string `json:"processName"`
Executable File `json:"executable"`
HostName string `json:"hostName"`
ContainerID string `json:"containerId"`
Container Container `json:"container,omitempty"`
Kubernetes Kubernetes `json:"kubernetes,omitempty"`
EventID int `json:"eventId,string"`
EventName string `json:"eventName"`
PoliciesVersion uint16 `json:"-"`
MatchedPoliciesKernel uint64 `json:"-"`
MatchedPoliciesUser uint64 `json:"-"`
MatchedPolicies []string `json:"matchedPolicies,omitempty"`
ArgsNum int `json:"argsNum"`
ReturnValue int `json:"returnValue"`
Syscall string `json:"syscall"`
StackAddresses []uint64 `json:"stackAddresses"`
ContextFlags ContextFlags `json:"contextFlags"`
ThreadEntityId uint32 `json:"threadEntityId"` // thread task unique identifier (*)
ProcessEntityId uint32 `json:"processEntityId"` // process unique identifier (*)
ParentEntityId uint32 `json:"parentEntityId"` // parent process unique identifier (*)
Args []Argument `json:"args"` // args are ordered according their appearance in the original event
Metadata *Metadata `json:"metadata,omitempty"`
}
The event struct dates from 2020 and since the beginning has uin32 types declared as ints. We probably never payed hard attention because most of the values don't overflow often (and if they do we wouldn't catch in any test I believe). But the default PID namespaces are overflowing in my environments. It happened by accident because Parquet does not support unsigned integers on its data format.
Is this awaiting any related changes or can I tackle the Event int change to uint32?
Is this awaiting any related changes or can I tackle the
Eventintchange touint32?
Please do tackle this. Make sure to test it extensively (including internal E2E) because Im a bit afraid of this change. Let me know if you need anything. Thanks a lot!
After offline discussion we decided to wait the Event be replaced by https://github.com/aquasecurity/tracee/blob/main/api/v1beta1/event.proto#L11 which already tackled the type normalisation. @josedonizetti please refer (close) this in your upcoming work. Thanks.
@geyslan @rafaeldtinoco yeah, I think waiting is for the best now since it seems we will migrate the event structure for the 0.20.0 release, but if we see it is not the case in the next few weeks, this can should indeed be fixed.