tracee
[REFAC] Add support for previous Types versions
Prerequisites
Select one OR another:
- [x] I have discussed the refactoring idea with one (or another) maintainer.
- [ ] I'll create a PR to implement this refactoring idea (assign to yourself).
- [ ] Someone else should implement this (describe it well).
Refactoring description
Currently, whenever the `trace.Event` struct changes, tracee-rules support for it breaks.
This is a problem because tracee-ebpf recordings can no longer be loaded by signatures shortly after they are made.
We need to create a protocol to add backward compatibility to the struct.
Proposition
We could add two fields to the `Protocol` struct: `Type` and `Version`.
Then, using the `Version` field, we can call the right decoder function.
Each new version would then implement a function that transforms the previous version's struct into the new one.
This way, each new version is compatible with all previous versions just by implementing the upgrade function from the previous version (because all older versions can already be upgraded to the previous version).
We can do this by adding an `Upgrade` method to the `Event` interface and calling it repeatedly until the event reaches the right version.
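A minimal sketch of the proposition above, not actual tracee code. The `Envelope` wrapper, the `EventV1`/`EventV2` structs, their fields, and the JSON encoding are all assumptions made up for illustration; only the `Type`/`Version` fields and the `Upgrade` method come from the proposal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wire wrapper carrying the proposed Type and
// Version fields next to the raw event payload.
type Envelope struct {
	Type    string          `json:"type"`
	Version int             `json:"version"`
	Payload json.RawMessage `json:"payload"`
}

// Event is a hypothetical interface: each version reports its own number
// and knows how to upgrade itself to the next version.
type Event interface {
	Version() int
	Upgrade() (Event, error)
}

// EventV1 and EventV2 are made-up stand-ins for two generations of trace.Event.
type EventV1 struct {
	ProcessName string `json:"processName"`
}

type EventV2 struct {
	ProcessName string `json:"processName"`
	KernelFlags uint32 `json:"kernelFlags"` // illustrative field added in v2
}

const latestVersion = 2

func (e EventV1) Version() int { return 1 }

// Upgrade maps v1 to v2; the new field gets its zero value.
func (e EventV1) Upgrade() (Event, error) {
	return EventV2{ProcessName: e.ProcessName}, nil
}

func (e EventV2) Version() int { return 2 }

func (e EventV2) Upgrade() (Event, error) {
	return nil, fmt.Errorf("version %d is already the latest", e.Version())
}

// decode picks the decoder that matches the envelope's Version field.
func decode(env Envelope) (Event, error) {
	switch env.Version {
	case 1:
		var e EventV1
		err := json.Unmarshal(env.Payload, &e)
		return e, err
	case 2:
		var e EventV2
		err := json.Unmarshal(env.Payload, &e)
		return e, err
	default:
		return nil, fmt.Errorf("unsupported version %d", env.Version)
	}
}

// upgradeToLatest calls Upgrade repeatedly until the event reaches latestVersion.
func upgradeToLatest(e Event) (EventV2, error) {
	for e.Version() < latestVersion {
		next, err := e.Upgrade()
		if err != nil {
			return EventV2{}, err
		}
		e = next
	}
	v2, ok := e.(EventV2)
	if !ok {
		return EventV2{}, fmt.Errorf("unexpected event type %T", e)
	}
	return v2, nil
}

func main() {
	old := Envelope{Type: "trace.Event", Version: 1, Payload: json.RawMessage(`{"processName":"bash"}`)}
	ev, err := decode(old)
	if err != nil {
		panic(err)
	}
	latest, err := upgradeToLatest(ev)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", latest)
}
```

With this shape, adding a v3 would only require a v3 decoder case and an `Upgrade` on `EventV2`; older recordings reach v3 through the existing v1 to v2 step.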
@AlonZivony could you provide an example of the breakage? When did it last happen, with which commit, etc.?
Also, the work started in the go-cel proof-of-concept, with a protobuf wrapper, began to strongly type the types evaluated by the rules, so that we would have no runtime issues (only startup ones, which are easier to fix).
Example at: https://github.com/aquasecurity/tracee/blob/main/pkg/rules/celsig/wrapper/event.pb.go and https://github.com/aquasecurity/tracee/blob/main/pkg/rules/celsig/library.go#L62.
I will soon send the error received and the other information.
We thought about maybe just sending the tracee commit used in each record (using something similar to the `init_namespaces` event, but for `tracee_metadata`).
This way the user can just get the commit and check it out, then compile tracee-rules and run their rules there.
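A rough sketch of how a loader could use such a record, assuming the recording is newline-delimited JSON whose first line is the metadata event. The `Metadata` struct, its field names, and the commit strings are hypothetical; nothing here reflects an existing tracee format.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Metadata is a hypothetical payload for a tracee_metadata record written
// once at the start of a recording.
type Metadata struct {
	EventName string `json:"eventName"`
	Commit    string `json:"commit"`
}

// checkRecording reads the first record of a recording and compares the
// commit it was produced with against the commit this binary was built from.
func checkRecording(recording, builtFrom string) error {
	sc := bufio.NewScanner(strings.NewReader(recording))
	if !sc.Scan() {
		return fmt.Errorf("empty recording")
	}
	var md Metadata
	if err := json.Unmarshal(sc.Bytes(), &md); err != nil {
		return fmt.Errorf("reading metadata record: %w", err)
	}
	if md.EventName != "tracee_metadata" {
		return fmt.Errorf("first record is %q, not tracee_metadata", md.EventName)
	}
	if md.Commit != builtFrom {
		return fmt.Errorf("recording was made at commit %s, this binary was built from %s", md.Commit, builtFrom)
	}
	return nil
}

func main() {
	// Placeholder commit identifiers, for illustration only.
	rec := `{"eventName":"tracee_metadata","commit":"abc1234"}` + "\n" +
		`{"eventName":"execve"}` + "\n"
	fmt.Println(checkRecording(rec, "abc1234"))
	fmt.Println(checkRecording(rec, "def5678"))
}
```

The mismatch error tells the user which commit to check out before rebuilding tracee-rules, which is the workflow described above.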
WDYT?
@NDStrahilevitz your input here pls. How did we plan to advertise the "protocol" version? I believe this is important for the golang signatures, working as plugins and using the types package only?
My understanding is that we planned to version the "protocol" to deal with Tracee event variations (iirc we said this at some point). I know that the "protocol" idea is stopped now because of other priorities as well.
- The solution should include, as Alon said, a `Type` and a `Version` field; `Type` should just be a constant `trace.Event` or something similar. In addition, we need to make tracee-rules receive `protocol.Event`s from the get-go; these fields might help implement that.
- We may want to implement version negotiation between tracee-rules and tracee-ebpf; this can be done as part of the gRPC effort.
- We should start versioning the types module, so we would have `types/v1`, `types/v2`, etc., versioning as we go as needed. As Alon said, in each of those modules we can add some `Upgrade` method. I don't think those need to be in some recursive implementation; I think a better approach would be to have either `UpgradeV2` or `DowngradeV1` style methods hardcoded per module version.
- Importantly, we should avoid incrementing this too often, and we can do that by finding what kinds of changes break tracee-rules.
- @AlonZivony, we need an example of what kind of change broke compatibility. Will even just adding a new field break compatibility? Or only new fields with new types (for example, your new `KernelFlags`)?
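To make two of the points above concrete, here is a small sketch of hardcoded per-version conversion methods and of a version negotiation step. The `V1Event`/`V2Event` structs stand in for what `types/v1` and `types/v2` might export, and `negotiate` is a hypothetical helper; none of this is existing tracee or gRPC API.

```go
package main

import "fmt"

// V1Event and V2Event are made-up stand-ins for types/v1.Event and types/v2.Event.
type V1Event struct {
	ProcessName string
}

type V2Event struct {
	ProcessName string
	KernelFlags uint32 // illustrative field that exists only in v2
}

// UpgradeV2 is hardcoded on the v1 type: no interface, no recursion.
func (e V1Event) UpgradeV2() V2Event {
	return V2Event{ProcessName: e.ProcessName}
}

// DowngradeV1 drops the fields that v1 does not know about.
func (e V2Event) DowngradeV1() V1Event {
	return V1Event{ProcessName: e.ProcessName}
}

// negotiate returns the highest version supported by both the producer
// (tracee-ebpf) and the consumer (tracee-rules), or false if there is none.
func negotiate(producer, consumer []int) (int, bool) {
	supported := make(map[int]bool, len(consumer))
	for _, v := range consumer {
		supported[v] = true
	}
	best, found := 0, false
	for _, v := range producer {
		if supported[v] && (!found || v > best) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	v, ok := negotiate([]int{1, 2, 3}, []int{1, 2})
	fmt.Println(v, ok)
	fmt.Printf("%+v\n", V1Event{ProcessName: "bash"}.UpgradeV2())
}
```

The trade-off against a generic `Upgrade` chain is that each conversion returns a concrete type checked at compile time, at the cost of writing one method per pair of versions that needs to be bridged.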
Side note: @rafaeldtinoco, if we want to support multiple types, we can't actually use the hardcoded protobuf we use in go-cel; we would need that protobuf for emitting from tracee-ebpf, but tracee-rules would need a protobuf for the `protocol.Event`, not the `trace.Event`. Unless we make the choice to give up genericness of event sources, in which case we should change the body type of `protocol.Event` from `interface{}` to `trace.Event`.
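The difference between the two options can be sketched as follows. `GenericEvent`, `TypedEvent`, and `TraceEvent` are simplified, hypothetical stand-ins for `protocol.Event` and `trace.Event`; the real structs have more fields.

```go
package main

import "fmt"

// TraceEvent is a stand-in for trace.Event.
type TraceEvent struct {
	EventName string
}

// GenericEvent keeps the body as interface{}, so any event source can be carried.
type GenericEvent struct {
	Payload interface{}
}

// TypedEvent gives up genericness: the body can only be a TraceEvent.
type TypedEvent struct {
	Payload TraceEvent
}

// nameFromGeneric needs a runtime type assertion and can fail at runtime.
func nameFromGeneric(e GenericEvent) (string, error) {
	te, ok := e.Payload.(TraceEvent)
	if !ok {
		return "", fmt.Errorf("unsupported payload type %T", e.Payload)
	}
	return te.EventName, nil
}

// nameFromTyped cannot fail: the compiler guarantees the payload type.
func nameFromTyped(e TypedEvent) string {
	return e.Payload.EventName
}

func main() {
	name, err := nameFromGeneric(GenericEvent{Payload: TraceEvent{EventName: "execve"}})
	fmt.Println(name, err)
	_, err = nameFromGeneric(GenericEvent{Payload: "not an event"})
	fmt.Println(err)
	fmt.Println(nameFromTyped(TypedEvent{Payload: TraceEvent{EventName: "execve"}}))
}
```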
That was the path we had in mind when Daniel and I were discussing this.
But the roadmap currently prioritizes golang signatures only. Also, I don't think we should drop go-cel or rego until we are fully certain, so we will have to find something that works for now (probably just extending/maintaining the wrapper) and decide on removing genericness later (and also make sure that any versioning allows rego/go-cel to keep working; for go-cel, at least what we currently have).
Can't go-cel work with the protobuf `Any` type? Currently, for our `protocol.Event`, that's what we should use in a protobuf. We would need to check that in go-cel. Then we won't have to give up genericness (relevant since CNDR currently relies on it for data storage).
Anyway, we need to know the severity of this problem long term (what causes breaking) to see if this (versioning) is worth putting in the current roadmap as a priority.
Also, there is another question here: is the break occurring in the signatures or in the engine? If the signatures themselves are breaking, they may need an internal versioning of their own.
> Can't go-cel work with the protobuf `Any` type? Currently, for our `protocol.Event`, that's what we should use in a protobuf. We would need to check that in go-cel. Then we won't have to give up genericness (relevant since CNDR currently relies on it for data storage).
I have all my discussions with Daniel recorded (4 to 6 hours), we can revisit them and his reasons for dropping genericness. His first approach was "string evaluation for all types" and we shifted away from that idea (but you can evaluate either using strings or specific types).
> Anyway, we need to know the severity of this problem long term (what causes breaking) to see if this (versioning) is worth putting in the current roadmap as a priority.
That might actually be true if we consider the golang signatures experience (as plugins?) a priority, because this problem (the breakage) will also affect anyone relying on that in the future.