PrincsFilter to ClientFilter reimplementation
Client filter
Filtering modes:
- Only: the subscription will only be shown to the listed clients
- Except: the subscription will be shown to everyone except the listed clients
Filtering types:
- KerberosPrinc: the filter will be evaluated on the Kerberos principal
- TLSCertSubject: the filter will be evaluated on the TLS certificate's subject field
- MachineID: the filtering is done based on the name of the computer
The default is either KerberosPrinc or TLSCertSubject, depending on how server authentication is configured.
Warning: MachineID is not cryptographically authenticated, so it can be spoofed.
Filtering flags:
- GlobPattern: Glob patterns like * and ? can be used in targets
- CaseInsensitive: Filter matching will be case-insensitive
Flags are composable using the | operator.
The comparison is case-sensitive by default.
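A minimal sketch of how the modes and composable flags described above could fit together. The names `FilterFlags`, `Mode` and `is_visible` are hypothetical and are not taken from the PR; the flag set is hand-rolled here so the example has no dependencies, and glob handling is left out.

```rust
use std::collections::HashSet;
use std::ops::BitOr;

/// Hypothetical flag set (illustration only, not the PR's actual type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FilterFlags(u8);

impl FilterFlags {
    pub const NONE: FilterFlags = FilterFlags(0);
    pub const GLOB_PATTERN: FilterFlags = FilterFlags(1 << 0);
    pub const CASE_INSENSITIVE: FilterFlags = FilterFlags(1 << 1);

    /// True when every bit of `other` is set in `self`.
    pub fn contains(self, other: FilterFlags) -> bool {
        self.0 & other.0 == other.0
    }
}

// Makes `A | B` produce a combined flag set.
impl BitOr for FilterFlags {
    type Output = FilterFlags;
    fn bitor(self, rhs: FilterFlags) -> FilterFlags {
        FilterFlags(self.0 | rhs.0)
    }
}

/// Hypothetical stand-in for the Only/Except filtering modes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Only,
    Except,
}

/// Decides whether a subscription is shown to `client`.
/// The comparison is case-sensitive unless CASE_INSENSITIVE is set.
pub fn is_visible(
    mode: Mode,
    flags: FilterFlags,
    targets: &HashSet<String>,
    client: &str,
) -> bool {
    let listed = if flags.contains(FilterFlags::CASE_INSENSITIVE) {
        let client = client.to_lowercase();
        targets.iter().any(|t| t.to_lowercase() == client)
    } else {
        targets.contains(client)
    };
    match mode {
        Mode::Only => listed,
        Mode::Except => !listed,
    }
}
```

With this shape, `FilterFlags::GLOB_PATTERN | FilterFlags::CASE_INSENSITIVE` yields a value for which `contains` returns true for both flags.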
Backward compatibility:
- case-sensitive behavior remains default
- the deprecated CLI tool operates in compat mode: it will continue to work with old filters, but it can cause confusion if someone uses both subscription config files (with new filters) and the CLI tool to modify filters.
- should we implement the V3 Import/Export API?
Hi! Thanks, again, for this cool PR!
Let's start this (huge) answer with some history! At first, OpenWEC only supported Kerberos authentication. The Kerberos principal filter was implemented to mimic the subscription ACL of MS WEC, even if explicitly listing principals is quite tedious. I had some plans to parse the Kerberos PAC at some point to enable filtering on AD groups or claims, but I never found the time (and the motivation) to do it. I later added the custom "uri" system to manage which subscriptions are available to which machines, depending on the configuration of the machines. When TLS authentication was implemented, we extended the "Kerberos principal filter" to support TLS subjects, but we did not change the name out of laziness (sorry).
In practice, my organization used custom "uri" to tag events based on the location of the machines, but we stopped doing that at some point. We never used the principal/subject filtering feature. I assume you have a specific use case in mind. Could you share it?
cert_subjects alias
I do agree that the name princ is very unfriendly since the support of TLS was introduced. I wanted to change it to machine to be less "Kerberos" specific, but I never took the time to do it. Your solution of aliasing the field in configuration files looks good for now (but see my thoughts at the end).
glob patterns
I agree that using glob patterns may be useful in some cases, but I'm not a big fan of the has_wildcard function to distinguish between "normal" strings and "glob" strings. I see two ways to handle this:
- The user must say, for each string or for all "strings" at once, whether it is a glob pattern or a normal string. This puts the burden on the user, but he is the only one who knows what he really wants.
- We consider all strings to be glob patterns. It might have a slight impact on performance, but I think it should be negligible. However, there may be issues where a user expects the filter to explicitly match a string that OpenWEC interprets as a glob pattern, leading to unexpected behaviors.
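To make the trade-off between the two options concrete, here is a minimal sketch. `glob_match` is a hand-written matcher that supports only `*` and `?`, and `target_matches` is a hypothetical helper; neither is taken from OpenWEC.

```rust
/// Minimal glob matcher: `*` matches any run of characters (including none),
/// `?` matches exactly one character. Illustration only.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Pattern index of the last `*` seen and the text index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try to match it against nothing.
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // The text is consumed; only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Option 1: the user states explicitly whether the target is a glob pattern.
/// Option 2 corresponds to always passing `is_glob = true`.
pub fn target_matches(target: &str, client: &str, is_glob: bool) -> bool {
    if is_glob {
        glob_match(target, client)
    } else {
        target == client
    }
}
```

Under the second option a target such as `web*` also matches `web01`, which is the kind of surprise mentioned above when the user meant the literal string.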
MachineID filter
There is a huge difference between authentication level filters and "machine controlled fields" filters. I decided to implement only authentication level filters because they can be trusted. On the other hand, machine controlled fields (such as "MachineID") can be spoofed very easily (see https://github.com/cea-sec/openwec/blob/main/doc/issues.md#hunting-rogue-windows-event-forwarder). There is no mechanism to ensure that MachineID or Hostname matches the Kerberos principal or the TLS subject name.
Do you have a specific use case where you can't filter by subject/principal/uri?
Filter rework
Before today, I considered subscription filters to be a legacy feature that still works. However, if they are really being used and need to be improved or changed, then I think we should rebuild them from the ground up instead of adding patches here and there.
Currently, each subscription can only have a single filter which consists of an "operation" (None/Only/Except) and a list of values wrongly named "princs":

```rust
use std::collections::HashSet;

pub enum PrincsFilterOperation {
    Only,
    Except,
}

pub struct PrincsFilter {
    operation: Option<PrincsFilterOperation>,
    princs: HashSet<String>,
}
```
We could:
- rename "princ" to "values" or "trustees".
- add an enum field "type" which could be equal to "kerberos principal", "tls subject", "machine id", ...
- maybe add another field named "flags" that would specify whether the comparison is case sensitive, whether the values are patterns, ..., maybe implemented using bitflags?
- also, I can't remember why I decided that SubscriptionData.princs_filter would be a PrincsFilter (and using PrincsFilter.operation = None as a noop) instead of an Option<PrincsFilter>, so maybe we could change that too.
That would lead to a PrincsFilter struct that looks like this:
```rust
pub struct PrincsFilter {
    operation: PrincsFilterOperation,
    // `type` is a reserved keyword in Rust, so the raw-identifier form is needed.
    r#type: FilterType,
    flags: FilterFlags,
    trustees: HashSet<String>,
}
```
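As a rough illustration of how such a struct might be evaluated, here is a sketch under several assumptions: `ClientInfo`, `Filter`, `allows` and `subscription_visible` are hypothetical names, the type field is called `kind` because `type` is a reserved word in Rust, flags are omitted, and "no filter" is modelled as `Option::None` instead of a no-op operation.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FilterOperation {
    Only,
    Except,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FilterType {
    KerberosPrincipal,
    TlsSubject,
    MachineId,
}

/// Hypothetical view of what the server knows about a connecting client.
pub struct ClientInfo<'a> {
    pub principal: Option<&'a str>,
    pub tls_subject: Option<&'a str>,
    /// Reported by the client itself, so it is not trustworthy.
    pub machine_id: Option<&'a str>,
}

pub struct Filter {
    pub operation: FilterOperation,
    pub kind: FilterType,
    pub trustees: HashSet<String>,
}

impl Filter {
    pub fn allows(&self, client: &ClientInfo) -> bool {
        // Pick the client attribute that this filter type is evaluated on.
        let value = match self.kind {
            FilterType::KerberosPrincipal => client.principal,
            FilterType::TlsSubject => client.tls_subject,
            FilterType::MachineId => client.machine_id,
        };
        // Assumption: a missing attribute counts as "not listed".
        let listed = value.map_or(false, |v| self.trustees.contains(v));
        match self.operation {
            FilterOperation::Only => listed,
            FilterOperation::Except => !listed,
        }
    }
}

/// `None` means "no filter": the subscription is visible to every client.
pub fn subscription_visible(filter: Option<&Filter>, client: &ClientInfo) -> bool {
    filter.map_or(true, |f| f.allows(client))
}
```

Using `Option<&Filter>` removes the need for a "None" operation, since the absence of a filter is expressed by the type itself.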
To go further, we could make it work like a Windows "ACL", where each "ACE" would have an operation type (allow, deny), a trustee type (kerberos principal, tls subject, MachineID, ...), some flags, and a trustees list. Then, SubscriptionData.princs_filter would be a (possibly empty) list of filters.
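A very rough sketch of the ACL-like idea follows. The evaluation rule is an assumption, since none is specified above: entries are checked in order, the first entry that lists the client decides, and a caller-supplied default applies when nothing matches. `Ace`, `AceOperation` and `evaluate` are hypothetical names, and the trustee type and flags are omitted for brevity.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AceOperation {
    Allow,
    Deny,
}

/// One access-control entry (hypothetical shape).
pub struct Ace {
    pub operation: AceOperation,
    pub trustees: HashSet<String>,
}

/// Assumed semantics: the first entry that lists `client` decides;
/// `default_allow` is used when no entry matches.
pub fn evaluate(acl: &[Ace], client: &str, default_allow: bool) -> bool {
    for ace in acl {
        if ace.trustees.contains(client) {
            return ace.operation == AceOperation::Allow;
        }
    }
    default_allow
}
```

With these semantics, an empty list leaves the decision entirely to the default, which would correspond to a subscription without any filter.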
What do you think?
Thank you for the detailed and thorough explanation. :)
I also had a feeling that something bigger needs to be done around the current filtering implementation, so I really like your idea. The case-insensitivity option is also something that I find very useful; I would even consider changing the default (but only with the new config naming, since I don't want to break anything for those who use the old option names, of course).
In our use cases, machine filtering is essential, as events are coming from very different domains, locations, etc. towards the same collector, and even with wildcards, it is sometimes non-trivial to achieve the desired categorization when using TLS.
The URI trick sounds good, but it would be better to manage "what we collect and from where" on the server side; we would like to keep everything unified and stupid on the client side.
So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework". (Sorry for the typo, I usually wrap up conversations terribly, so I asked for a little help :))
About the MachineID part:
I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.
> I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.
Ok. I'm fine with adding such a filter as long as the documentation clearly states that the value can be manipulated by an evil machine.
> So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework".
That's nice! I will be happy to help you if you need it :smile: Do you plan to implement the "single filter" approach or the last one with multiple filters?
I would go with the single filter list and try to implement it in a way that wouldn't cause too many problems if one wanted to extend the functionality to the ACL-like idea you mentioned.
The single filter list with flags (case-sensitivity, pattern) and a type field covers all my use cases.
Sorry for the delay, I'll update the PR soon.
@vruello I updated the PR description.
I'm still learning Rust, and it seems I have made a few relatively bad decisions, so the implementation does not look as clean as I wanted it to be.
Using static or dynamic dispatch through a ClientFilter trait and with 2 implementations would have ended up being cleaner.
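For comparison, a dynamic-dispatch variant might look like the sketch below. Which two implementations were meant is not stated above, so the split into an exact-match filter and a case-insensitive filter is only an assumption, and `ExactFilter`, `CaseInsensitiveFilter` and `build_filter` are hypothetical names.

```rust
use std::collections::HashSet;

pub trait ClientFilter {
    /// True when `client` is one of the filter's targets.
    fn matches(&self, client: &str) -> bool;
}

pub struct ExactFilter {
    targets: HashSet<String>,
}

pub struct CaseInsensitiveFilter {
    // Targets are lowercased once, when the filter is built.
    targets: HashSet<String>,
}

impl ClientFilter for ExactFilter {
    fn matches(&self, client: &str) -> bool {
        self.targets.contains(client)
    }
}

impl ClientFilter for CaseInsensitiveFilter {
    fn matches(&self, client: &str) -> bool {
        self.targets.contains(client.to_lowercase().as_str())
    }
}

/// Chooses the implementation at runtime and hides it behind a trait object.
pub fn build_filter(targets: &[&str], case_insensitive: bool) -> Box<dyn ClientFilter> {
    if case_insensitive {
        Box::new(CaseInsensitiveFilter {
            targets: targets.iter().map(|t| t.to_lowercase()).collect(),
        })
    } else {
        Box::new(ExactFilter {
            targets: targets.iter().map(|t| t.to_string()).collect(),
        })
    }
}
```

A static-dispatch version would take `impl ClientFilter` (or a generic parameter) instead of `Box<dyn ClientFilter>`, at the cost of fixing the concrete type at compile time.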
Hi! Thanks for all your work on this feature! Do you think it is ready for review?
My pleasure. :)
> Do you think it is ready for review?
Yes, I think it is ready.
If you find this enum-based implementation terrible, please let me know and I will try the static or dynamic dispatch alternatives.
Cool! I will check it out as soon as possible :smile:
Do you have an opinion on this? (we need to decide this before (re-)working on the implementation)
I think it would be unnecessary (at least in our use cases) to allow per-target types and flags. Specifying this per-target in the configuration would add a lot of boilerplate, so we would need to support a "default" as well to avoid repetition. It would be too much in my opinion.
My reasoning is something like this: Filtering should be uniform and easy to understand. Enabling/disabling wildcards for specific entries may sound reasonable, but playing with case-sensitivity and where the filter input comes from (MachineID vs TLS/Kerberos) seems inconsistent and hard to oversee. If one wanted something like this, I would say a separate subscription would be reasonable (with the side note that I think this won't happen in 90% of the use cases, because most people will just decide on a unified filtering mechanism).
Your reasoning is quite convincing. We are probably in one of those cases where perfect is the enemy of good, and supporting limited filtering capabilities is probably the best choice for now. We'll always be able to add a more complete (and complex) filtering syntax later, if the need arises.
@vruello Sorry for the huge delay. I think I've resolved all review notes.
@vruello If you have some free time, could you take a look, please? :)
We'll integrate this PR into our fork in the upcoming week and start using/testing it.
@vruello Do you think there's a chance this could be merged upstream?
Hi! I squashed your commits, rebased on upstream and made some changes. The most significant change is replacing the filtering types KerberosPrinc and TLSCertSubject with Client, meaning that the filter will be applied to the client identifier, which depends on the authentication method used.
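The idea of a single Client filter type can be sketched as follows. `AuthenticatedClient`, `identifier` and `is_listed` are hypothetical names used for illustration and are not OpenWEC's actual types.

```rust
/// Hypothetical representation of an authenticated client.
pub enum AuthenticatedClient {
    Kerberos { principal: String },
    Tls { subject: String },
}

impl AuthenticatedClient {
    /// The identifier a `Client` filter would be evaluated against:
    /// the Kerberos principal or the TLS subject, depending on how
    /// the client authenticated.
    pub fn identifier(&self) -> &str {
        match self {
            AuthenticatedClient::Kerberos { principal } => principal.as_str(),
            AuthenticatedClient::Tls { subject } => subject.as_str(),
        }
    }
}

/// Exact, case-sensitive membership test on the client identifier.
pub fn is_listed(targets: &[&str], client: &AuthenticatedClient) -> bool {
    targets.iter().any(|t| *t == client.identifier())
}
```

The filter code then no longer needs to know which authentication method is in use, which is what makes the separate KerberosPrinc and TLSCertSubject types redundant.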
I need some time to review it one last time but it should be ready for merging soon.