openwec PrincsFilter to ClientFilter reimplementation

Client filter

Filtering modes:

Only: the subscription will only be shown to the listed clients
Except: the subscription will be shown to everyone except the listed clients
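The two modes can be sketched as a set-membership test. This is a minimal illustration, not the PR's code: `FilterMode` and `is_visible` are hypothetical names, and matching is exact and case-sensitive only.

```rust
use std::collections::HashSet;

/// Filtering mode, named after the two modes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterMode {
    Only,
    Except,
}

/// Returns true when the subscription should be shown to `client`.
/// Hypothetical helper: exact, case-sensitive comparison only.
pub fn is_visible(mode: FilterMode, targets: &HashSet<String>, client: &str) -> bool {
    let listed = targets.contains(client);
    match mode {
        // Only: shown to listed clients and nobody else.
        FilterMode::Only => listed,
        // Except: shown to everyone but the listed clients.
        FilterMode::Except => !listed,
    }
}
```

With a single target, `Only` shows the subscription to that client alone, while `Except` hides it from that client and shows it to everyone else.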

Filtering types:

KerberosPrinc: the filter will be evaluated on the Kerberos principal
TLSCertSubject: the filter will be evaluated on the TLS certificate's subject field
MachineID: the filter will be evaluated on the name of the computer

The default is either KerberosPrinc or TLSCertSubject, depending on how server authentication is configured.

Warning: MachineID is not cryptographically authenticated information; it can be spoofed.
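A sketch of how the filter type could select the value being compared, and how the default could follow the authentication method. The variant names come from the list above; `ClientInfo`, `ServerAuth`, `filter_input` and `default_filter_type` are assumptions made for illustration and are not taken from the PR.

```rust
/// Filtering types listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterType {
    KerberosPrinc,
    TLSCertSubject,
    MachineID,
}

/// Hypothetical: how server authentication is configured.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServerAuth {
    Kerberos,
    Tls,
}

/// Hypothetical view of what the server knows about a connected client.
pub struct ClientInfo<'a> {
    /// Set when the client authenticated with Kerberos.
    pub kerberos_principal: Option<&'a str>,
    /// Set when the client authenticated with a TLS certificate.
    pub tls_cert_subject: Option<&'a str>,
    /// Self-reported by the client: NOT authenticated, can be spoofed.
    pub machine_id: Option<&'a str>,
}

/// Default type: follows the configured authentication method.
pub fn default_filter_type(auth: ServerAuth) -> FilterType {
    match auth {
        ServerAuth::Kerberos => FilterType::KerberosPrinc,
        ServerAuth::Tls => FilterType::TLSCertSubject,
    }
}

/// Picks the value a filter of the given type is evaluated on.
/// Returns None when the client does not provide that value.
pub fn filter_input<'a>(kind: FilterType, client: &ClientInfo<'a>) -> Option<&'a str> {
    match kind {
        FilterType::KerberosPrinc => client.kerberos_principal,
        FilterType::TLSCertSubject => client.tls_cert_subject,
        FilterType::MachineID => client.machine_id,
    }
}
```

Keeping the unauthenticated value in its own field makes it harder to confuse with the authenticated identifiers.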

Filtering flags:

GlobPattern: Glob patterns like * and ? can be used in targets
CaseInsensitive: Filter matching will be case-insensitive

Flags are composable using the | operator. The comparison is case-sensitive by default.
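Composing flags with `|` could look roughly like this. The sketch hand-rolls a small flag set with the standard library only; the actual PR may rely on a crate instead, and `FilterFlags`, its constants and `literal_eq` are illustrative names.

```rust
use std::ops::BitOr;

/// Hand-rolled flag set; a sketch, not the PR's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct FilterFlags(u8);

impl FilterFlags {
    pub const NONE: FilterFlags = FilterFlags(0);
    pub const GLOB_PATTERN: FilterFlags = FilterFlags(1 << 0);
    pub const CASE_INSENSITIVE: FilterFlags = FilterFlags(1 << 1);

    /// True when every bit of `other` is set in `self`.
    pub fn contains(self, other: FilterFlags) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for FilterFlags {
    type Output = FilterFlags;

    fn bitor(self, rhs: FilterFlags) -> FilterFlags {
        FilterFlags(self.0 | rhs.0)
    }
}

/// Literal comparison that honours only the CaseInsensitive flag.
/// With no flag set, the comparison stays case-sensitive (the default).
pub fn literal_eq(flags: FilterFlags, target: &str, value: &str) -> bool {
    if flags.contains(FilterFlags::CASE_INSENSITIVE) {
        target.to_lowercase() == value.to_lowercase()
    } else {
        target == value
    }
}
```

`FilterFlags::default()` has no bit set, which matches the case-sensitive, literal default described above.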

Backward compatibility:

case-sensitive behavior remains default
the deprecated CLI tool operates in compat mode: it will continue to work with old filters, but it can cause confusion if someone uses both subscription config files (with new filters) and the CLI tool to modify filters.
should we implement the V3 Import/Export API?

Oct 25 '24 22:10 MrAnno

Hi! Thanks, again, for this cool PR!

Let's start this (huge) answer with some history! At first, OpenWEC only supported Kerberos authentication. The Kerberos principal filter was implemented to mimic the subscription ACL of MS WEC, even if explicitly listing principals is quite tedious. I had some plans to parse the Kerberos PAC at some point to enable filtering on AD groups or claims, but I never found the time (and the motivation) to do it. I later added the custom "uri" system to manage which subscriptions are available to which machines, depending on the configuration of the machines. When TLS authentication was implemented, we extended the "Kerberos principal filter" to support TLS subjects, but we did not change the name out of laziness (sorry).

In practice, my organization used custom "uri" to tag events based on the location of the machines, but we stopped doing that at some point. We never used the principal/subject filtering feature. I assume you have a specific use case in mind. Could you share it?

`cert_subjects` alias

I do agree that the name princ has been very unfriendly since TLS support was introduced. I wanted to change it to machine to be less "Kerberos" specific, but I never took the time to do it. Your solution of aliasing the field in configuration files looks good for now (but see my thoughts at the end).

glob patterns

I agree that using glob patterns may be useful in some cases, but I'm not a big fan of the has_wildcard function to distinguish between "normal" strings and "glob" strings. I see two ways to handle this:

The user must say, for each string or for all "strings" at once, whether it is a glob pattern or a normal string. This puts the burden on the user, but he is the only one who knows what he really wants.
We consider all strings to be glob patterns. It might have a slight impact on performance, but I think it should be negligible. However, there may be issues where a user expects the filter to explicitly match a string that OpenWEC interprets as a glob pattern, leading to unexpected behaviors.
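The ambiguity described in the second option can be made concrete with a small matcher. This is a minimal sketch written for illustration (it is not the matcher used in the PR): `*` matches any run of characters, `?` matches exactly one, and there is no escaping. The example subject strings are made up.

```rust
/// Minimal glob matcher: `*` matches any run of characters (including none),
/// `?` matches exactly one character. No escaping, no character classes.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the last `*` seen and the text index it currently covers up to.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star; first try matching it against nothing.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // The text is consumed; only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

A target such as `*.example.com` shows the problem: compared literally it matches only a client whose identifier is exactly that string, but as a glob it also matches `host1.example.com`. Only the user knows which of the two was intended, which is the argument for an explicit flag.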

`MachineID` filter

There is a huge difference between authentication level filters and "machine controlled fields" filters. I decided to implement only authentication level filters because they can be trusted. On the other hand, machine controlled fields (such as "MachineID") can be spoofed very easily (see https://github.com/cea-sec/openwec/blob/main/doc/issues.md#hunting-rogue-windows-event-forwarder). There is no mechanism to ensure that MachineID or Hostname matches the Kerberos principal or the TLS subject name.

Do you have a specific use case where you can't filter by subject/principal/uri?

Filter rework

Before today, I considered subscription filters to be a legacy feature that still works. However, if they are really being used and need to be improved or changed, then I think we should rebuild them from the ground up instead of adding patches here and there.

Currently, each subscription can only have a single filter which consists of an "operation" (None/Only/Except) and a list of values wrongly named "princs":

```rust
pub enum PrincsFilterOperation {
    Only,
    Except,
}
pub struct PrincsFilter {
    operation: Option<PrincsFilterOperation>,
    princs: HashSet<String>,
}
```

We could:

rename "princ" to "values" or "trustees".
add an enum field "type" which could be equal to "kerberos principal", "tls subject", "machine id", ...
maybe add another field named "flags" that would specify whether the comparison is case sensitive, whether the values are patterns, ..., maybe implemented using bitflags?
also, I can't remember why I decided that SubscriptionData.princs_filter would be a PrincsFilter (and using PrincsFilter.operation = None as a noop) instead of an Option<PrincsFilter>, so maybe we could change that too.

That would lead to a PrincsFilter struct that looks like this:

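```rust
pub struct PrincsFilter {
    operation: PrincsFilterOperation,
    // `type` is a reserved keyword in Rust, hence the raw identifier
    // (a different field name would work as well).
    r#type: FilterType,
    flags: FilterFlags,
    trustees: HashSet<String>,
}
```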

To go further, we could make it work like a Windows "ACL", where each "ACE" would have an operation type (allow, deny), a trustee type (kerberos principal, tls subject, MachineID, ...), some flags, and a trustees list. Then, SubscriptionData.PrincsFilter would be a (possibly empty) list of filters.
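The ACL-like variant could be sketched as an ordered list of entries. The thread does not define evaluation semantics, so everything below is an assumption made for illustration: entries are checked in order, the first entry that lists the client decides, and the caller supplies the result when no entry matches. `Ace`, `AceOperation` and `evaluate` are hypothetical names, and the trustee type and flags are left out for brevity.

```rust
/// Hypothetical operation of one access control entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AceOperation {
    Allow,
    Deny,
}

/// Hypothetical entry: trustee type and flags are omitted in this sketch.
pub struct Ace {
    pub operation: AceOperation,
    pub trustees: Vec<String>,
}

/// ASSUMPTION: first matching entry wins; `default_allow` is returned when
/// no entry lists the client (including when the list is empty).
pub fn evaluate(acl: &[Ace], client: &str, default_allow: bool) -> bool {
    for ace in acl {
        if ace.trustees.iter().any(|t| t.as_str() == client) {
            return ace.operation == AceOperation::Allow;
        }
    }
    default_allow
}
```

Under these assumptions an empty list with `default_allow = true` behaves like a subscription with no filter at all, which is the "possibly empty" case mentioned above.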

What do you think?

Oct 26 '24 11:10 vruello

Thank you for the detailed and thorough explanation. :)

I also had a feeling that something bigger needs to be done around the current filtering implementation, so I really like your idea. The case-insensitivity option is also something that I find very useful; I would even consider changing the default (but only with the new config naming, as I don't want to break anything when one uses the old option names, of course).

In our use cases, machine filtering is essential, as events are coming from very different domains, locations, etc. towards the same collector, and even with wildcards, it is sometimes non-trivial to achieve the desired categorization when using TLS.

The URI trick sounds good, but the definition of "what we collect and from where" is better stored on the server side: we would like to keep everything unified and stupid on the client side.

So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework". (Sorry for the typo, I usually wrap up conversations terribly, so I asked for a little help :))

Oct 26 '24 11:10 MrAnno

About the MachineID part:

I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.

Oct 26 '24 12:10 MrAnno

> I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.

Ok. I'm fine with adding such a filter as long as the documentation clearly states that the value can be manipulated by an evil machine.

> So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework".

That's nice! I will be happy to help you if you need it :smile: Do you plan to implement the "single filter" approach or the last one with multiple filters?

Oct 26 '24 14:10 vruello

I would go with the single filter list and try to implement it in a way that wouldn't cause too many problems if one wanted to extend the functionality to the ACL-like idea you mentioned.

The single filter list with flags (case-sensitivity, pattern) and a type field covers all my use cases.

Oct 29 '24 14:10 MrAnno

Sorry for the delay, I'll update the PR soon.

Dec 03 '24 10:12 MrAnno

@vruello I updated the PR description.

I'm still learning Rust, and it seems I have made a few relatively bad decisions, so the implementation does not look as clean as I wanted it to be. Using static or dynamic dispatch through a ClientFilter trait with two implementations would have ended up being cleaner.
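For readers unfamiliar with the alternative mentioned here, a trait-based design could look roughly like the sketch below. The thread does not say which two implementations were meant, so the split into an exact filter and a case-insensitive filter is an assumption, as are all names other than `ClientFilter`.

```rust
use std::collections::HashSet;

/// Common interface; the method name is an assumption.
pub trait ClientFilter {
    fn matches(&self, client: &str) -> bool;
}

/// Exact, case-sensitive matching.
pub struct ExactFilter {
    pub targets: HashSet<String>,
}

impl ClientFilter for ExactFilter {
    fn matches(&self, client: &str) -> bool {
        self.targets.contains(client)
    }
}

/// Case-insensitive matching.
pub struct CaseInsensitiveFilter {
    /// Targets are stored lowercased so lookups stay a set membership test.
    targets: HashSet<String>,
}

impl CaseInsensitiveFilter {
    pub fn new<I: IntoIterator<Item = String>>(targets: I) -> Self {
        Self {
            targets: targets.into_iter().map(|t| t.to_lowercase()).collect(),
        }
    }
}

impl ClientFilter for CaseInsensitiveFilter {
    fn matches(&self, client: &str) -> bool {
        self.targets.contains(&client.to_lowercase())
    }
}

/// Static dispatch: monomorphised for each concrete filter type.
pub fn matches_static<F: ClientFilter>(filter: &F, client: &str) -> bool {
    filter.matches(client)
}

/// Dynamic dispatch: the concrete type is resolved at runtime.
pub fn matches_dyn(filter: &dyn ClientFilter, client: &str) -> bool {
    filter.matches(client)
}
```

The enum-based approach keeps every variant in one type and one `match`, while the trait lets each matching strategy live in its own type; which reads better depends on how many strategies exist and how often they change.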

Dec 13 '24 11:12 MrAnno

Hi! Thanks for all your work on this feature! Do you think it is ready for review?

Dec 13 '24 14:12 vruello

My pleasure. :)

> Do you think it is ready for review?

Yes, I think it is ready. If you find this enum-based implementation terrible, please let me know and I will try the static or dynamic dispatch alternatives.

Dec 13 '24 15:12 MrAnno

Cool! I will check it out as soon as possible :smile:

Dec 13 '24 15:12 vruello

Do you have an opinion on this? (we need to decide this before (re-)working on the implementation)

I think it would be unnecessary (at least in our use cases) to allow per-target types and flags. Specifying this per-target in the configuration would add a lot of boilerplate, so we would need to support a "default" as well to avoid repetition. It would be too much in my opinion.

My reasoning is something like this: Filtering should be uniform and easy to understand. Enabling/disabling wildcards for specific entries may sound reasonable, but playing with case-sensitivity and where the filter input comes from (MachineID vs TLS/Kerberos) seems inconsistent and hard to oversee. If one wanted something like this, I would say a separate subscription would be reasonable (with the side note that I think this won't happen in 90% of the use cases, because most people will just decide on a unified filtering mechanism).

Dec 20 '24 13:12 MrAnno

Your reasoning is quite convincing. We are probably in one of those cases where perfect is the enemy of good, and supporting limited filtering capabilities is probably the best choice for now. We'll always be able to add a more complete (and complex) filtering syntax later, if the need arises.

Dec 20 '24 14:12 vruello

@vruello Sorry for the huge delay. I think I've resolved all review notes.

Mar 19 '25 15:03 MrAnno

@vruello If you have some free time, could you take a look, please? :)

Apr 19 '25 23:04 MrAnno

We'll integrate this PR into our fork in the upcoming week and start using/testing it.

Jun 13 '25 13:06 MrAnno

@vruello Do you think there's a chance this could be merged upstream?

Oct 07 '25 18:10 MrAnno

Hi! I squashed your commits, rebased on upstream and made some changes. The most significant change is replacing the filtering types KerberosPrinc and TLSCertSubject with Client, meaning that the filter will be applied to the client identifier, which depends on the authentication method used.

I need some time to review it one last time but it should be ready for merging soon.

Oct 12 '25 17:10 vruello
