Please document a working regexp example
I checked the regexp module page, and could not make a working .conf file.
Specifically, I found this in the code:
reconf['MICROSOFT_SPAM'] = {
  -- https://technet.microsoft.com/en-us/library/dn205071(v=exchg.150).aspx
  re = 'X-Forefront-Antispam-Report=/SFV:SPM/H',
  score = 4.0,
  description = "Microsoft says the message is spam",
  group = 'upstream_spam_filters'
}
And wanted an expression like:
re = 'X-Forefront-Antispam-Report=/SFV:SPM/iH'
But the regexp page only describes the regexp syntax and the internal functions, not how to use them in a rule.
Which internal function do we call to say "yup, definitely spam, drop this shit"? Why perform all those binary checks (internal functions) if the regexp itself is the check we need?
Please show an example (and document it) that can go in local.d/regexp.conf. Ideally one that will immediately a) learn spam and reject, or b) drop or discard.
Today, with milter-regex, the syntax there is clear, e.g.:
discard
header /^X-Microsoft-Antispam$/i /.*BCL\:[1-9]*/i
discard
header /^X-Forefront-Antispam-Report$/i /.*SFV\:SPM.*/i
@systemcrash I needed to write a regexp rule recently, and also struggled to figure out the regexp module. Eventually I got something working. Below is the content of my local.d/regexp.conf file; I hope it helps.
"RE_SEXTORTION" = {
  re = '/your/{words} && /password/{words} && /buy/{words} && /bitcoin/{words}';
  score = 15.0;
}
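The same layout can presumably carry the header check from the original question. The following is a sketch only: the symbol name LOCAL_MS_FOREFRONT_SPAM, the score and the description text are made up for illustration, and the re string is the stock rule's expression with the i flag added, as asked for above. Whether a score of 15.0 leads to a reject depends on the reject threshold in your actions configuration, which is assumed here to be 15.

```ucl
# local.d/regexp.conf (sketch; symbol name, score and description are illustrative)
"LOCAL_MS_FOREFRONT_SPAM" = {
  # Header name, then '=', then /pattern/flags.
  # i = case-insensitive, H = header match type (as in the stock rule).
  re = 'X-Forefront-Antispam-Report=/SFV:SPM/iH';
  # Assumes the reject threshold is 15; lower this to only add weight.
  score = 15.0;
  description = "Microsoft Forefront marked the message as spam (SFV:SPM)";
}
```

Note that the rule only adds a symbol with a score; the action taken still follows from the total score of the message.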
What field were you filtering on and what was the typical content?
The module isn't the easiest to use, I have to admit...
It filters on the {words} type, which is a transformation on the message body documented as follows:
Unicode normalized (to NFKC) and lowercased words extracted from the text (excluding URLs), subject and From displayed name
The content was sextortion-type emails that I was given as examples. They were sneakily encoded with strange UTF-8 character sequences, so {words} and the regexp patterns I gave seemed good enough given the size and type of the user base in question.
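On the "reject or discard" part of the original question: a regexp rule by itself only adds a symbol and a score, and the action follows from the total score. If an immediate reject is wanted regardless of the other symbols, one option is the force_actions module. The sketch below assumes that module is available and enabled in your installation, reuses the RE_SEXTORTION symbol from above, and uses a made-up rule name and reject message.

```ucl
# local.d/force_actions.conf (sketch; rule name and message are illustrative)
rules {
  LOCAL_REJECT_SEXTORTION {
    # Force this action whenever the expression below is true.
    action = "reject";
    # Expression over symbol names; here just the regexp rule's symbol.
    expression = "RE_SEXTORTION";
    # Text returned to the sending server on reject.
    message = "Message rejected as spam";
  }
}
```

With a forced action in place, the score on the regexp rule no longer has to be tuned against the reject threshold, so it can be lowered to something that only reflects how much the match should weigh on its own.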