rspamd.com Please document a working regexp example

I read the regexp module page, but could not produce a working .conf file from it.

Specifically, I found this in the code:

reconf['MICROSOFT_SPAM'] = {
  -- https://technet.microsoft.com/en-us/library/dn205071(v=exchg.150).aspx
  re = 'X-Forefront-Antispam-Report=/SFV:SPM/H',
  score = 4.0,
  description = "Microsoft says the message is spam",
  group = 'upstream_spam_filters'
}

And wanted an expression like:

re = 'X-Forefront-Antispam-Report=/SFV:SPM/iH'

But the regexp page only describes regexp syntax and the internal functions, not how to use them in a rule.

Which internal function do we call to say "yup, definitely spam, drop this shit"? Why perform all those binary checks (internal functions) if the regexp itself is the check we need?

Please show an example (and document it) that can go in local.d/regexp.conf, ideally one that will immediately a) learn spam and reject, or b) drop or discard.

Today, with milter-regex, the syntax there is clear, e.g.:

discard
header /^X-Microsoft-Antispam$/i /.*BCL\:[1-9]*/i

discard
header /^X-Forefront-Antispam-Report$/i /.*SFV\:SPM.*/i

Nov 07 '19 18:11 systemcrash

@systemcrash I needed to compile a regexp rule recently, and also struggled to figure out the regexp module. Eventually I got something working. Below is the content of my local.d/regexp.conf file, hope it helps.

"RE_SEXTORTION" = {
	re = '/your/{words} && /password/{words} && /buy/{words} && /bitcoin/{words}';
	score = 15.0;
}
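
For the header case in the original question, the same layout should carry over. This is an untested sketch, assuming local.d/regexp.conf accepts the same `Header=/pattern/flags` string that the built-in MICROSOFT_SPAM rule uses, and that `description` is accepted here as it is in the Lua rule. The symbol name LOCAL_MS_FOREFRONT_SPAM and the score are made up for illustration.

```ucl
# local.d/regexp.conf
# Hypothetical symbol name; use any name that no stock rule already uses.
"LOCAL_MS_FOREFRONT_SPAM" = {
	# Header name, then the pattern. i = case-insensitive, H = header regexp.
	re = 'X-Forefront-Antispam-Report=/SFV:SPM/iH';
	# Assumed value: set it at or above your reject threshold
	# if a single hit should be enough to reject the message.
	score = 15.0;
	description = "Microsoft Forefront marked the message as spam";
}
```

The rule does not call anything to reject the message. rspamd chooses the action by comparing the total score of all matched symbols against the configured action thresholds, so a score above the reject threshold is what triggers the reject. If a symbol should map directly to an action regardless of score, the force_actions module looks like the place for that. Scanning a saved message with `rspamc sample.eml` (sample.eml being any test message) should list the symbol when the rule matches.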

Apr 22 '20 13:04 jmptbl

What field were you filtering on and what was the typical content?

The module isn't the easiest to use, I have to admit...

Apr 22 '20 13:04 systemcrash

It filters on the {words} type, which is a transformation on the message body documented as follows:

Unicode normalized (to NFKC) and lowercased words extracted from the text (excluding URLs), subject and From displayed name

The content was sextortion-type emails that I was given as examples. They were sneakily encoded with strange UTF-8 character sequences, so {words} and the regexp patterns I gave seemed good enough for the size and type of the user base in question.

Apr 22 '20 14:04 jmptbl