intelmq smtp_batch: add feature for grouping and templating

Like discussed in #2586, here is the thing. I already did some manual testing regarding templating the subject string. For this configured

add extra.my_template_value to the events (I did this with the modify bot for now)
add that key (extra.my_template_value) to the new additional_grouping_keys parameter
use subject: "Prefix {{ extra_my_template_value }}"
activate the templating capabilites for the subject line with templating: {subject: true}

But probably some tests in the test suite will be needed for this (but I'd like to defer this one after the first feedback round)

Any comments?

Maybe issues to discuss:

how values get hashed generically with hash_arbitrary
passing the template_data along with each mail as a tuple
really a warning if jinja2 couldn't be imported? (the user might not want jinja2) -- also whether to add this optional requirement to the requirements.txt?

~Also I didn't get the testing environment setup correctly until now, so it would be nice if someone could trigger the pipelines yet already so I can check the outputs regarding code-style.~

Apr 29 '25 15:04 Lukas-Heindl

Feel free to have a look, but it will be some time (estimate: 1-2 weeks) until I can have a look at the failing pipelines. My way to go would be

looking into what makes them fail
- code style should pass obviously
- failing tests here might indicate a not fully backwards compatible change
feedback (of course any time, also before/while 1.)
Write additional tests covering the new features

Apr 29 '25 15:04 Lukas-Heindl

Finally the tests are working. (Sorry for the frequent pushes)

So now from my point of view only some tests for the new features are missing. I'd like to keep the old/current tests as "basic" config with the "basic" set of features and one new test (different config is needed for this anyhow) for the more advanced features (one run which includes the new grouping and the templating).

May 06 '25 12:05 Lukas-Heindl

The test should be easily fixed now. But I noticed another issue (for which we do not have a test yet: https://github.com/certtools/intelmq/blob/2f15e20422d350f6ac4cfa641d89712f7f306d7c/intelmq/bots/outputs/smtp_batch/output.py#L77-L78 When source.abuse_contact is multiple addesses it is important to use the address stored in the key and not the soruce.abuse_contact field.

The fix I thought about would be to simply add a number to the redis key serving as index which indicates which address of the source.abuse_contact field should be used in case there are multiple ones.

Any other ideas / comments on this approach?

May 08 '25 06:05 Lukas-Heindl

Ok I noted at least with the default harmonization.yml, source.abuse_contact shouldn't be a list. Still the old version also handled the list case. What do you think should we continue supporting that case? (and if yes what do you think about my current approach?)

Apart from that issue this is finished from my point of view (maybe cleanup the commit history even more). Any comments / changes required?

May 09 '25 13:05 Lukas-Heindl

(sorry I'm late again, trying to get to you in the next days)

May 14 '25 14:05 e3rd

Whoa, it took me so long. Thanks for having investigated so much but let's stop supporting list and lets merge!

Jul 17 '25 10:07 e3rd

Whoa, it took me so long. Thanks for having investigated so much but let's stop supporting list and lets merge!

Thank you for the review, I rebased on develop, added changelog entries and will merge after the checks

Jul 18 '25 11:07 sebix