Add Microsoft Defender bots

Open creideiki opened this issue 4 years ago • 14 comments

Fetches security alerts from Microsoft Defender ATP.

This bot is quite a mess in its present state - it horribly abuses the extra. namespace in messages, it hard-codes processing decisions, the error handling is not great, and it cannot do OAuth2 token refresh, so it needs to re-authenticate for each call. It's also only written for IntelMQ v2.3.1 currently.
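To illustrate the re-authentication point: a minimal sketch of reusing one access token until shortly before it expires, instead of fetching a new one per call. This is not the code in this pull request and not a full OAuth2 refresh-token flow. `TokenCache`, the `fetch` callable, and the 60-second margin are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse an access token until shortly before it expires (illustrative only)."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # hypothetical: returns (token, lifetime in seconds)
        margin: float = 60.0,                    # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._margin = margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, authenticating again only when it is about to expire."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

A bot would call `cache.get()` before each API request. The `fetch` callable would wrap whatever authentication request the bot already performs.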

Still, it does what it's supposed to do, and we think that's useful. We're currently using it together with a custom output bot to create tickets in our helpdesk system whenever Defender finds malware on a client.

Do you think it would be worthwhile to try to beat this code into shape for upstream inclusion? Or is this a non-starter because of e.g. our need to send lots of information that doesn't fit in the default harmonisation, and thus stuffing the extra. namespace full of JSON?

creideiki avatar Apr 27 '21 15:04 creideiki

Cool addition!

It may not be the primary use case of IntelMQ, but if it's useful and it does not conflict with IntelMQ's principles, I see no obstacle to including it.

On the OAuth issue: Well, if it works, it's already better than having nothing^^

ghost avatar Apr 29 '21 09:04 ghost

Great, thanks! I'll make a note of updating this pull request once we've migrated this bot to 3.0 and cleaned it up a bit.

creideiki avatar Apr 29 '21 12:04 creideiki

> Great, thanks! I'll make a note of updating this pull request once we've migrated this bot to 3.0 and cleaned it up a bit.

Cool, thanks in advance!

ghost avatar Apr 29 '21 14:04 ghost

Codecov Report

Merging #1910 (c4ece5a) into develop (5962fa9) will increase coverage by 0.42%. The diff coverage is 87.06%.

@@             Coverage Diff             @@
##           develop    #1910      +/-   ##
===========================================
+ Coverage    75.90%   76.33%   +0.42%     
===========================================
  Files          440      452      +12     
  Lines        23573    24501     +928     
  Branches      3150     3240      +90     
===========================================
+ Hits         17894    18703     +809     
- Misses        4947     5020      +73     
- Partials       732      778      +46     

| Impacted Files | Coverage Δ |
| --- | --- |
| ...q/bots/experts/defender_advanced_hunting/expert.py | 64.03% <64.03%> (ø) |
| intelmq/bots/outputs/defender_comment/output.py | 65.67% <65.67%> (ø) |
| ...lmq/bots/collectors/defender/collector_defender.py | 67.14% <67.14%> (ø) |
| intelmq/bots/experts/defender_file/expert.py | 71.91% <71.91%> (ø) |
| ...q/tests/bots/collectors/defender/test_collector.py | 94.82% <94.82%> (ø) |
| intelmq/bots/parsers/defender/parser.py | 95.00% <95.00%> (ø) |
| ...mq/tests/bots/experts/defender_file/test_expert.py | 98.00% <98.00%> (ø) |
| intelmq/bots/experts/defender_to_text/expert.py | 100.00% <100.00%> (ø) |
| ...s/experts/defender_advanced_hunting/test_expert.py | 100.00% <100.00%> (ø) |
| ...tests/bots/experts/defender_to_text/test_expert.py | 100.00% <100.00%> (ø) |

... and 3 more

codecov-commenter avatar May 18 '21 07:05 codecov-commenter

Pushed a new and improved design, now sort of conforming a bit more to the rest of IntelMQ.

Split the horrible hardcoded monolithic mess of a collector that basically only worked for our use case into four bots:

  1. collector
  2. parser
  3. file expert (asks the Defender cloud for information on a file, searched by SHA1 hash)
  4. advanced hunting expert (sends arbitrary queries to the Defender cloud and updates the event with the results, somewhat based on my earlier Splunk saved search expert).

With simple unit tests for all. No documentation written yet, though.

creideiki avatar May 28 '21 09:05 creideiki

I have force-pushed a new version of this set of bots, which is what we have been running internally for a while. They should be ready for review now.

This pull request contains six different bots, all having to do with communicating with Microsoft Defender's API:

  1. Collector, for retrieving malware alerts from the Defender cloud service.
  2. Parser.
  3. Text formatting expert. In case the alert wasn't an exact match for something we know how to automate, this bot formats the alert to text in order to send it via e-mail (using the templated SMTP output bot) for manual processing.
  4. Advanced hunting expert, which can run arbitrary queries against the Defender API. We use it to enrich the initial alert by querying for logged in users at the time of the alert.
  5. File information expert. This takes the SHA1 or SHA256 hash of the file included in the initial alert, and fetches all information about it, e.g. signature (if it's a signed executable) and earliest detection time.

After that, our botnet has a bot (which is not included here, because it is very specific to us) for creating or updating an issue in our ticketing system. The final bot is:

  6. Comment output. This updates the initial Defender alert with a comment; we add our internal ticket number.

creideiki avatar Oct 26 '21 13:10 creideiki

Thanks for the thorough pull request! I will have a look at it next week.

ghost avatar Oct 29 '21 14:10 ghost

Someone needs to review the code, see https://lists.cert.at/pipermail/intelmq-dev/2021-November/000558.html

sebix avatar Nov 18 '21 17:11 sebix

Thanks for taking a look at this. Unfortunately, there are some weird problems with the API right now which in the worst case may force a redesign.

When we started using the Defender API, it behaved as documented, and we ran the collector every 600 seconds asking for all new alerts from the last 605 seconds. This worked with the code in this pull request.
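The polling arithmetic above can be sketched as follows: each run asks for a window slightly longer than the polling interval, so consecutive windows overlap by 5 seconds and no alert falls into a gap. This is only an illustration. The field name `alertCreationTime` and the OData-style filter string are assumptions, not necessarily what the collector in this pull request sends.

```python
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(seconds=600)  # how often the collector runs
LOOKBACK = timedelta(seconds=605)       # window requested on each run
OVERLAP = LOOKBACK - POLL_INTERVAL      # 5 seconds shared by consecutive windows


def lookback_filter(now: datetime, lookback: timedelta = LOOKBACK) -> str:
    """Build a filter string selecting alerts created within the lookback window.

    The field name and filter syntax are assumed for illustration.
    """
    since = now - lookback
    return "alertCreationTime gt " + since.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(lookback_filter(datetime.now(timezone.utc)))
```

The overlap means an alert created near a window boundary can be returned twice, so a real deployment needs some form of deduplication downstream.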

A few days ago, the collector stopped returning results. After experimentation, we discovered that the API endpoint that searches for the newest alerts only returned alerts that were more than 6 hours old. We added code to the comment output (not yet submitted here) to add a static string as a comment, and changed our botnet to always run the comment output bot as soon as an alert came in. We also added code to the parser (also not submitted yet) to ignore alerts with this comment set. This meant that we could increase the lookback time in the collector to more than 6 hours, thus getting alerts again.
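A rough sketch of the marker-comment idea described above, not the unsubmitted parser code itself: alerts that already carry the static comment are dropped, so a long lookback window does not re-process them. The marker text and the alert layout (a `comments` list of objects with a `comment` field) are assumptions made for this example.

```python
from typing import Any, Dict, List

# Hypothetical static string added by the comment output bot.
SEEN_MARKER = "Processed by IntelMQ"


def is_already_handled(alert: Dict[str, Any], marker: str = SEEN_MARKER) -> bool:
    """Return True if any comment on the alert contains the marker string."""
    comments = alert.get("comments") or []
    return any(marker in (c.get("comment") or "") for c in comments)


def unseen_alerts(alerts: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only alerts that have not been marked as handled."""
    return [a for a in alerts if not is_already_handled(a)]
```

This only works if the marker comment is visible to the search endpoint by the time the next collection runs.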

Yesterday, the API's behaviour changed again. The search endpoint now only has a delay of 30 minutes, but the delay now applies not only to new alerts, but also to new comments. That means that even when we add comments to alerts saying the botnet has already seen them, when we search for alerts we don't see that comment until 30 minutes later. This meant that alerts we had already handled looked new and went through the botnet again. To compensate, we increased the collector rate limit to 3600 seconds (1 hour).

While wrestling with these problems, we found out that while the API we are using is still documented and not deprecated, there is a new API that is now recommended instead.

We are currently waiting to hear back from Microsoft support regarding the strange delays on the old API, and whether that API will be deprecated and we will have to move to the new API.

creideiki avatar Nov 19 '21 08:11 creideiki

Phew, that sounds annoying.

sebix avatar Nov 19 '21 09:11 sebix

Dear @creideiki,

Did the product's API stabilize and have you been able to get it working?

sebix avatar Jul 05 '22 12:07 sebix

We have not heard anything more from Microsoft. The old API used here is still documented, the new API doesn't supply all the information the old one does (and that we need), and we are still running these bots in production, with the horrible hacks described above to work around the fact that the old API only returns events after several hours' delay.

creideiki avatar Jul 06 '22 06:07 creideiki

OK, so the code is already as stable as it can currently be, right? And additionally, it has been thoroughly tested in production by you?

sebix avatar Jul 06 '22 07:07 sebix

That does seem to be the case, at least when the botnet is configured to work around the deficiencies. That configuration would need to be documented. And we have a few changes in production to support that workaround that are not included in this pull request, because I had hoped they would not be required long-term.

But if you think this functionality would be useful, I could incorporate those code changes and write that documentation.

creideiki avatar Jul 06 '22 12:07 creideiki

There is now a third API, which does seem to contain all the information we want to use, but is in beta and explicitly not for production use: https://learn.microsoft.com/en-us/graph/api/security-list-alerts_v2

I'm starting to think that it's impossible to write a generic module to talk to Microsoft's APIs. They change so often that every API consumer has to be a developer to keep up.

creideiki avatar Oct 28 '22 08:10 creideiki

Oh dear that's complex.

sebix avatar Nov 01 '22 20:11 sebix