GreedyBear
Rework extraction process
The process of extracting data from T-Pot and writing it into our database is one of the most important parts of GreedyBear. However, it has some problems:
- it is not very testable: many functions and methods depend directly on the presence of the Elastic Stack and/or the GreedyBear database, which makes them hard to test when these data sources are missing
- classes like `ExtractAttacks` have many different responsibilities and should be split up into separate service classes
- the special treatment of Log4j and Cowrie is deeply baked into the process (although Log4j is not that relevant anymore)
I am currently working on an improved process following some best practices:
- repository pattern: repositories handle data access without containing any processing logic
- single responsibility: every class in the process has one clear and recognizable responsibility
- dependency injection: dependencies are injected through constructors which makes testing much easier
- strategy pattern: makes it easier to add new "special treatment" for honeypots
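
To make the four points above concrete, here is a minimal sketch of how they could fit together. It is not the planned implementation: every name in it (`Hit`, `Ioc`, `HitRepository`, `InMemoryHitRepository`, `GenericStrategy`, `CowrieStrategy`, `ExtractionPipeline`) and the `"cowrie-session"` tag are made up for illustration, and the real data shapes in GreedyBear will differ.

```python
from dataclasses import dataclass, replace
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    """One honeypot log entry (hypothetical shape)."""
    honeypot: str
    src_ip: str


@dataclass(frozen=True)
class Ioc:
    """Aggregated result for one source IP (hypothetical shape)."""
    ip: str
    honeypot: str
    attack_count: int
    tags: tuple[str, ...] = ()


class HitRepository(Protocol):
    """Repository pattern: data access only, no processing logic."""
    def fetch_hits(self) -> list[Hit]: ...


class InMemoryHitRepository:
    """Test double that stands in for an Elastic-backed repository."""
    def __init__(self, hits: list[Hit]) -> None:
        self._hits = hits

    def fetch_hits(self) -> list[Hit]:
        return list(self._hits)


class ExtractionStrategy(Protocol):
    """Strategy pattern: one implementation per kind of treatment."""
    def extract(self, honeypot: str, hits: list[Hit]) -> list[Ioc]: ...


class GenericStrategy:
    """Default treatment: count hits per source IP."""
    def extract(self, honeypot: str, hits: list[Hit]) -> list[Ioc]:
        counts: dict[str, int] = {}
        for hit in hits:
            counts[hit.src_ip] = counts.get(hit.src_ip, 0) + 1
        return [Ioc(ip, honeypot, n) for ip, n in counts.items()]


class CowrieStrategy(GenericStrategy):
    """Placeholder for a "special treatment": adds an illustrative tag."""
    def extract(self, honeypot: str, hits: list[Hit]) -> list[Ioc]:
        base = super().extract(honeypot, hits)
        return [replace(ioc, tags=("cowrie-session",)) for ioc in base]


class ExtractionPipeline:
    """Single responsibility: group hits and delegate to a strategy.

    Dependencies are injected through the constructor, so tests can
    pass an in-memory repository instead of a live data source.
    """
    def __init__(
        self,
        repository: HitRepository,
        strategies: dict[str, ExtractionStrategy],
        default: ExtractionStrategy,
    ) -> None:
        self._repository = repository
        self._strategies = strategies
        self._default = default

    def run(self) -> list[Ioc]:
        by_honeypot: dict[str, list[Hit]] = {}
        for hit in self._repository.fetch_hits():
            by_honeypot.setdefault(hit.honeypot, []).append(hit)
        iocs: list[Ioc] = []
        for honeypot, hits in by_honeypot.items():
            strategy = self._strategies.get(honeypot, self._default)
            iocs.extend(strategy.extract(honeypot, hits))
        return iocs


# Usage: no Elastic Stack or database is needed to exercise the logic.
hits = [
    Hit("Cowrie", "192.0.2.1"),
    Hit("Cowrie", "192.0.2.1"),
    Hit("Heralding", "198.51.100.7"),
]
pipeline = ExtractionPipeline(
    repository=InMemoryHitRepository(hits),
    strategies={"Cowrie": CowrieStrategy()},
    default=GenericStrategy(),
)
iocs = pipeline.run()
```

In this sketch, adding special handling for another honeypot would mean writing one more strategy class and registering it in the `strategies` mapping, without touching the pipeline or the repository.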
I will open a PR soon which contains the most profound changes to the logic. After that code is merged, I would also like to streamline the Cowrie extraction process and add end-to-end pipeline tests, but I'll open separate issues for that.