Limit number of correlations
Is there a way to limit the number of correlations?
I am aware of the display limit, but I am more interested in a database limit per attribute/per event.
I am asking because we reach the maximum value of the id column quite fast. That column is a signed int, not even an unsigned int.
Wow. It sounds odd that you reach it, though it is possible. Could you describe how you are using your MISP? There are several ways to reduce the number of correlations, but for that we need more information.
Also, could it be that you are pulling feeds frequently, into a different event each time? That would make your attributes cross-correlate every time, leading to a nearly n² number of correlations.
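To put a rough number on the n² effect: if the same value ends up in k different events, every pair of those occurrences can correlate. The sketch below counts unordered pairs and compares the result with the signed 32-bit maximum. The function names are illustrative, not part of MISP, and it assumes one row per pair; if a setup stores each pair in both directions, the row count doubles.

```python
# Rough illustration only; all names here are hypothetical, not MISP code.
SIGNED_INT32_MAX = 2**31 - 1  # 2,147,483,647, the ceiling of a signed 32-bit id


def pair_count(k: int) -> int:
    """Number of unordered pairs among k occurrences of the same value."""
    return k * (k - 1) // 2


def smallest_k_exceeding(limit: int = SIGNED_INT32_MAX) -> int:
    """Smallest k whose pair count is strictly greater than limit."""
    k = 0
    while pair_count(k) <= limit:
        k += 1
    return k


if __name__ == "__main__":
    for k in (500, 15_000):
        print(f"{k} occurrences -> {pair_count(k)} pairs")
    print("k needed to exceed the signed int maximum:", smallest_k_exceeding())
```

Under these assumptions, a single value seen in 15,000 events already accounts for roughly 112 million pairs, so a handful of widely shared values can dominate the total.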
We are currently ingesting data from two feed sources and one sync MISP server. We have around 200k events, each with around 10-20 attributes spread across multiple objects.
There are some attributes which we should not correlate, and we will disable correlations for those. What we are looking for, though, is a way to set a limit of X correlations per attribute/object/event, which would be really handy. For example, we have an object called C&C Center with an attribute called IP, which has 15k correlations. We would like to limit that to 500, for example. We can display only 500 in the graph, but we need to have only 500 in the database.
I hope this is clear. If not, I can go into more detail.
We are pulling events really fast; in terms of storage it is more than 1 TB. Even with conditions such as not pulling events older than 90 days, we are still facing the issue.
In terms of attributes, we have maybe around 100-150, but we have correlations for only about 20 of them.
We definitely pull only unique events. The problem is that events share attributes based on their "category": for example, botnets will always have the same C&C center if it is the same botnet. I hope you get the idea.
Any updates?
I don't think restricting the number of correlations is easily solvable. If you decide to keep only X correlations, how do you keep the ones that are interesting or relevant? In the current state, you can either provide more resources to your system or disable correlations.
@mokaddem thanks for your response, but that's not the professional response I was expecting. Maybe somebody with more knowledge can give a better one. I appreciate it anyway.
To give you a little background, we already found a way to do it via MySQL triggers, but we don't like the idea of touching a product without upstream review or opinion. Also, to answer your question, we apply the limit based on the most recent events (time-based).
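The time-based limit described above can be sketched outside the database as follows. This is not the trigger itself and does not reflect MISP's schema: the `Correlation` record, its fields, and `cap_correlations` are hypothetical names chosen for illustration. The idea is simply to keep, for each attribute, only the `limit` most recent correlations.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Correlation(NamedTuple):
    """Hypothetical correlation record; not MISP's actual table layout."""
    attribute_id: int
    related_event_id: int
    timestamp: int  # e.g. Unix time of the related event


def cap_correlations(rows: Iterable[Correlation], limit: int) -> List[Correlation]:
    """Keep at most `limit` correlations per attribute, preferring the newest."""
    if limit < 0:
        raise ValueError("limit must be non-negative")

    # Group the correlations by the attribute they belong to.
    by_attribute = defaultdict(list)
    for row in rows:
        by_attribute[row.attribute_id].append(row)

    kept: List[Correlation] = []
    for attribute_rows in by_attribute.values():
        # Newest first, then cut off everything beyond the limit.
        attribute_rows.sort(key=lambda r: r.timestamp, reverse=True)
        kept.extend(attribute_rows[:limit])
    return kept


if __name__ == "__main__":
    sample = [Correlation(1, event, event) for event in range(1, 6)]
    sample.append(Correlation(2, 9, 100))
    for row in cap_correlations(sample, limit=2):
        print(row)
```

Keeping the newest entries is only one possible answer to the question of which correlations are worth keeping; any other ranking could be swapped in for the sort key.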
Providing more resources or disabling correlations are not solutions for us for several reasons, mainly because we want to fix the problem, not find workarounds for it and run into it again later.
Also reached max int number on correlations. Wasn't too difficult with default feeds on for all vendors after a couple years. Any update on this?