
Data Management with RockNSM

Open webhead404 opened this issue 7 years ago • 5 comments

It doesn't appear that there is any documentation on how to manage data in RockNSM as far as disk utilization goes. For instance, in my lab after about a week, Kibana becomes unresponsive, probably because Elasticsearch stopped indexing new data. Is there a form of log rotation that can be configured, or data retention settings somewhere?

webhead404 avatar Oct 04 '18 15:10 webhead404

This has come up in a couple of conversations recently. The near-term fix is to use separate mount points for products that consume a lot of data, which fixes the issue with ES going read-only when Kafka fills up.

Long term, we will probably implement disk quotas to help out with this.
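In the meantime, one common approach for capping Elasticsearch disk usage (general Elasticsearch practice, not something RockNSM ships) is to delete old indices on a schedule with Elasticsearch Curator. A minimal sketch of a Curator action file, where the index prefix and the 7-day window are assumptions you'd adjust to your deployment:

```yaml
# curator-delete.yml -- hypothetical action file; the logstash- prefix
# and 7-day retention window are assumptions, adjust to your indices.
actions:
  1:
    action: delete_indices
    description: Delete indices older than 7 days to cap disk usage
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
```

You would then run `curator --config curator-config.yml curator-delete.yml` from cron to enforce the window daily.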

bndabbs avatar Jan 30 '19 14:01 bndabbs

When Kafka fills up, will it overwrite the old data?

yasser48 avatar Apr 17 '19 09:04 yasser48

Kafka defaults to aging off old data after 168 hours (7 days). You can override this with a shorter period by setting kafka_retention in your config.yml; the value is applied by the Kafka role. There are other manual ways to configure this (see the upstream Kafka docs on retention), but retention in hours is the easiest and is the default method used by the upstream project.
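For example, to drop retention from the default 168 hours to 72, the setting would look like this (the variable name comes from the Kafka role as described above; the value and file path are just illustrations):

```yaml
# config.yml (path may vary by RockNSM version)
kafka_retention: 72   # hours; Kafka ages off topic data older than this
```

After changing it, redeploy (rerun the Kafka role) so the new retention is pushed to the broker config.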

dcode avatar Apr 17 '19 11:04 dcode

Thank you. I still have some questions to clarify about data retention. Example pipelines:

Suricata --> Kafka --> ES
Bro --> Kafka --> ES

Which one is happening, 1 or 2?
1. Data log "A" moves from Suricata/Bro --> Kafka --> ES.
2. Data log "A" moves from Suricata/Bro --> Kafka and stays there, with ES reading directly from Kafka.

How is the data stored in the process? What happens to log "A" when it moves from Bro/Suricata to Kafka, etc.? Will the same packet be duplicated in both Bro and Kafka, or stored only once?

We tested this in a heavy-traffic environment and had issues with size, so we need to understand how log data is managed between the components.

yasser48 avatar Apr 17 '19 11:04 yasser48

Data from Bro is written directly to Kafka in JSON format. Bro also writes to disk in the classic ASCII format. If you're running a higher-bandwidth sensor, I recommend disabling the ASCII logs, as they eat a lot of space and are too large to grep through. We have a script to do this for you; just add the following line to your local.bro:

@load ./rock/frameworks/logging/disable-ascii

Suricata logs are currently shipped to Kafka using Filebeat. We're working on a solution that won't write to disk at all, but it's not ready for the community yet. You can save some space by turning off the fast log and the unified2 log; we only ingest eve.json.
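In suricata.yaml terms, that means disabling the fast and unified2 outputs and leaving eve-log on. A sketch of the relevant outputs section (exact keys can vary between Suricata versions, and the paths shown are placeholders):

```yaml
outputs:
  - fast:
      enabled: no          # plaintext alert log, not ingested by RockNSM
  - unified2-alert:
      enabled: no          # barnyard2-style binary log, not ingested
  - eve-log:
      enabled: yes         # the only output that gets shipped to Kafka
      filetype: regular
      filename: eve.json
```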

Once everything is in Kafka, Logstash consumes it, transforms/enriches it, and ships it to Elasticsearch.
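That stage can be pictured as a Logstash pipeline along these lines. This is a minimal sketch, not RockNSM's actual pipeline config: the topic names, hosts, and index pattern are all hypothetical placeholders.

```conf
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["bro-raw", "suricata-raw"]   # hypothetical topic names
    codec => "json"
  }
}
filter {
  # transformation/enrichment happens here, e.g. geoip lookups on IPs
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "rock-%{+YYYY.MM.dd}"          # hypothetical index pattern
  }
}
```

The key point for retention is that the data lands in Elasticsearch as its own copy; Kafka's retention and Elasticsearch's index lifecycle are managed independently.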

If you're using Stenographer, it will age off old pcap data on its own when the disk reaches about 80% utilization, I think. That's customizable too.
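If memory serves, the knob for that lives in Stenographer's JSON config as `DiskFreePercentage`, which expresses the minimum free space to maintain rather than a fill threshold. A sketch, with the directories as placeholders:

```json
{
  "Threads": [
    {
      "PacketsDirectory": "/data/stenographer/packets",
      "IndexDirectory": "/data/stenographer/index",
      "DiskFreePercentage": 20
    }
  ]
}
```

With 20 here, Stenographer would start deleting its oldest pcap files once the disk passes roughly 80% full; check the upstream Stenographer README for the authoritative semantics.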

dcode avatar Apr 17 '19 13:04 dcode