deduplicate command sequences
From the internal logs, it seems that the same command sequences are stored multiple times. This needs investigating.
Would you attach or send me the logs?
Sorry for the late answer, I am AFK these days. I'll get back to this as soon as I can.
Tim, when I checked that I noticed that there were some identical commands here:
They are dated March so I am not sure whether this was fixed. In any case, I made a small change in fb9aeeadba9987dcec9b3839fa79ea7f23d7743c and I'll test whether this still happens.
I also added some logging, because I noticed that no command sequences have been collected at all in recent months, which seemed odd to me.
They are dated March so I am not sure whether this was fixed.
No, I do not think it is fixed.
The model definition of the command sequences does not allow duplicates, because the commands_hash has a unique constraint:
class CommandSequence(models.Model):
    first_seen = models.DateTimeField(blank=False, default=datetime.now)
    last_seen = models.DateTimeField(blank=False, default=datetime.now)
    commands = pg_fields.ArrayField(models.CharField(max_length=1024, blank=True), blank=False, null=False, default=list)
    commands_hash = models.CharField(max_length=64, unique=True, blank=True, null=True)
    cluster = models.IntegerField(blank=True, null=True)
It would be interesting to see the content of the commands_hash field for two of the duplicate sequences. Would you provide this information?
It is empty. My speculation is that the deduplication sometimes does not happen because we don't always receive the closing event from cowrie, and the code did the deduplication there. I have just deployed a small change for that; I'll monitor it over the next few days and update here.
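For context on why an empty commands_hash lets duplicates through: a unique constraint on a nullable column normally treats NULL values as distinct from each other, so any number of rows without a hash can coexist. The sketch below illustrates this with the standard-library sqlite3 module as a stand-in for the project's PostgreSQL database; the table layout and the insert_sequences helper are hypothetical and only mirror the two relevant fields of the model quoted above.

```python
import sqlite3


def insert_sequences(rows):
    """Insert (commands, commands_hash) pairs and return how many were accepted.

    Hypothetical helper: the table only mimics the `commands` and
    `commands_hash` fields, and SQLite stands in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE command_sequence ("
        "id INTEGER PRIMARY KEY, "
        "commands TEXT NOT NULL, "
        "commands_hash TEXT UNIQUE)"  # nullable and unique, like the model field
    )
    accepted = 0
    for commands, commands_hash in rows:
        try:
            conn.execute(
                "INSERT INTO command_sequence (commands, commands_hash) VALUES (?, ?)",
                (commands, commands_hash),
            )
            accepted += 1
        except sqlite3.IntegrityError:
            # Only raised when a non-NULL hash collides with an existing row.
            pass
    conn.close()
    return accepted


# Two identical sequences without a hash are both stored.
print(insert_sequences([("uname -a", None), ("uname -a", None)]))
# Two identical sequences with the same hash: the second one is rejected.
print(insert_sequences([("uname -a", "abc"), ("uname -a", "abc")]))
```

So the unique constraint only protects sequences whose hash was actually written.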
My speculation is that the deduplication sometimes does not happen because we don't always receive the closing event from cowrie, and the code did the deduplication there.
Yes, that sounds plausible. Then the question is: why do we miss the closing event? That shouldn't happen. You can check whether a closing event was recorded for a particular session by looking at whether the session has a duration property. On my private instance I do not see any session without a closing event.
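A minimal sketch of that check, assuming the events are available as a list of dicts shaped like cowrie's JSON log entries (a "session" id, an "eventid", and a "duration" field on the "cowrie.session.closed" event). The sessions_missing_close function is hypothetical, and how the events are fetched from Elasticsearch is left out.

```python
from typing import Dict, Iterable, List


def sessions_missing_close(events: Iterable[Dict]) -> List[str]:
    """Return session ids that have no closing event carrying a duration.

    Field names are assumptions based on cowrie's JSON output format.
    """
    seen: Dict[str, None] = {}  # dict used as an ordered set of session ids
    closed = set()
    for event in events:
        session_id = event.get("session")
        if session_id is None:
            continue
        seen.setdefault(session_id, None)
        if event.get("eventid") == "cowrie.session.closed" and "duration" in event:
            closed.add(session_id)
    return [session_id for session_id in seen if session_id not in closed]


events = [
    {"session": "a1", "eventid": "cowrie.session.connect"},
    {"session": "a1", "eventid": "cowrie.session.closed", "duration": 1.5},
    {"session": "b2", "eventid": "cowrie.command.input"},
]
print(sessions_missing_close(events))  # session "b2" never received a closing event
```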
I guess you are right. I wanted to experiment with the honeynet servers, which sometimes seem to be faulty. I still don't have access to Kibana, and this problem needs to be definitively solved, otherwise I can't debug further.
Maybe we could add a cleanup routine that deletes command sequences that are older than 1 day but do not have a commands_hash value (which means that they do not have the closing event from cowrie).
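The selection logic of such a cleanup could look roughly like the sketch below. It is plain Python rather than ORM code so it runs on its own; SequenceRecord and stale_unhashed are hypothetical names, and using last_seen (rather than first_seen) to decide the age is an assumption. With the model quoted above, the equivalent query would presumably filter on commands_hash__isnull=True and last_seen__lt=cutoff before deleting.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class SequenceRecord:
    """Hypothetical stand-in for a stored command sequence row."""
    id: int
    last_seen: datetime
    commands_hash: Optional[str]


def stale_unhashed(
    records: Iterable[SequenceRecord],
    now: datetime,
    max_age: timedelta = timedelta(days=1),
) -> List[int]:
    """Return ids of records older than max_age that have no commands_hash."""
    cutoff = now - max_age
    return [
        record.id
        for record in records
        # `not record.commands_hash` covers both None and an empty string.
        if not record.commands_hash and record.last_seen < cutoff
    ]


now = datetime(2024, 1, 10, 12, 0)
records = [
    SequenceRecord(1, now - timedelta(days=2), None),     # stale, no hash
    SequenceRecord(2, now - timedelta(hours=2), None),    # too recent to delete
    SequenceRecord(3, now - timedelta(days=2), "abc"),    # has a hash, keep
    SequenceRecord(4, now - timedelta(days=3), ""),       # stale, empty hash
]
print(stale_unhashed(records, now))
```

Keeping recent unhashed rows gives sessions that are still open time to receive their closing event.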
Theoretically, with the change implemented here, that should not happen.