
Limit stix_ids explosion by rewriting the standard_id in client python

Open richard-julien opened this issue 1 year ago • 3 comments

Use case

Limit stix_ids explosion by rewriting the standard_id in client python.

Context

As described in https://github.com/OpenCTI-Platform/opencti/pull/6774 by @ckane, remote systems or badly designed connectors can cause a large number of stix ids to accumulate on a single element.

This seems to result in a long sequence of entries in redis during re-ingestion of the same entity (such as a malware STIX type) with new randomly-assigned STIX ids. My observation is that these get added to the stream once for each new STIX id encountered for the same item, and the list of stix_ids in the added/updated entity can grow without bound. This is one of the reasons why redis memory consumption can grow so high under some circumstances, despite a low TRIMMING size (such as 100000).

Solution

We need to generate the standard id in client python and dynamically replace all the ids with the predicted ones. By default, drop the original id from the source, but add a connector option to keep it if you want to consolidate the ids.

  • a cleanup script for existing platforms to remove all the bad stix_ids already written

See https://github.com/OpenCTI-Platform/client-python/issues/659
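As a rough illustration of the idea above, the sketch below predicts a deterministic id per object and rewrites ids and references inside a bundle. It is only a sketch under assumptions: the namespace UUID, the choice of the normalized `name` as the only id contributor, the helper names, and the use of an `x_opencti_stix_ids` property to hold the original id are all illustrative, not the actual client python implementation.

```python
import json
import uuid

# Hypothetical namespace for illustration; not necessarily the one OpenCTI uses.
EXAMPLE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def predict_standard_id(obj):
    """Derive a deterministic id from the object's type and normalized name."""
    key = json.dumps({"name": obj["name"].lower().strip()}, sort_keys=True)
    return f"{obj['type']}--{uuid.uuid5(EXAMPLE_NAMESPACE, key)}"


def rewrite_bundle_ids(objects, keep_source_ids=False):
    """Return copies of the objects with ids and references rewritten."""
    # Assumption: only named objects get a predicted id in this sketch.
    mapping = {o["id"]: predict_standard_id(o) for o in objects if "name" in o}
    rewritten = []
    for obj in objects:
        new_obj = dict(obj)
        old_id = obj["id"]
        if old_id in mapping:
            new_obj["id"] = mapping[old_id]
            # Optionally keep the source id (property name is an assumption).
            if keep_source_ids and old_id != mapping[old_id]:
                new_obj["x_opencti_stix_ids"] = [old_id]
        for key, value in obj.items():
            # References to ids that are not in the bundle are left untouched.
            if key.endswith("_ref"):
                new_obj[key] = mapping.get(value, value)
            elif key.endswith("_refs"):
                new_obj[key] = [mapping.get(v, v) for v in value]
        rewritten.append(new_obj)
    return rewritten
```

With this kind of rewriting, two malware objects that share a name but arrive with different random ids end up with the same id, so re-ingestion would not add a new entry to stix_ids each time. Note that a reference to an object missing from the bundle cannot be rewritten, since the data needed to predict its id is not available.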

Why not only a task in client python?

It's also a good moment to check and enforce the ids generation on the opencti side.

richard-julien avatar May 28 '24 16:05 richard-julien

Thank you!

ckane avatar May 28 '24 16:05 ckane

After extensive testing, it appears that this approach cannot be applied. In many circumstances the bundle we receive is not "complete". With an incomplete bundle, rewriting ids is not possible because the elements they depend on cannot be found in the bundle. Since this approach cannot work, the alternative is to check that every use of the STIX2 library always generates an id. A custom linting rule seems to be a good way to enforce that. Current development can be tracked here: https://github.com/OpenCTI-Platform/connectors/pull/2786
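To give an idea of what such a linting rule could check, here is a minimal sketch using Python's standard `ast` module. It is not the rule from the linked PR: the function name is hypothetical, and it only recognizes the `stix2.ClassName(...)` call spelling, flagging calls that pass no explicit `id=` keyword.

```python
import ast


def find_stix2_calls_without_id(source):
    """Return (line, class_name) for each stix2.<Class>(...) call lacking id=."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only the `stix2.ClassName(...)` spelling is matched in this sketch.
        if not (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "stix2"
            and func.attr[:1].isupper()
        ):
            continue
        keywords = [kw.arg for kw in node.keywords]
        # kw.arg is None for **kwargs, which may carry an id; skip those calls.
        if "id" in keywords or None in keywords:
            continue
        findings.append((node.lineno, func.attr))
    return sorted(findings)
```

A real rule would likely also need to handle `from stix2 import Malware` style imports and an allowlist for classes where a random id is acceptable; both are left out here to keep the sketch short.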

richard-julien avatar Oct 14 '24 08:10 richard-julien

Thanks @richard-julien - this sounds like a great approach

ckane avatar Oct 15 '24 14:10 ckane

This issue will be closed when the client-python and opencti PRs are merged.

labo-flg avatar Oct 30 '24 10:10 labo-flg