Refactor analyzers which download external dbs for local queries
There are several analyzers, such as Tor, Maxmind and JA4DB, that download an external db and update it periodically. The problem is that they store it as a local file, and the analyzers parse that file in search of a specific entry. It would make sense to store the data in additional tables in the database and query it there directly.
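To illustrate the idea, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for the project's database. The table name, column names, and sample entries are hypothetical and not taken from any existing analyzer; the point is only that a keyed lookup replaces parsing the whole downloaded file on every run.

```python
import sqlite3

# Hypothetical sample of what a downloaded feed might contain.
SAMPLE_ENTRIES = [
    {"fingerprint": "aaa111", "application": "ExampleBrowser"},
    {"fingerprint": "bbb222", "application": "ExampleBot"},
]


def load_entries(conn, entries):
    """Store downloaded entries in a table keyed on the lookup column."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS external_db ("
        "fingerprint TEXT PRIMARY KEY, application TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO external_db (fingerprint, application) "
        "VALUES (:fingerprint, :application)",
        entries,
    )
    conn.commit()


def lookup(conn, fingerprint):
    """Return the matching row as a dict, or None if there is no match."""
    row = conn.execute(
        "SELECT fingerprint, application FROM external_db WHERE fingerprint = ?",
        (fingerprint,),
    ).fetchone()
    if row is None:
        return None
    return {"fingerprint": row[0], "application": row[1]}


conn = sqlite3.connect(":memory:")
load_entries(conn, SAMPLE_ENTRIES)
print(lookup(conn, "bbb222"))
print(lookup(conn, "missing"))
```

In the real project the same role would be played by a Django model and an ORM query rather than raw SQL.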
Hi @mlodic, can I take this up? It will be a nice challenge.
I plan to implement this for JA4DB first and, once that works, do the same for the other analyzers. This might take me some time, as I need to understand Django models a bit better and how everything is connected in IntelOwl.
If I have any doubts, shall I ask here or in the Slack channel?
Sure, feel free to ask wherever you want.
Hi @mlodic, while running the refactored JA4_DB analyzer to load the API data into the DB, I got a "NotImplementedError" with the message "Unable to find data model for generic".
I want to understand the purpose of data_models_manager and why this error comes up, given that the JSON report returned by the analyzer is successfully stored in the "analyzers_manager_analyzerreport" table.
Am I missing something? Do I need to create a generic data model and a corresponding serializer for the analyzers we are going to refactor?
I would really appreciate your input, since understanding this is crucial to solving the issue.
Thanks for sharing. I'll point you to the people who created the data models, which are a new feature added in the last release.
I know that data models were not added for generic observables, so could you share more information about that error? A full stack trace, or screenshots of the analyzed observable and of where you encounter the error, would help. Ty
@cristinaascari is reviewing the issue
Thanks @mlodic & @cristinaascari. Here is the link to the error details.
Refactored JA4_DB.py analyzer
@classmethod
def update(cls):
    logger.info(f"Updating table from {cls.url}")
    response = requests.get(url=cls.url)
    response.raise_for_status()
    data = response.json()

    # Empty the table before reloading it with the fresh dump.
    if JA4Fingerprint.objects.exists():
        with connection.cursor() as cursor:
            cursor.execute(
                "TRUNCATE TABLE analyzers_manager_ja4fingerprint RESTART IDENTITY CASCADE;"
            )

    instances = [JA4Fingerprint(**item) for item in data]
    JA4Fingerprint.objects.bulk_create(instances)
    logger.info("Table updated")

def run(self):
    reason = self.check_ja4_fingerprint(self.observable_name)
    if reason:
        return {"not_supported": reason}

    # Populate the table on first use.
    if not JA4Fingerprint.objects.exists():
        logger.info("Table is empty, initialising...")
        self.update()

    application = (
        JA4Fingerprint.objects.filter(ja4_fingerprint=self.observable_name)
        .values()
        .first()
    )
    if application:
        return dict(application)
    return {"found": False}
Corresponding JA4Fingerprint Model:
class JA4Fingerprint(models.Model):
    application = models.CharField(max_length=255, null=True, blank=True)
    library = models.CharField(max_length=255, null=True, blank=True)
    device = models.CharField(max_length=255, null=True, blank=True)
    os = models.CharField(max_length=255, null=True, blank=True)
    user_agent_string = models.TextField(null=True, blank=True)
    certificate_authority = models.CharField(max_length=255, null=True, blank=True)
    observation_count = models.PositiveIntegerField(default=1)
    verified = models.BooleanField(default=False)
    notes = models.TextField(null=True, blank=True)
    ja4_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4_fingerprint_string = models.TextField(null=True, blank=True)
    ja4s_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4h_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4x_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4t_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4ts_fingerprint = models.CharField(max_length=255, null=True, blank=True)
    ja4tscan_fingerprint = models.CharField(max_length=255, null=True, blank=True)
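
One thing worth noting about the update step: it empties the table and then reloads it, so if the reload fails partway the table could be left empty. A minimal sketch of doing the replacement inside a single transaction follows, using the standard-library sqlite3 module as a stand-in for the real database. The table layout and sample rows are hypothetical, and in Django the analogous approach would presumably be to wrap the delete and the bulk_create in transaction.atomic().

```python
import sqlite3


def refresh_table(conn, rows):
    """Replace all rows in one transaction; keep the old rows on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("DELETE FROM fingerprints")
            conn.executemany(
                "INSERT INTO fingerprints (ja4_fingerprint, application) "
                "VALUES (?, ?)",
                rows,
            )
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fingerprints (ja4_fingerprint TEXT NOT NULL, application TEXT)"
)
refresh_table(conn, [("old_fp", "OldApp")])

# A bad batch (NULL in a NOT NULL column) fails, and the old rows survive.
ok = refresh_table(conn, [("new_fp", "NewApp"), (None, "Broken")])
remaining = conn.execute("SELECT ja4_fingerprint FROM fingerprints").fetchall()
```

With this pattern a failed download or a malformed entry leaves the previous data in place, so lookups keep working until the next successful refresh.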
The fix in https://github.com/intelowlproject/IntelOwl/pull/2662 has just been merged. Can you please update your fork and try again? Thanks
I just saw the PR merged by @cristinaascari. I will pull the latest code and try again. Just out of curiosity, why do we return an empty list for the GENERIC type rather than implementing a proper data model? :)
@cristinaascari has been testing an organized solution for a while, we'll wait for her PR to fix this.