IntelOwl 3.1 Florian Roth Yara Scan Fails
Problem: the Florian Roth Yara scanner is broken in IntelOwl v3.1.0. The Yara_Scan_Florian analyzer always fails, both when the module is selected as part of a batch operation (multiple scanners) and when it is run by itself.
Logs from the intelowl_celery_worker_default container:
[2021-10-15 20:29:29,078: INFO/ForkPoolWorker-1] STARTED analyzer: (Yara_Scan_Florian, job_id: #1) -> File: (SOMETOOL.exe, md5: REDACTED)
[2021-10-15 20:29:34,560: ERROR/MainProcess] Process 'ForkPoolWorker-1' pid:12 exited with 'signal 9 (SIGKILL)'
[2021-10-15 20:29:35,072: ERROR/MainProcess] Chord '6d11f10a-2d1e-4f61-be0f-87fae8ef889a' raised: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 1.')
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/django_celery_results/backends/database.py", line 223, in trigger_callback
ret = j(timeout=app.conf.result_chord_join_timeout, propagate=True)
File "/usr/local/lib/python3.9/site-packages/celery/result.py", line 746, in join
value = result.get(
File "/usr/local/lib/python3.9/site-packages/celery/result.py", line 219, in get
self.maybe_throw(callback=callback)
File "/usr/local/lib/python3.9/site-packages/celery/result.py", line 335, in maybe_throw
self.throw(value, self._to_remote_traceback(tb))
File "/usr/local/lib/python3.9/site-packages/celery/result.py", line 328, in throw
self.on_ready.throw(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/vine/promises.py", line 234, in throw
reraise(type(exc), exc, tb)
File "/usr/local/lib/python3.9/site-packages/vine/utils.py", line 30, in reraise
raise value
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 1.
[2021-10-15 20:29:35,090: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 1.')
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 1.
The Yara_Scan_Florian entry from analyzer_config.json:
"Yara_Scan_Florian": {
"type": "file",
"python_module": "yara_scan.YaraScan",
"description": "scan a file with Neo23x0 yara rules",
"disabled": false,
"external_service": false,
"leaks_info": false,
"config": {
"soft_time_limit": 60,
"queue": "default"
},
"secrets": {},
"params": {
"git_repo_main_dir": {
"value": [
"/opt/deploy/yara/signature-base"
],
"type": "list",
"description": ""
},
"directories_with_rules": {
"value": [
"/opt/deploy/yara/signature-base/yara"
],
"type": "list",
"description": ""
}
}
},
Hey, thank you for the report.
I do not think this is related to IntelOwl itself. We have not seen this error in production systems for that analyzer, and we were not able to replicate it. The problem seems to be related to the system where IntelOwl runs.
Worker exited prematurely: signal 9 (SIGKILL): this probably means there is another process killing Celery, most likely due to OOM issues. We have experienced several memory issues with Celery.
Can you share the resources (CPU, RAM) of your machines? Is this a server dedicated to this application, or do you run it on your local machine, for instance?
Currently running in a VirtualBox guest:
- Host OS: macOS
- Guest OS: Ubuntu
- Guest CPUs: 4
- Guest RAM: 8 GB
Well, to be honest, I am afraid that the memory required to execute all the file analyzers at the same time is too much even for a system like that. At the moment I can't help you in any way other than to tell you: if you want to run all the analyzers, you should have more RAM.
In all our efforts to add a lot of analyzers, we did not run specific tests on the computational resources required for heavy loads. We'll start doing that and look for the most reasonable way out of this problem.
So here is the thing: if I run all of the analyzers (36 of them enabled), they all run and work perfectly fine except the Florian yara scanner. I end up with 35/36 successes and have to kill the Florian scan. Even if I just run the Florian scan by itself, without any of the other scanners, I get the same errors.
Ah, ok, that is really strange... let us do some other tests and see if we can replicate this, because we tried today and could not.
i appreciate your help :)
👍🏻
Can you also share the size of the analyzed file? Does this issue also appear with other files? Can you try with a small file?
Yup, tested with Seatbelt.exe (516 KB) and test.txt (6 B). Both failed.
The reason this is happening is that the YaraScan analyzer separately compiles each yara file, holds on to every yara.Rules object, and runs each one separately (the Florian set has > 500 files).
If you combine all of the "valid" rules into a single file, so that YaraScan only sees that one file, it does not hit those memory limits.
(In fact this is what they do in https://github.com/Neo23x0/signature-base/blob/master/build-rules.py: they test-compile each yara file before appending it to a string, which is compiled once at the end.)
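The build-rules.py idea described above can be sketched roughly like this. This is a minimal illustration, not the actual signature-base code: the function name is hypothetical, and `try_compile` is a stub standing in for a real yara.compile() attempt.

```python
from pathlib import Path

def build_combined_source(rule_files, try_compile):
    """Test-compile each rule file individually; keep only the valid ones.

    `try_compile(source) -> bool` stands in for a real yara.compile()
    attempt that returns True when the source compiles cleanly.
    """
    combined = []
    for path in rule_files:
        source = Path(path).read_text(errors="ignore")
        if try_compile(source):           # skip rule files that fail to compile
            combined.append(source)
    # One concatenated source means one compiled rule set at scan time,
    # instead of 500+ separate yara.Rules objects held simultaneously.
    return "\n".join(combined)
```

The key design point is that validation happens per file, but compilation for scanning happens once, over the concatenated blob.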
Thank you for your help.
I am not sure that this will solve the problem, because you still need to load all the rules into memory at once. So even in your case there will be a moment when all the rules are loaded into memory and could crash the application. Plus, this way we would lose the reference to the original yara file, which is useful when you need to look up a rule definition once a rule has triggered.
On the contrary, imho we could just change the code to call rules.match(self.filepath) right after each file is compiled, rather than after we have compiled/loaded all the rules.
This way, only a single yara file would be in memory at a time, instead of keeping them all until the end.
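A rough sketch of that streaming idea, with stub callables standing in for yara.compile() and yara.Rules.match() (the function and parameter names are hypothetical; the real analyzer code differs):

```python
def scan_streaming(filepath, rule_files, compile_fn, match_fn):
    """Compile one rule file at a time and match immediately.

    `compile_fn` / `match_fn` stand in for yara.compile() and
    yara.Rules.match(); only one compiled rule set is alive at a time,
    and the per-file origin of each match is preserved naturally.
    """
    results = []
    for rule_file in rule_files:
        rules = compile_fn(rule_file)          # compile a single rule file
        results.extend(match_fn(rules, filepath))
        del rules                              # drop it before the next file
    return results
```

Peak memory stays bounded by the largest single rule file instead of the sum of all of them, which is the point of the proposal.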
You could also just combine all of the rules into an index.yar in repo_downloader.sh:
find yara -name '*.yar' | while read -r yarafile; do
  yarac -d filename=XXX "$yarafile" /dev/null && cat "$yarafile" >> index.yar
done
Can we compile rules in batches, by priority? For example, an .exe file would load common .exe yara rules first, and a Linux binary would load Linux yara rules. We can use the file <file> command to check file types.
After completing a scan for a batch, we destroy the instance.
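As a rough sketch of the batching idea: instead of shelling out to the file command, one could peek at the sample's magic bytes to decide which rule batch to compile first. The batch names and helper below are purely illustrative, not part of IntelOwl:

```python
def detect_batch(path):
    """Guess which yara rule batch to try first from the file's magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic[:2] == b"MZ":       # PE / Windows executable
        return "pe"
    if magic == b"\x7fELF":      # ELF / Linux binary
        return "elf"
    return "generic"             # fall back to the full rule set
```

Each batch could then be compiled, matched, and discarded in turn, so memory use is bounded by one batch rather than the whole rule corpus.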
Yara was reworked and rules are now compiled in advance. Considering this addressed until further notice.