
[Bug] - high CPU consumption

Open andreyglauzer opened this issue 3 years ago • 18 comments

Request Type

Bug

Work Environment

OS version (server): Oracle Linux
OS version (client): Windows 10
Virtualized Env.: False
Dedicated RAM: 64 GB
vCPU: 20
TheHive version / git hash: 4.1.16
Package Type: RPM
Database: BerkeleyDB
Index type: Lucene
Attachments storage: Local
Browser type & version: Chrome

Problem Description

Our team handles a high volume of alerts, which are opened in TheHive via the API. We have also built several automations to merge alerts into cases, so API searches are constant as well.

We have 15 analysts accessing the platform simultaneously, and at times TheHive consumes all of the server's CPUs and becomes inaccessible until I terminate the TheHive process with kill and start the service again.

Steps to Reproduce

I noticed that merging alerts into cases tends to consume a lot of server CPU, and this is something analysts do constantly.

But I have no proof that this is actually the root cause.
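For context, the merge automations boil down to one API call per alert. A minimal sketch of what such a call can look like, assuming TheHive 4's v1 alert-merge endpoint (POST /api/v1/alert/{alertId}/merge/{caseId}); the URL, API key, and IDs below are placeholders:

    # Hedged sketch: merge an alert into an existing case over TheHive's HTTP API.
    # Endpoint path assumed from the TheHive 4 v1 API; verify against your version.
    import requests

    THEHIVE_URL = "https://thehive.example.local"  # placeholder instance URL
    API_KEY = "REDACTED"                           # placeholder API key

    def merge_alert_into_case(alert_id: str, case_id: str) -> dict:
        """Merge one alert into an existing case and return the updated case."""
        resp = requests.post(
            f"{THEHIVE_URL}/api/v1/alert/{alert_id}/merge/{case_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        merged_case = merge_alert_into_case("~40964152", "~24629336")  # placeholder IDs
        print(merged_case)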

andreyglauzer · Jan 08 '22 14:01

I am having the same issue with my instance of TheHive.

edwardrixon · Jan 14 '22 11:01

My org also experiences this issue, on a similarly spec'd system (14 vCPU, 64 GB).

MDB4241 · Jan 19 '22 14:01

Same here when accessing or closing a case. We usually have 60-70 observables per case, with a total of 11k cases (~100 open). I'm wondering about the feature that checks for related cases by observable; could that be a factor?

backloop-biz · Jan 25 '22 16:01

I migrated the database from BerkeleyDB to Cassandra and I'm seeing great results.

I'll run a few weeks of testing and report back.

andreyglauzer · Jan 25 '22 16:01

> Same here when accessing or closing a case. We usually have 60-70 observables per case, with a total of 11k cases (~100 open). I'm wondering about the feature that checks for related cases by observable; could that be a factor?

You should see a performance increase if you use the 'ignoreSimilarity' option on non-critical case artifacts. This was a useful change for my organization, and it reduces the cost of rendering a case.
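A hedged sketch of how that flag can be set via the API (the /api/case/artifact route and the IDs below are assumptions; verify against your TheHive version):

    # Hedged sketch: set ignoreSimilarity on an existing observable so it is
    # excluded from the related-case/similarity computation when a case is rendered.
    import requests

    THEHIVE_URL = "https://thehive.example.local"  # placeholder instance URL
    API_KEY = "REDACTED"                           # placeholder API key

    def ignore_similarity(observable_id: str) -> None:
        resp = requests.patch(
            f"{THEHIVE_URL}/api/case/artifact/{observable_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"ignoreSimilarity": True},
            timeout=30,
        )
        resp.raise_for_status()

    ignore_similarity("~81924104")  # placeholder observable ID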

> I migrated the database from BerkeleyDB to Cassandra and I'm seeing great results. I'll run a few weeks of testing and report back.

Glad you've found a potential resolution! Unfortunately, we are already on Cassandra :(

We've even increased resources to 40 vCPU to troubleshoot and the problem persists. We have a single host with TheHive, Cortex, Cassandra & ElasticSearch. Perhaps separating these services into dedicated hosts will yield better performance.

MDB4241 · Jan 25 '22 17:01

> Same here when accessing or closing a case. We usually have 60-70 observables per case, with a total of 11k cases (~100 open). I'm wondering about the feature that checks for related cases by observable; could that be a factor?

> You should see a performance increase if you use the 'ignoreSimilarity' option on non-critical case artifacts. This was a useful change for my organization, and it reduces the cost of rendering a case.

> I migrated the database from BerkeleyDB to Cassandra and I'm seeing great results. I'll run a few weeks of testing and report back.

> Glad you've found a potential resolution! Unfortunately, we are already on Cassandra :(

> We've even increased resources to 40 vCPU to troubleshoot and the problem persists. We have a single host with TheHive, Cortex, Cassandra & ElasticSearch. Perhaps separating these services into dedicated hosts will yield better performance.

I've noticed some analysts using "Stats", which generates an expensive search that is run frequently.

I removed this option from the frontend.

Another thing I've noticed is that very large descriptions put a heavy load on the server, I believe during the conversion. We are now avoiding very long descriptions.

andreyglauzer · Jan 25 '22 17:01

> I migrated the database from BerkeleyDB to Cassandra and I'm seeing great results. I'll run a few weeks of testing and report back.

Are there any docs on how to perform this migration?

backloop-biz · Jan 26 '22 09:01

> Are there any docs on how to perform this migration?

This migration is not possible in bulk; I had to create a new instance and re-open all cases and alerts via the API into the Cassandra-backed database.
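For reference, a hedged sketch of that approach for alerts, assuming the classic /api/alert routes; the URLs and keys are placeholders, and cases, tasks, and observables would need the same treatment with proper pagination:

    # Hedged sketch: read alerts from the old instance and re-create them on the
    # new Cassandra-backed instance. Only alert-creation fields are copied.
    import requests

    OLD_URL, OLD_KEY = "https://old-thehive.example.local", "OLD_API_KEY"  # placeholders
    NEW_URL, NEW_KEY = "https://new-thehive.example.local", "NEW_API_KEY"  # placeholders

    def headers(key: str) -> dict:
        return {"Authorization": f"Bearer {key}"}

    # Pull a page of alerts from the old instance (loop over ranges for large volumes).
    alerts = requests.post(
        f"{OLD_URL}/api/alert/_search?range=0-100",
        headers=headers(OLD_KEY), json={"query": {}}, timeout=60,
    ).json()

    # Re-create each alert on the new instance, keeping only creatable fields.
    FIELDS = ("title", "description", "type", "source", "sourceRef",
              "severity", "tlp", "tags", "artifacts")
    for alert in alerts:
        payload = {k: alert[k] for k in FIELDS if k in alert}
        requests.post(f"{NEW_URL}/api/alert", headers=headers(NEW_KEY),
                      json=payload, timeout=60).raise_for_status()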

andreyglauzer · Jan 26 '22 14:01

We are having the same issue as described above. Giving TheHive more and more resources did not resolve it. Disabling statistics mostly solved our issue, but it is really strange.

Cyp-her · Aug 17 '22 11:08

@andreyglauzer Hello, what happened with the changes you made to your TheHive and Cortex setup? What do you recommend doing?

romarito90 · Sep 28 '22 16:09

I have the same issue on a VM with 16 vCPU and 48 GB RAM.

baonq-me · Nov 01 '22 10:11

> We are having the same issue as described above. Giving TheHive more and more resources did not resolve it. Disabling statistics mostly solved our issue, but it is really strange.

Where can you disable the statistics for the frontend?

Taragos · Jan 27 '23 14:01

I'm having the same issue on a physical server with 2x Xeon 4210 CPUs. When I hit the "Stats" button, CPU consumption goes straight to 100%, and TheHive's memory consumption as reported by systemd is about 30 GB.

baonq-me · Feb 06 '23 10:02

I am having the same issue... it will eventually use all CPUs at 100% and then just stop responding. I have to kill the process to get it working again.

TheHive: 8 CPU, 32 GB RAM
3x Elastic: 8 CPU, 32 GB RAM each
3x Cassandra: 8 CPU, 32 GB RAM each

bhjella-awake · Jan 11 '24 22:01

From observations... Elastic gets to around 600% CPU utilization, then drops, then TheHive gets to 300% and stays there. The same thing happens again: Elastic hits 600%, then TheHive jumps to 600% about 20 minutes later. Then Elastic hits 600% again, TheHive maxes out at 800%, and everything freezes.

It feels like there is some thread that doesn't time out and just spins forever.

bhjella-awake · Jan 12 '24 18:01

> From observations... Elastic gets to around 600% CPU utilization, then drops, then TheHive gets to 300% and stays there. The same thing happens again: Elastic hits 600%, then TheHive jumps to 600% about 20 minutes later. Then Elastic hits 600% again, TheHive maxes out at 800%, and everything freezes.

> It feels like there is some thread that doesn't time out and just spins forever.

There is a workaround: limit the number of CPU cores consumed by Elasticsearch. By default, Elasticsearch uses all available CPUs.

To limit the CPU cores used, add this line to elasticsearch.yml:

    node.processors: 4    # allow 4 CPUs to be used
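If useful, a small sketch to confirm how many processors Elasticsearch is actually allowed to use, via the standard nodes-info API (host, port, and lack of auth are assumptions for a default local install):

    # Hedged sketch: query the Elasticsearch nodes-info API and print the
    # processor counts each node reports (available vs. allocated).
    import requests

    nodes = requests.get("http://localhost:9200/_nodes/os", timeout=10).json()
    for node_id, node in nodes["nodes"].items():
        os_info = node["os"]
        print(f'{node["name"]}: available={os_info.get("available_processors")} '
              f'allocated={os_info.get("allocated_processors")}')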

baonq-me · Jan 13 '24 04:01

Thanks, but the issue isn't that it is using CPU; it's that there is some thread that never ends and constantly consumes the process. I have narrowed it down to times when a specific user is using it. I'm going to try to determine what he is doing that causes TheHive to just churn CPU... it normally sits around 100-200% CPU during the work day, except for this one user.

bhjella-awake · Jan 15 '24 01:01

Found the problem on my side: it was a user using the "Stats" button on the cases page (screenshot omitted).

I have implemented rules on my Apache reverse proxy to return 401 for those API requests:

        RewriteEngine On

        # Block query string name=case-by-tags-stats
        RewriteCond %{QUERY_STRING} name=case-by-tags-stats [NC]
        RewriteRule ^ - [R=401,L]

        # Block query string name=case-by-status-stats
        RewriteCond %{QUERY_STRING} name=case-by-status-stats [NC]
        RewriteRule ^ - [R=401,L]

        # Block query string name=case-by-resolution-status-stats
        RewriteCond %{QUERY_STRING} name=case-by-resolution-status-stats [NC]
        RewriteRule ^ - [R=401,L]
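A quick hedged sketch for sanity-checking the rule (the instance URL is a placeholder; since the RewriteRule matches any path, any request carrying one of the blocked query strings should come back as 401):

    # Hedged sketch: send a request with one of the blocked query strings through
    # the reverse proxy and confirm it is rejected with HTTP 401.
    import requests

    resp = requests.get(
        "https://thehive.example.local/index.html",   # placeholder URL behind the proxy
        params={"name": "case-by-tags-stats"},        # blocked query string
        timeout=30,
    )
    assert resp.status_code == 401, f"expected 401, got {resp.status_code}"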

bhjella-awake · Jan 15 '24 18:01