
[Bug] Cortex analyzer job status not updated in Thehive

Open · martinr103 opened this issue 2 years ago · 1 comment

Request Type

Bug

Work Environment

Question Answer
OS version (server) Debian
TheHive version / git hash 4.1.21 (and earlier)
Cortex version 3.0.1
Package Type Binary (in Docker)
Database Cassandra
Index type Lucene
Attachments storage Local

Problem Description

There seems to be a glitch in the interaction between TheHive and Cortex. It does not always happen, but especially when multiple analyzer jobs are submitted simultaneously, some of the started jobs (roughly 10-15%?) never receive a status update in TheHive. These jobs remain in status "Waiting" or "InProgress" in TheHive forever, even though the very same jobs are tracked as completed (even with Success) inside Cortex itself.

Example:

TH API call to check the job status (even after a few minutes) :

$ curl -k -u xxxxxxx https://localhost:9443/api/connector/cortex/job/~259563688?name=observable-jobs
{
  "_type": "case_artifact_job",
  "analyzerId": "bb66101567a040ee22839a5a821ecb8e",
  "analyzerName": "OTXQuery_2_0",
  "analyzerDefinition": "OTXQuery_2_0",
  "status": "Waiting",
  "startDate": 1660069434170,
  "endDate": 1660069434170,
  "cortexId": "prod cortex",
  "cortexJobId": "AYKD2VM73l414zXlQvLs",
  "id": "~259563688",
  "operations": "[]"
}

"status":"Waiting" (bad, not synced from Cortex)

Cortex API call to check the very same job:

$ curl -k -H 'Authorization: Bearer ...............' 'https://localhost:9444/api/job/AYKD2VM73l414zXlQvLs'
{
  "date": 1660069434170,
  "data": "adition.com",
  "endDate": 1660069441029,
  "type": "analyzer",
  "cacheTag": "d08fc23b319c29c9dc9acededbc145e0",
  "createdAt": 1660069434170,
  "_parent": null,
  "id": "AYKD2VM73l414zXlQvLs",
  "_version": 3,
  "pap": 2,
  "updatedAt": 1660069441029,
  "_routing": "AYKD2VM73l414zXlQvLs",
  "workerId": "bb66101567a040ee22839a5a821ecb8e",
  "updatedBy": "xxxxxx",
  "analyzerName": "OTXQuery_2_0",
  "analyzerDefinitionId": "OTXQuery_2_0",
  "dataType": "domain",
  "_type": "job",
  "message": "185562",
  "analyzerId": "bb66101567a040ee22839a5a821ecb8e",
  "createdBy": "xxxxxxx",
  "organization": "xxxxxxx",
  "tlp": 0,
  "workerDefinitionId": "OTXQuery_2_0",
  "_id": "AYKD2VM73l414zXlQvLs",
  "workerName": "OTXQuery_2_0",
  "parameters": {
    "organisation": "xxxxxxxx",
    "user": "n8n@xxxxxxxxx"
  },
  "startDate": 1660069434791,
  "status": "Success"
}

"status":"Success" (ok)

(Incidentally, if you view the status of the analyzer execution in the TheHive UI, under Case / Observables, you will see a "spinning wheel" forever.)
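A small script can cross-check each job against both APIs to spot entries stuck in this state. This is a rough sketch: the two endpoints are the ones queried above, but the hosts, ports, auth headers, and self-signed-certificate handling are assumptions you will need to adapt to your deployment.

```python
import json
import ssl
import urllib.request

TH_URL = "https://localhost:9443"      # assumption: adjust to your setup
CORTEX_URL = "https://localhost:9444"  # assumption: adjust to your setup

# Equivalent of `curl -k`: skip certificate verification for local testing only.
_CTX = ssl._create_unverified_context()

def is_stuck(thehive_status: str, cortex_status: str) -> bool:
    """A job is 'stuck' when Cortex has finished it but TheHive still shows it pending."""
    pending = {"Waiting", "InProgress"}
    finished = {"Success", "Failure"}
    return thehive_status in pending and cortex_status in finished

def fetch_json(url: str, headers: dict) -> dict:
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, context=_CTX) as resp:
        return json.load(resp)

def check_job(th_job_id: str, th_headers: dict, cortex_headers: dict) -> bool:
    """Fetch the job from TheHive, then look up the same job in Cortex by cortexJobId."""
    th = fetch_json(f"{TH_URL}/api/connector/cortex/job/{th_job_id}?name=observable-jobs", th_headers)
    cx = fetch_json(f"{CORTEX_URL}/api/job/{th['cortexJobId']}", cortex_headers)
    return is_stuck(th["status"], cx["status"])
```

In the example above, `check_job("~259563688", ...)` would return True, since TheHive reports "Waiting" while Cortex reports "Success".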

Steps to Reproduce

The goal is to start a number of analyzer jobs (10-15 or so) in very quick succession, ideally via the API.

  1. Create a Case with a few Observables
  2. Prepare POST data for some analyzer executions
  3. Submit a number of POST requests to the /connector/cortex/job API endpoint, shortly one after another (scripted)
  4. Observe the status of the executed jobs
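The steps above can be scripted roughly as follows. The endpoint is the one named in step 3; the payload field names follow TheHive 4's Cortex connector as I understand it, and the IDs and API key are placeholders, so verify both against your own instance.

```python
import json
import urllib.request

def build_job_payload(analyzer_id: str, cortex_id: str, artifact_id: str) -> dict:
    # Assumed field names for TheHive 4's Cortex connector; verify against your version.
    return {"analyzerId": analyzer_id, "cortexId": cortex_id, "artifactId": artifact_id}

def submit_jobs(base_url: str, api_key: str, payloads: list) -> None:
    """Fire the POSTs back-to-back with no delay between them to reproduce the race."""
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    for payload in payloads:
        req = urllib.request.Request(
            f"{base_url}/api/connector/cortex/job",
            data=json.dumps(payload).encode(),
            headers=headers,
            method="POST",
        )
        urllib.request.urlopen(req)  # fire-and-forget; check job statuses afterwards
```

Submitting a dozen payloads this way (one per observable/analyzer pair from steps 1-2) and then polling the job statuses should surface the stuck "Waiting"/"InProgress" entries.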

martinr103 avatar Aug 11 '22 08:08 martinr103

Hello @martinr103. Please check the Cassandra logs. I had a similar problem and fixed it.

In the Cassandra logs, I found:

Caused by: org.apache.cassandra.transport.messages.ErrorMessage$WrappedException: org.apache.cassandra.exceptions.InvalidRequestException: Request is too big: length 22802476 exceeds maximum allowed length 16777216.

When I set

commitlog_segment_size to 64MiB
native_transport_max_frame_size to 32MiB

the analyzer jobs updated successfully.
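For reference, both settings live in cassandra.yaml. The unit-suffixed names below are the Cassandra 4.1+ syntax; older releases use the `commitlog_segment_size_in_mb` / `native_transport_max_frame_size_in_mb` variants with plain integer values. Pick values appropriate for your workload; these are just the ones reported to work above.

```yaml
# cassandra.yaml (Cassandra 4.1+ naming; older versions use the *_in_mb variants)
commitlog_segment_size: 64MiB
native_transport_max_frame_size: 32MiB
```

Restart Cassandra after changing these for the new limits to take effect.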

dream91 avatar Mar 05 '24 14:03 dream91