Bug: Not able to purge correlations in WebUI
Expected behavior
When `MISP.completely_disable_correlation` is changed from false to true in the WebUI, a purge job is created, which should truncate the correlations table.
Actual behavior
The purge correlations job is created but hangs, never progressing, and the correlations table is not truncated.
WORKAROUND: I ran `truncate correlations;` in MySQL. Then, in the WebUI under Administration > Server Settings & Maintenance > Diagnostics, I scrolled down to Legacy Administrative Tools and clicked Recorrelate attributes. That job launched and is progressing, with rows being added to the table.
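For anyone repeating the workaround, the MySQL part can be sketched roughly as follows. The database name `misp` is an assumption (use whatever your install is configured with), and the final count is only a sanity check that is not part of the original steps.

```sql
-- Assumption: the MISP database is named `misp`; adjust to your install.
USE misp;

-- Empty the correlations table, i.e. what the hung purge job was expected to do.
TRUNCATE TABLE correlations;

-- Optional sanity check before clicking "Recorrelate attributes" in the WebUI:
-- the table should now be empty.
SELECT COUNT(*) FROM correlations;
```

Once the count is zero, the Recorrelate attributes job can be started from the WebUI as described above.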
Steps to reproduce
See above
Version
develop branch
Operating System
RH
Operating System version
8.4
PHP version
7.4
Browser
N/A
Browser version
No response
Relevant log output
I don't see anything in the MISP `error.log`.
Extra attachments
No response
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Note that this issue was introduced a number of versions back. I'm not sure exactly when it appeared, but I first saw it as far back as ~4/8/22 - 4/15/22.
Data point: after github-germ purged and started a rebuild of the correlations table, I still see the following artifacts.
I just ran show correlations in MySQL and still see 10 million+ correlation entries with a date of 1000-01-01:
| 10723396 | 1000-01-01 |
| 10723397 | 1000-01-01 |
| 10723398 | 1000-01-01 |
| 10723399 | 1000-01-01 |
10678796 rows in set (3.34 sec)
What does this date signify?
mysql> select count(*) from correlations where date like '1000-01-01';
+----------+
| count(*) |
+----------+
| 10678796 |
+----------+
@packet-rat That's OK, the date column is not used anymore. It is there just for backward compatibility. You can drop the date and info columns from the correlations table to save space.
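A minimal sketch of that cleanup in MySQL, assuming the two legacy columns are named `date` and `info` as they appear in this thread. Check the actual schema first and take a backup; altering a table with millions of rows may take a while.

```sql
-- Assumption: connected to the MISP database, with a recent backup available.
-- Confirm that both legacy columns exist before dropping anything.
SHOW COLUMNS FROM correlations;

-- Drop the columns that are kept only for backward compatibility.
ALTER TABLE correlations
  DROP COLUMN `date`,
  DROP COLUMN `info`;
```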
Glad to see info no longer in use. Thanks!
Really appreciate you fixing this issue so fast :-)
Question: why do we see correlations rows going from 19M to 10M after a rebuild, i.e. what is causing what appears to be so many orphans? Should we be adding Remove orphaned correlations (in WebUI: Administration > Server Settings & Maintenance > Diagnostics) to our SOP, and if so, is there a method to add that task as a cron job?
The fix in develop branch works well. THANKS.
Would appreciate any insight you can provide on my question above.
> Question: why do we see correlations rows going from 19M to 10M after a rebuild, i.e. what is causing what appears to be so many orphans? Should we be adding Remove orphaned correlations (in WebUI: Administration > Server Settings & Maintenance > Diagnostics) to our SOP, and if so, is there a method to add that task as a cron job?
I changed how correlations are generated, so it should now generate fewer duplicates.
Great!