Optimize memory usage of WhatsApp merge backups feature

Open hauck-jvsh opened this issue 1 year ago • 12 comments

This out-of-memory error occurs when parsing the msgstore databases. There are several huge msgstore files, approximately 2 GB each. The problem does not occur when I disable the feature with <param name="recoverDeletedRecords" type="bool">false</param>. I have a thread dump, but I can't figure out what the problem is.

hauck-jvsh avatar Apr 17 '23 23:04 hauck-jvsh

Isn't this a duplicate of #1364?

lfcnassif avatar Apr 18 '23 00:04 lfcnassif

Could you check if the database schema is the old one (with the messageS table, not the message one)?

lfcnassif avatar Apr 18 '23 16:04 lfcnassif

I checked and the database is using the old schema, but some of the DBs are corrupted.

hauck-jvsh avatar Apr 18 '23 16:04 hauck-jvsh

Maybe it is caused by the recover records feature, maybe by mergeDBs, maybe by both together, or by something else. Only the heap dump will tell us...

lfcnassif avatar Apr 18 '23 17:04 lfcnassif

It finishes successfully with mergeDB enabled and recover records disabled. I can try with recover enabled and merge disabled, as it may be a problem with merging corrupted messages.

hauck-jvsh avatar Apr 18 '23 17:04 hauck-jvsh

What do you use to analyze the hprof? I tried Java VisualVM, and it has been computing retained sizes since yesterday.

hauck-jvsh avatar Apr 18 '23 17:04 hauck-jvsh

Eclipse Memory Analyzer plugin, it is much, much better. You just need to increase Eclipse's Xmx value.

lfcnassif avatar Apr 18 '23 17:04 lfcnassif

Hi @hauck-jvsh, could you confirm if this is a duplicate of #1364 or not?

lfcnassif avatar Apr 26 '23 11:04 lfcnassif

I cannot confirm, because in this case the DB is the old one. After digging into it, I found that the problem is probably a combination of both the merge and the undelete: the undelete recovers a lot of items, and all items found in the several DBs are held in memory for the merge. This may be solved at the cost of processing time: instead of processing all DBs in parallel, I can change it to process only the master DB first and then process and merge each backup sequentially. This will reduce the memory requirements, but all DBs will be processed sequentially. What do you think?
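A minimal sketch of the sequential idea, under stated assumptions: `Msg`, `mergeSequentially`, and the `parser` function are hypothetical names for illustration and are not IPED classes or APIs. Messages are assumed to be identifiable by a string id, and the main DB's copy is assumed to win when the same id appears in a backup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class SequentialMergeSketch {

    /** Hypothetical stand-in for a parsed message; not an IPED class. */
    static final class Msg {
        final String id;
        final String text;

        Msg(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    /**
     * Parses the main DB first, then parses and merges one backup at a time.
     * Besides the merged result, only one backup's parsed messages are
     * reachable at any moment, instead of all backups at once.
     */
    static <D> List<Msg> mergeSequentially(D mainDb, List<D> backups,
                                           Function<D, List<Msg>> parser) {
        Map<String, Msg> merged = new LinkedHashMap<>();
        for (Msg m : parser.apply(mainDb)) {
            merged.put(m.id, m);
        }
        for (D backup : backups) {
            List<Msg> parsed = parser.apply(backup);
            for (Msg m : parsed) {
                // Assumption: keep the main DB's copy when ids collide.
                merged.putIfAbsent(m.id, m);
            }
            // 'parsed' goes out of scope here, so it can be garbage collected
            // before the next backup is parsed.
        }
        return new ArrayList<>(merged.values());
    }
}
```

The trade-off is the one described above: peak memory is bounded by the merged result plus a single backup, but the backups are no longer parsed concurrently.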

hauck-jvsh avatar Apr 26 '23 23:04 hauck-jvsh

Is it possible to use a mixed approach? If the estimated memory usage is less than x% of the max heap memory, process the DBs in parallel; if it is greater than x%, process them sequentially.
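The decision could look roughly like the sketch below. It is only an illustration: `MergeStrategySketch`, `Mode`, and `choose` are hypothetical names, the threshold fraction stands in for the unspecified "x%", and the memory estimate is taken as an input because how to compute it is still an open question. `Runtime.getRuntime().maxMemory()` is the standard JVM call for the maximum heap size.

```java
final class MergeStrategySketch {

    enum Mode { PARALLEL, SEQUENTIAL }

    /**
     * Picks parallel processing only when the estimate is strictly below
     * the given fraction of the max heap; otherwise falls back to sequential.
     */
    static Mode choose(long estimatedBytes, long maxHeapBytes, double maxHeapFraction) {
        if (maxHeapFraction <= 0.0 || maxHeapFraction > 1.0) {
            throw new IllegalArgumentException("maxHeapFraction must be in (0, 1]");
        }
        double limit = maxHeapBytes * maxHeapFraction;
        return estimatedBytes < limit ? Mode.PARALLEL : Mode.SEQUENTIAL;
    }

    /** Convenience overload that reads the max heap of the running JVM. */
    static Mode choose(long estimatedBytes, double maxHeapFraction) {
        return choose(estimatedBytes, Runtime.getRuntime().maxMemory(), maxHeapFraction);
    }
}
```

An estimate exactly at the threshold is treated as sequential here, which is the more conservative choice for memory.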

Since we already have an open issue to optimize memory usage by WhatsApp records undelete, I'm changing the title of this issue to focus on the memory usage of the merge DBs feature.

lfcnassif avatar May 03 '23 12:05 lfcnassif

If it is a simple guess, like the size of the DB, I think it could be done. I can run some tests to compare heap usage against DB size. What do you think?
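As a rough sketch of such a size-based guess: the estimate below is simply the total DB size multiplied by a factor. `HeapEstimateSketch`, `estimateHeapBytes`, and `heapBytesPerDbByte` are hypothetical names, and the factor is an assumption that would have to come from the heap usage vs. DB size tests; no value is measured here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class HeapEstimateSketch {

    /**
     * Estimates heap usage as (sum of DB sizes) * heapBytesPerDbByte.
     * The factor is a placeholder to be calibrated from real measurements.
     */
    static long estimateHeapBytes(long[] dbSizesBytes, double heapBytesPerDbByte) {
        long total = 0L;
        for (long size : dbSizesBytes) {
            total = Math.addExact(total, size); // fail loudly on overflow
        }
        return (long) Math.ceil(total * heapBytesPerDbByte);
    }

    /** Same estimate, reading the sizes from the DB files on disk. */
    static long estimateHeapBytes(List<Path> dbFiles, double heapBytesPerDbByte)
            throws IOException {
        long[] sizes = new long[dbFiles.size()];
        for (int i = 0; i < sizes.length; i++) {
            sizes[i] = Files.size(dbFiles.get(i));
        }
        return estimateHeapBytes(sizes, heapBytesPerDbByte);
    }
}
```

A linear factor is the simplest possible model; if the tests show that heap usage depends more on the number of recovered records than on file size, a different estimate would be needed.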

hauck-jvsh avatar May 03 '23 15:05 hauck-jvsh

I think it is a reasonable approach.

lfcnassif avatar May 03 '23 20:05 lfcnassif