Allowed memory size of<> bytes exhausted (MrdBallTree.php#450)

Open GHBLoos opened this issue 7 months ago • 10 comments
Which version of recognize are you using?

8.2.0

Enabled Modes

Face recognition

TensorFlow mode

Normal mode

Downstream App

Memories App

Which Nextcloud version do you have installed?

30.0.4

Which Operating system do you have installed?

Linux 4.18.0-513.5.1.el8_9.x86_64 x86_64

Which database are you running Nextcloud on?

mysql/MariaDB 10.3.39

Which Docker container are you using to run Nextcloud? (if applicable)

No response

How much RAM does your server have?

7.5GB

What processor Architecture does your CPU have?

AMD EPYC Processor (with IBPB) (4 threads)

Describe the Bug

Error in the log file (I always get two at a time). I receive them at night (probably from the cron job) and also when uploading multiple images.

[PHP] Error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 655360 bytes) at /var/www/cloud.DOMAIN.TLD/html/apps/recognize/lib/Clustering/MrdBallTree.php#450
	from ? by -- at 2 apr 2025, 20:49:43

I have been getting these errors for a long time.

My PHP info:

Version: 8.2.12
Memory limit: 1 GB
Max time: 3600
Max. upload size: 1 GB
OPcache validation frequency: 2

Extensions: Core, date, libxml, openssl, pcre, zlib, filter, hash, json, random, Reflection, SPL, session, standard, cgi-fcgi, bcmath, bz2, calendar, ctype, curl, dom, mbstring, fileinfo, ftp, gd, gettext, gmp, iconv, intl, exif, mysqlnd, PDO, Phar, posix, shmop, SimpleXML, sockets, sodium, sqlite3, sysvmsg, sysvsem, sysvshm, tokenizer, xml, xmlwriter, xsl, mysqli, pdo_mysql, pdo_sqlite, xmlreader, zip, apcu, igbinary, imagick, msgpack, redis, Zend OPcache

Extra info, probably not related:

Image

Also, after manually downloading the models, this status keeps returning.

Expected Behavior

I don't expect an error. I have no clue whether this means the image is not analysed.

To Reproduce

  1. Uploading images (I think it's always with multiple images)
  2. cron job at night

Debug log

No response

GHBLoos avatar Apr 02 '25 19:04 GHBLoos

Hello @GHBLoos It appears this is an issue with the clustering batch size and the PHP memory limit. Could you try increasing the PHP memory limit to 2GB perhaps? If this is not an option you can try running the occ recognize:cluster-faces command with the --batch parameter set to a number lower than 10000 (which is the default for the cron job), e.g. try occ recognize:cluster-faces -b 5000.

marcelklehr avatar Apr 07 '25 11:04 marcelklehr

"Could you try increasing the PHP memory limit to 2GB perhaps?"

Unfortunately, that did not work. I will try the --batch parameter soon.

GHBLoos avatar Apr 07 '25 19:04 GHBLoos

@02:15

occ recognize:cluster-faces -b 5000
[root@<srv> html]# sudo -u apache php occ recognize:cluster-faces -b 5000
Clustering face detections for user <>
ClusterDebug: Retrieving face detections for user <>
ClusterDebug: Not enough face detections found
Clustering face detections for user <>
ClusterDebug: Retrieving face detections for user <>
...
Clustering face detections for user <>
ClusterDebug: Retrieving face detections for user <>
ClusterDebug: Found 2087 fresh detections. Adding 1250 old detections and 1663 sampled detections from already existing clusters. Calculating clusters on 5000 detections.
ClusterDebug: Clustering complete. Total num of clustered detections: 1245
Clustering face detections for user <>
ClusterDebug: Retrieving face detections for user <>
ClusterDebug: Found 500 fresh detections. Adding 1250 old detections and 3310 sampled detections from already existing clusters. Calculating clusters on 5060 detections.
ClusterDebug: Clustering complete. Total num of clustered detections: 87
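
The ClusterDebug lines above break each batch into fresh, old, and sampled detections. A minimal Python sketch for checking that the three parts add up to the reported total follows. The function name `batch_composition` is hypothetical and the regular expression only assumes the wording seen in this output, not anything guaranteed by Recognize.

```python
import re

# Pattern based solely on the ClusterDebug wording quoted in this thread (an assumption).
_BATCH_LINE = re.compile(
    r"Found (\d+) fresh detections\. "
    r"Adding (\d+) old detections and (\d+) sampled detections "
    r"from already existing clusters\. "
    r"Calculating clusters on (\d+) detections\."
)


def batch_composition(line):
    """Return the batch breakdown as a dict, or None if the line does not match."""
    match = _BATCH_LINE.search(line)
    if match is None:
        return None
    fresh, old, sampled, reported = (int(value) for value in match.groups())
    return {
        "fresh": fresh,
        "old": old,
        "sampled": sampled,
        "reported_total": reported,
        "sum": fresh + old + sampled,
    }


# The two batch lines from the output above.
SAMPLE_LINES = [
    "ClusterDebug: Found 2087 fresh detections. Adding 1250 old detections and "
    "1663 sampled detections from already existing clusters. "
    "Calculating clusters on 5000 detections.",
    "ClusterDebug: Found 500 fresh detections. Adding 1250 old detections and "
    "3310 sampled detections from already existing clusters. "
    "Calculating clusters on 5060 detections.",
]

if __name__ == "__main__":
    for sample in SAMPLE_LINES:
        parts = batch_composition(sample)
        print(parts["sum"], parts["reported_total"])
```

In both runs the parts sum to the reported total (5000 and 5060). The second total is slightly above the requested `-b 5000`; the thread does not explain why.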

@03:08

PHP Fatal error:  Allowed memory size of 3221225472 bytes exhausted (tried to allocate 655360 bytes) in /var/www/<CLOUDDOMAIN>/html/apps/recognize/lib/Clustering/MrdBallTree.php on line 448
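
The byte counts in these fatal errors are easier to compare in MiB. A minimal Python sketch for that conversion follows. The helper names `parse_memory_error` and `to_mib` are hypothetical, and the pattern only assumes the message wording quoted in this thread.

```python
import re

# Matches the two size figures in PHP's "Allowed memory size ... exhausted" message.
_MEMORY_ERROR = re.compile(
    r"Allowed memory size of (\d+) bytes exhausted \(tried to allocate (\d+) bytes\)"
)


def parse_memory_error(message):
    """Return (limit_bytes, requested_bytes), or None if the message does not match."""
    match = _MEMORY_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# The two error messages reported in this thread (paths omitted).
SAMPLE_MESSAGES = [
    "Allowed memory size of 1073741824 bytes exhausted (tried to allocate 655360 bytes)",
    "Allowed memory size of 3221225472 bytes exhausted (tried to allocate 655360 bytes)",
]

if __name__ == "__main__":
    for sample in SAMPLE_MESSAGES:
        limit, requested = parse_memory_error(sample)
        print(f"limit = {to_mib(limit):.0f} MiB, failed allocation = {to_mib(requested):.3f} MiB")
```

The first message corresponds to a 1024 MiB limit and the second to 3072 MiB. In both cases the allocation that failed was only 0.625 MiB, so the process was already at the limit when it failed.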

GHBLoos avatar Apr 08 '25 19:04 GHBLoos

Can I manually change the cron job? Can you give a hint where to find the cron job file for Recognize?

GHBLoos avatar Apr 09 '25 16:04 GHBLoos

You can manually change this line: https://github.com/nextcloud/recognize/blob/main/lib/BackgroundJobs/ClusterFacesJob.php#L23

marcelklehr avatar Apr 10 '25 08:04 marcelklehr

I will test this and let you know the results. I suppose it will be overwritten every time there is an update of the recognize app?

GHBLoos avatar Apr 10 '25 08:04 GHBLoos

Yes, that's the drawback. From my tests, AFAIR, a batch size of 10k should be fine for the recommended memory limits, though. I'm not sure why that's different in your case.

marcelklehr avatar Apr 10 '25 09:04 marcelklehr

Can I provide you any info from my server to figure out the reason?

My first impression is that the new batch size seems to work, but I will monitor it over the next few days. If it works, it would be great if this batch size could become a setting.

GHBLoos avatar Apr 10 '25 10:04 GHBLoos

I haven't received the initial error, but I did receive a lot of new errors. And when checking the 'faces' in the photo app, everything seems mixed up again. It looks like the faces that were manually combined have been separated again.

{"reqId":"EUfg89Yj1hFyJZFTPfD8","level":2,"time":"2025-04-11T01:11:07+00:00","remoteAddr":"","user":"--","app":"cron","method":"","url":"--","message":"Cron job used more than 300 MB of ram after executing job OCA\\Recognize\\BackgroundJobs\\ClusterFacesJob (id: 730433, arguments: {\"userId\":\"USER\"}): 746.6 MB (before: 10.8 MB)","userAgent":"--","version":"30.0.8.1","data":{"app":"cron"}}
{"reqId":"Q4tI3XPnvnts4FBHvJT6","level":2,"time":"2025-04-11T01:18:56+00:00","remoteAddr":"","user":"--","app":"cron","method":"","url":"--","message":"Cron job used more than 300 MB of ram after executing job OCA\\Recognize\\BackgroundJobs\\ClusterFacesJob (id: 730435, arguments: {\"userId\":\"USER\"}): 748 MB (before: 47.9 MB)","userAgent":"--","version":"30.0.8.1","data":{"app":"cron"}}

This error appears thousands of times

{"reqId":"BS2GeHks3Wwai7F4kPvO","level":3,"time":"2025-04-12T03:42:01+00:00","remoteAddr":"","user":"--","app":"PHP","method":"","url":"--","message":"Undefined variable $nodeDistance at /var/www/<cloud.domain>/html/apps/recognize/lib/Clustering/MrdBallTree.php#192","userAgent":"--","version":"30.0.8.1","data":{"app":"PHP"}}

GHBLoos avatar Apr 12 '25 18:04 GHBLoos

The $nodeDistance error is a known issue which has been hard to solve, see #965

The Cron job used more than 300 MB of ram warning can be ignored as the high RAM usage is expected.

marcelklehr avatar Apr 14 '25 06:04 marcelklehr

"The Cron job used more than 300 MB of ram warning can be ignored as the high RAM usage is expected."

I believe that when Nextcloud's AIO is used, this memory limit is actually enforced somehow, because the warning is followed by a fatal log entry:

Fatal | core | Request used more than 300 MB of RAM: 639.1 MB

Croydon avatar Oct 02 '25 04:10 Croydon

I get this error twice every night (when Recognize runs), consistently. I have raised the memory limit with -e PHP_MEMORY_LIMIT=1536M and it still runs out. This is silly, right? I don't have infinite RAM to dedicate to this when it could run smaller batches. Or is there a downside to doing 500 at a time instead of 10k?

ThaChillera avatar Nov 12 '25 20:11 ThaChillera

Hi @ThaChillera Yes, a smaller batch size would be the solution. The risk with smaller batch sizes is that the clustering quality worsens, so I wouldn't make the batch size as small as possible. Perhaps try with 8k or 5k first. Please report back here how it goes, then we can adjust this permanently in the code 🙏

marcelklehr avatar Nov 13 '25 07:11 marcelklehr