richdocumentscode
Memory leak in coolwsd
I am using
- Nextcloud 23.0.3
- richdocuments 5.0.3
- richdocumentscode 21.11.204
This is a server with around 50-100 active users.
Problem:
After around 5 days of continuous running, the coolwsd process holds about 30 GB of resident memory and does not release it. By this time, Collabora Office is either totally unresponsive (no documents open at all) or a document opens but after 10 seconds it says "connection to server lost" and kicks you back to the folder view. According to the logs, the "75% of RAM taken" alert had already fired a day earlier, so the memory consumption of coolwsd rises steadily over time. In the end, the OS OOM killer is triggered and kills the whole apache process tree, as the originator of the coolwsd process. This last action also kills all other NC services, but Collabora is already unusable a good day before the actual kill takes place.
The above happens regularly with 204. It did not happen with previous releases. I have just upgraded to 306; we will see how it performs. I suspect there is a memory leak somewhere in coolwsd. It is important to note that only the coolwsd process shows increased memory consumption, not the other apache-related processes, nor the CollaboraOnline... process.
I seem to have a similar problem: Debian 11 server, Nextcloud 23.0.3, richdocumentscode 21.11.306, nginx as web server with php-fpm.
It now uses around 37% of the memory, but I can see it has risen over the last day. I have had this curve for the last two weeks.
After a week, I can also say that richdocumentscode 21.11.306 is affected the very same way unfortunately. It seems that the service starts to not respond way before OOM-killer finds it, so this issue definitely causes outage in the service. My only workaround for this is to automatically restart the service every night.
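For anyone who would rather restart only when memory is actually high instead of every night, here is a minimal Python sketch of the check. It assumes a Linux host where the text of /proc/<pid>/status for the coolwsd process has already been read; the function names and the 8 GiB limit are made up for illustration and are not part of coolwsd or Nextcloud.

```python
import re

def rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def should_restart(status_text, limit_gib):
    """True when the resident memory exceeds limit_gib (hypothetical threshold)."""
    rss = rss_kib(status_text)
    return rss is not None and rss > limit_gib * 1024 * 1024

if __name__ == "__main__":
    # Sample text shaped like /proc/<pid>/status; 31457280 kB is 30 GiB.
    sample = "Name:\tcoolwsd\nVmRSS:\t31457280 kB\n"
    print(should_restart(sample, 8))
```

A cron job could call something like this and trigger the service restart only when it returns True; how the restart itself is done depends on the setup.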
Version 21.11.402 is affected the same way.
Same thing here with Nextcloud 24.0.1, Nextcloud Office 6.1.1, Collabora Online - Built-in CODE Server 22.5.401. The memory usage of coolwsd increases for no apparent reason. I monitored the memory usage of the server and here is the result:
So after only 5 days 37 GB are used in total. The total memory usage of the server is back to 2 GB after a restart of apache.
I monitored the memory usage further and added a daily restart of apache (the drops in memory usage are the restarts):
The memory usage still increases even with the daily restart, although more slowly. So it seems to be a real memory leak.
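As a rough back-of-the-envelope check of the figures above (about 2 GB after a restart, 37 GB after 5 days), the average growth can be worked out as below. This is only a sketch: it assumes linear growth and that usage started near the 2 GB post-restart figure, and the 64 GB RAM size is a hypothetical value, not something reported in this thread.

```python
def growth_rate(start_gb, end_gb, days):
    """Average growth in GB per day between two measurements."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (end_gb - start_gb) / days

def days_until_limit(limit_gb, current_gb, rate_gb_per_day):
    """Days until limit_gb is reached at a constant rate; None if usage is not growing."""
    if rate_gb_per_day <= 0:
        return None
    return max(0.0, (limit_gb - current_gb) / rate_gb_per_day)

if __name__ == "__main__":
    rate = growth_rate(2, 37, 5)           # figures from the comment above
    print(rate)                            # GB per day
    print(days_until_limit(64, 2, rate))   # 64 GB is an assumed RAM size
```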
Reporting similar behaviour with:
richdocuments 6.2.1, richdocumentscode 22.5.502
Running Nextcloud 24.0.6 with 4GB of RAM.
Ubuntu 20.04.5 LTS, nginx/1.18.0 (Ubuntu), PHP 8.0.14 fpm, psql (PostgreSQL) 12.12
coolwsd memory usage slowly increases over several days until it becomes unresponsive. Grafana screenshot, for example. The end of the spike is when the PHP service was restarted.
Within one month it stops working in a 6 GB virtual machine, so there is definitely a memory leak. I am using 22.05.8.2; it was not that bad in previous versions. I have put some restarts into crontab.
I don't know what happened here, but it clearly looks like this is not a gradual leak, but something causing it to steadily allocate until… at some point, oomkiller jumped to the rescue…
Same problem here, but just with the collabora server from the nextcloud app store. Servers with external collabora server do not face this issue.
Servers with external collabora server do not face this issue.
@NetBLOKS Which version of Collabora do you run (Docker?), which NC version, and what is the setup procedure (nginx with php-fpm)? I always had these issues no matter what, but it was a few months prior.
Which version of Collabora: the App Store version (Collabora Online - Built-in CODE Server). Which NC version: it happens in 24 and 25; I have the latest, 25.0.4. Setup procedure: manual install, Debian 11, Apache, PHP7.4-FPM.
Can confirm this is still an issue on the following stack:
Ubuntu 24.04 LTS, Nginx 1.24, PHP 8.3-FPM, Nextcloud 29.0.0, richdocuments: 8.4.2, richdocumentscode_arm64: 24.4.201
- Nextcloud version: 29.0.4.1
- Red Hat Enterprise Linux release 8.10 (Ootpa)
- 10.6.18-MariaDB
- Apache/2.4.37
- PHP 8.3.10
- richdocuments: 8.4.4
- richdocumentscode: 24.4.502
Since upgrading to Nextcloud 29.0.4.1 and from PHP 8.2 to PHP 8.3, the Nextcloud server is almost crashing, because php-fpm is consuming all the space in /tmp ==>
32G /tmp/systemd-private-4c513a85a5cb462b92e805310c385d9e-php-fpm.service-r9PDnv/tmp/coolwsd.LNJU02GnN5/jails/18443-d61991e6
39G /tmp/systemd-private-4c513a85a5cb462b92e805310c385d9e-php-fpm.service-r9PDnv/tmp/coolwsd.LNJU02GnN5
After restarting PHP-FPM Service the files were removed from /tmp directory.
Here you can see that coolwsd eats up all the space in the server's /tmp directory in a short time:
2024.08.04 03:15:02 - Space ok 18% /dev/mapper/server-root --Mount-- /
2024.08.04 03:30:01 - Space ok 19% /dev/mapper/server-root --Mount-- /
2024.08.04 03:45:01 - Space ok 20% /dev/mapper/server-root --Mount-- /
2024.08.04 04:15:03 - Space ok 22% /dev/mapper/server-root --Mount-- /
2024.08.04 04:30:03 - Space ok 23% /dev/mapper/server-root --Mount-- /
2024.08.04 05:00:01 - Space ok 24% /dev/mapper/server-root --Mount-- /
2024.08.04 05:15:02 - Space ok 25% /dev/mapper/server-root --Mount-- /
2024.08.04 05:30:01 - Space ok 26% /dev/mapper/server-root --Mount-- /
2024.08.04 05:45:02 - Space ok 27% /dev/mapper/server-root --Mount-- /
2024.08.04 06:15:01 - Space ok 28% /dev/mapper/server-root --Mount-- /
2024.08.04 06:30:04 - Space ok 32% /dev/mapper/server-root --Mount-- /
2024.08.04 07:00:48 - Space ok 33% /dev/mapper/server-root --Mount-- /
2024.08.04 07:30:01 - Space ok 34% /dev/mapper/server-root --Mount-- /
2024.08.04 07:45:01 - Space ok 35% /dev/mapper/server-root --Mount-- /
2024.08.04 08:00:01 - Space ok 36% /dev/mapper/server-root --Mount-- /
2024.08.04 08:15:02 - Space ok 37% /dev/mapper/server-root --Mount-- /
2024.08.04 08:30:01 - Space ok 38% /dev/mapper/server-root --Mount-- /
2024.08.04 08:45:01 - Space ok 39% /dev/mapper/server-root --Mount-- /
2024.08.04 09:00:02 - Space ok 40% /dev/mapper/server-root --Mount-- /
2024.08.04 09:15:01 - Space ok 41% /dev/mapper/server-root --Mount-- /
2024.08.04 09:30:01 - Space ok 42% /dev/mapper/server-root --Mount-- /
2024.08.04 09:45:02 - Space ok 43% /dev/mapper/server-root --Mount-- /
2024.08.04 10:00:01 - Space ok 44% /dev/mapper/server-root --Mount-- /
2024.08.04 11:00:03 - Space ok 46% /dev/mapper/server-root --Mount-- /
2024.08.04 11:15:01 - Space ok 47% /dev/mapper/server-root --Mount-- /
2024.08.04 11:30:01 - Space ok 49% /dev/mapper/server-root --Mount-- /
2024.08.04 11:45:01 - Space ok 51% /dev/mapper/server-root --Mount-- /
2024.08.04 12:00:02 - Space ok 54% /dev/mapper/server-root --Mount-- /
2024.08.04 12:15:01 - Space ok 55% /dev/mapper/server-root --Mount-- /
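To put a number on how fast the disk fills, the log above can be parsed and the average growth computed. The following is a minimal Python sketch that assumes the exact "Space ok NN%" line format shown in this log; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches the timestamp and percentage in lines like
# "2024.08.04 03:15:02 - Space ok 18% /dev/mapper/server-root --Mount-- /"
LINE = re.compile(r"(\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) - Space ok (\d+)%")

def parse_usage(log_text):
    """Return a list of (datetime, percent_used) tuples found in the log text."""
    return [
        (datetime.strptime(stamp, "%Y.%m.%d %H:%M:%S"), int(percent))
        for stamp, percent in LINE.findall(log_text)
    ]

def points_per_hour(samples):
    """Average growth in percentage points per hour between first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    hours = (t1 - t0).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("samples must span a positive time range")
    return (p1 - p0) / hours

if __name__ == "__main__":
    # First and last entries of the log above.
    log = (
        "2024.08.04 03:15:02 - Space ok 18% /dev/mapper/server-root --Mount-- /\n"
        "2024.08.04 12:15:01 - Space ok 55% /dev/mapper/server-root --Mount-- /\n"
    )
    print(round(points_per_hour(parse_usage(log)), 2))
```

For the first and last entries this works out to roughly 4 percentage points of the root filesystem per hour.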
Also the memory on the server is decreasing and decreasing ==>
See also those errors in php-fpm.log ==>
PHP Fatal error: Uncaught TypeError: implode(): Argument #1 ($array) must be of type array, string given in /var/www/html/nextcloud/apps/richdocumentscode/proxy.php:398
Stack trace:
#0 /var/www/html/nextcloud/apps/richdocumentscode/proxy.php(398): implode()
#1 {main}
thrown in /var/www/html/nextcloud/apps/richdocumentscode/proxy.php on line 398
[04-Aug-2024 12:08:26 richdocumentscode (proxy.php) error exit, PID: 509268, Message: No content in reply from coolwsd. Is SSL enabled in error ?
[04-Aug-2024 12:08:26] PHP Warning: http_response_code(): Cannot set response code - headers already sent (output started at /var/www/html/nextcloud/apps/richdocumentscode/proxy.php:30) in /var/www/html/nextcloud/apps/richdocumentscode/proxy.php on line 34
[04-Aug-2024 12:19:49 PHP Warning: http_response_code(): Cannot set response code - headers already sent (output started at /var/www/html/nextcloud/apps/richdocumentscode/proxy.php:285) in /var/www/html/nextcloud/apps/richdocumentscode/proxy.php on line 292
Workaround:
- [ 1 ] - restarted apache, redis and php-fpm; memory and disk space consumption are now normal again
Restarting apache solves the issue only temporarily. After a short while, coolwsd writes about 5-10 GB per hour, 100-200 GB per day!
This is a big issue, which affects the server stability
@Githopp192 This issue is about a memory leak, not disk space in /tmp.
If you read my comments and check the graph, too, you will see that I'm affected by a memory leak as well:
"Also the memory on the server is decreasing and decreasing ==>"
See the image above ...