self-hosted
Sentry periodically causes high CPU utilisation and a high load average (LA).
Self-Hosted Version
20.11
CPU Architecture
x86_64
Docker Version
23.0.2
Docker Compose Version
2.17.2
Steps to Reproduce
- Just install Sentry and wait
- Notice high LA and CPU load, which leads to problems with Redis connectivity and container reloads.
Expected Result
Expected normal CPU load and no problems with container/service connectivity.
Actual Result
The server: 4 vCPU, 8 GB RAM, 200 GB HDD, Rocky Linux 9.2. At the time of the peak (around 8:30 am) Sentry stopped working. I've attached the logs.
When I logged into the server, I saw that the celeryd process was heavily utilising the CPU. This is not the first time this has happened; is there anything you can recommend?
Event ID
No response
I notice you are still on 20.11, which is almost 3 years old at this point, so our first recommendation would be to upgrade. While I couldn't find a fix for this specific issue in the mainline, high CPU usage comes up frequently in the mainline Sentry repo as well, so I'd imagine we've made improvements in this area over the last 3 years. There is a tradeoff here, though: we have added more containers in that timespan, so the baseline resource usage may actually increase a bit.
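If it helps, this is roughly the documented upgrade flow for self-hosted; the checkout path and version tag below are placeholders, and you should take backups and step through any intermediate hard-stop releases rather than jumping straight from 20.11 to the latest:

```bash
# Rough sketch only: back up your Postgres/ClickHouse volumes first, and consult
# the self-hosted release notes for any required intermediate "hard stop" versions.
cd /path/to/self-hosted             # placeholder path to your getsentry/self-hosted checkout
git fetch --tags
git checkout <target-version-tag>   # placeholder; pick a release tag from GitHub
./install.sh
docker compose up -d
```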
Hey guys, any news regarding this? I am facing the same issue: the sentry-worker and clickhouse containers take all the CPU and lead to a high load, even though I don't have that many errors coming in. Thanks a lot in advance.
Sentry version: 23.3.1
Are you also seeing load spikes periodically, or just a high baseline load?
I get load spikes every day during working hours, 7-8 hours per day: the load goes very high for 20-30 seconds, comes back to normal for 5-10 seconds, then goes high again for another 20-30 seconds, and this repeats all day long.
Do you have any visibility on which of the docker containers is spiking? That will help us route the problem to the right team.
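One quick way to get that visibility is to sample `docker stats` while a spike is in progress; a minimal sketch below (the one-minute interval and log file name are just illustrative):

```bash
# One-off snapshot of per-container CPU/memory; the container names map back to
# the services in docker-compose.yml.
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Or sample once a minute during working hours so the output can be lined up
# against the spike times later.
while true; do
  { date; docker stats --no-stream; } >> sentry-cpu-samples.log
  sleep 60
done
```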
What amount of CPU/RAM are you using? We cannot guarantee that self-hosted Sentry runs well on every setup. This will probably be left on the backlog to investigate.
Initially we had an instance with 4 CPUs and 16 GB RAM, but I had to upgrade it to 8 CPUs and 32 GB RAM to handle a load of 50-55k events/week, which I would guess is too low a volume for the server to be struggling with.
Thanks for the datapoint
I have the same problem. A server with 4 CPU cores and 8 GB RAM seems to be insufficient. Do I need to improve the server configuration, or can I optimize this issue at the code level?
You could disable some features you don't use and tweak the docker-compose.yml file to improve your resource usage. For example, if you're not using replays you can disable everything related to that. However, whatever you decide to disable would be specific to your own use case.
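As a minimal sketch of that idea, assuming the replay-related services in your docker-compose.yml are named something like `ingest-replay-recordings` and `snuba-replays-consumer` (the names vary between releases, so check your own file first):

```bash
# Hypothetical service names -- verify them against `docker compose config --services`
# before running this. Scaling a service to 0 starts none of its containers.
docker compose up -d \
  --scale ingest-replay-recordings=0 \
  --scale snuba-replays-consumer=0
```

The same pattern applies to other optional features' consumers; the tradeoff is that those features simply stop ingesting data.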
> I get load spikes every day during working hours, 7-8 hours per day: the load goes very high for 20-30 seconds, comes back to normal for 5-10 seconds, then goes high again for another 20-30 seconds, and this repeats all day long.
I'm seeing the same strange behavior, with peaks occurring at moments when there are no logs at all.