Added memory requests to be more realistic.
This helps Kubernetes make better scheduling decisions and communicates the minimum sensible resource requirements to administrators.
On a single-user Mastodon instance running on a three-node Kubernetes cluster, after a week or so of use, we see the following memory usage per pod:
```
tero@arcones:~$ kubectl top pods -n mastodon
NAME                                           CPU(cores)   MEMORY(bytes)
mastodon-elasticsearch-coordinating-0          6m           403Mi
mastodon-elasticsearch-coordinating-1          28m          189Mi
mastodon-elasticsearch-data-0                  10m          1432Mi
mastodon-elasticsearch-data-1                  5m           1513Mi
mastodon-elasticsearch-ingest-0                6m           418Mi
mastodon-elasticsearch-ingest-1                6m           396Mi
mastodon-elasticsearch-master-0                24m          466Mi
mastodon-elasticsearch-master-1                10m          221Mi
mastodon-postgresql-0                          12m          276Mi
mastodon-redis-master-0                        16m          37Mi
mastodon-redis-replicas-0                      7m           34Mi
mastodon-sidekiq-all-queues-549b4bb7b4-zvj2m   266m         499Mi
mastodon-streaming-78465f778d-6xfg2            1m           96Mi
mastodon-web-774c5c94f9-f5bhz                  22m          418Mi
```
Hence we make the following adjustments to Bitnami defaults:
- `mastodon-elasticsearch-coordinating`: 256Mi -> 512Mi
- `mastodon-elasticsearch-data`: the default 2048Mi is OK
- `mastodon-elasticsearch-master`: 256Mi -> 512Mi
- `mastodon-redis-master`: 0 -> 56Mi
- `mastodon-redis-replicas`: 0 -> 56Mi
- `mastodon-postgresql`: 256Mi -> 384Mi
And for Mastodon defaults:
- `mastodon-sidekiq-all-queues`: 0 -> 512Mi
- `mastodon-streaming`: 0 -> 128Mi
- `mastodon-web`: 0 -> 512Mi
Keeping these requests at zero is a reasonable default when the minimal requirements are unknown. However, a single-user instance gives us a baseline for those minimal requirements, and leaving the requests at zero only causes trouble for people. Of course the system requirements will change over time, but they are chiefly expected to grow.
I realize this is an opinionated change, so feel free to close it if you disagree with the rationale. I am in the process of testing this; I am putting it here already for transparency.
This has been tested as far as template generation goes. The template is generated as one would expect.
Confirmed working in a cluster as well.
Single-user instance here as well, but I'm seeing 80Mi for redis and my sidekiq usage is right around 700MiB.
I think adding these to `values.yaml` as documentation (commented out) is a good idea though.
@deepy, thanks for the added validation! Commented-out suggestions would be fine as well. However, in my opinion these would be better as default values than the zeros they are now.
Requests denote the minimum required for a pod to be scheduled on a node, so the values in this PR would be better than having no requests set at all, for any instance, even yours with slightly higher usage. Of course, you might still want to tune them up a bit in your cluster.
Limits are not set, so the pods can take as much memory as they want. The suggested requests don't affect that.
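To illustrate the distinction, here is a minimal container-level `resources` block with only a request set, which is all this PR adds; the numbers are just an example:

```yaml
# Request only: the scheduler reserves 512Mi on the node for this container,
# but with no limit set the container may still use more memory at runtime.
resources:
  requests:
    memory: 512Mi
  # limits are intentionally left unset in this PR
```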
Thanks for this, better defaults would definitely be a good idea.
FYI, here is our current usage for mastodon.online:
- mastodon-web: 1300M
- mastodon-sidekiq-default-push-ingress: 900M
- mastodon-sidekiq-pull: 1000M
- mastodon-streaming: 200M
Could you rebase your PR to the latest `main`?
Also, do you have an opinion about adding default limits as well, at least for Mastodon processes?
Something like 300M for streaming, 2000M for web, 1500M for Sidekiq? I feel like it's a good idea to have some in place to keep long-running memory leaks in check.
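If we did go that route, the suggested numbers would roughly translate to something like the sketch below (key paths assumed as before, requests kept as proposed in this PR, and `M` being decimal megabytes):

```yaml
# Sketch of the suggested default limits; not the chart's confirmed structure.
mastodon:
  web:
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 2000M
  sidekiq:
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 1500M
  streaming:
    resources:
      requests:
        memory: 128Mi
      limits:
        memory: 300M
```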
> Also, do you have an opinion about adding default limits as well, at least for Mastodon processes?
Rebased!
I would be hesitant to add limits unless they are really high. These could cause sudden problems for people. For example, my current numbers are:
```
tero@betanzos:~$ kubectl top pods -n mastodon
NAME                                           CPU(cores)   MEMORY(bytes)
mastodon-elasticsearch-coordinating-0          9m           376Mi
mastodon-elasticsearch-coordinating-1          6m           217Mi
mastodon-elasticsearch-coordinating-2          10m          185Mi
mastodon-elasticsearch-data-0                  7m           1302Mi
mastodon-elasticsearch-data-1                  12m          733Mi
mastodon-elasticsearch-data-2                  9m           1000Mi
mastodon-elasticsearch-ingest-0                6m           357Mi
mastodon-elasticsearch-ingest-1                10m          244Mi
mastodon-elasticsearch-ingest-2                15m          190Mi
mastodon-elasticsearch-master-0                12m          223Mi
mastodon-elasticsearch-master-1                61m          436Mi
mastodon-elasticsearch-master-2                16m          280Mi
mastodon-postgresql-0                          26m          1551Mi
mastodon-redis-master-0                        26m          128Mi
mastodon-redis-replicas-0                      17m          129Mi
mastodon-redis-replicas-1                      16m          135Mi
mastodon-redis-replicas-2                      14m          129Mi
mastodon-sidekiq-all-queues-7cdbd75cdd-99mp7   545m         2487Mi
mastodon-streaming-58f74f74c4-vwldv            1m           82Mi
mastodon-web-948bd9cc-xxr6h                    51m          4045Mi
```
The instance is rukii.net and has 6 users now.
Current web pod memory use is probably high because I'm running all sorts of scheduled `tootctl` scripts there now. It's also possible I have a long-running memory leak there.
We know that we have a memory leak in the `ingress` queue at the moment, at least; this is one of the reasons I am suggesting having memory limits in place by default, so that pods with abnormal memory usage are correctly restarted.
I should be able to give you more data from mastodon.social soon, I guess if we base the limits on this instance's usage then everybody else should be fine :)
Can we add the limits in a separate PR? I think they require a separate discussion, and that way they can be closed or reverted in isolation.
I just tested setting the memory limit on the web and sidekiq pods to 2048MB. It seems to work so far, but it might cause problems when running heavy `tootctl` commands like refresh on those same pods. Newly started pods take much less memory than ones that had been running for weeks: they are under 512MB now, compared to many gigabytes after weeks of uptime.
I agree that setting limits in a separate PR is worthwhile. I strongly second @renchap's opinion that limits should skew low as a counter to memory leaks within both web and sidekiq. The ingress leak was a big motivating factor in my shift from a minimal docker-compose stack to a fuller k8s deployment. I also don't think 4Gi is a reasonable memory stat for a server of 6 users, regardless of uptime. It's been rare for my ~15 user instance to top 700Mi at peak hours, even when we participated in a couple large relays as well. That points to some potential leaks IMHO, but that's neither here nor there.
To @keskival's point, I think that just goes to show there's more discussion to be had there. I also think a separate issue/PR will give scope to add HPAs to our toolbox here. With a `memoryUtilization` target in a sweet spot between `request` and `limit`, the autoscaler can spin up a fresh instance in time to create a rolling replacement with no downtime.
The sidekiq pods obviously lend themselves well to autoscaling, but I've also been running two web pods in parallel without issue. I know the admin at infosec.exchange needed to implement `sessionAffinity` because their S3 uploads were really slow and multipart uploads were getting split between web pods. I haven't run into that problem, but it appears to be a minor hurdle anyway.
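For what it's worth, a memory-based HPA along those lines could look roughly like the sketch below. The names and numbers are placeholders, it needs the metrics-server, and note that `averageUtilization` is a percentage of the pod's memory *request*, so with a limit well above the request there is headroom for a rolling replacement before anything is OOM-killed:

```yaml
# Sketch only: deployment name, replica counts, and target are placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mastodon-web
  namespace: mastodon
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mastodon-web      # placeholder: your actual web deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # percent of the memory request
```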