Extremely high memory utilization running locally on certain Linux distributions
Bug description
On Red Hat variants, containers take an unreasonably large amount of memory when running
tutor local start
The issue is especially acute with the mysql container, but is also present with the lms and cms.

Memory is consumed until the OOM killer kills the container, which is then immediately restarted.
How to reproduce
Everything is vanilla. Simply run
tutor local start
Environment
OS: 5.17.7-200.fc35.x86_64
Tutor: tutor, version 13.2.2
Additional context
This can be (and has been) resolved by adding the following to a few service definitions in the local version of docker-compose.yml:
mysql:
  blah: foo
  ...
  ulimits:
    nproc: 65535
    nofile:
      soft: 26677
      hard: 46677
If you are amenable, I can submit a PR to add this to the template with config variables. Sane limits seem good generally. I would consider adding them to mysql, mongo, elasticsearch, lms, and cms.
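Roughly what I have in mind for the template (the config variable names below are placeholders I'm inventing for illustration, not existing Tutor settings):
mysql:
  ...
  # expose the limits as Tutor config variables instead of hardcoding them
  ulimits:
    nproc: {{ MYSQL_ULIMIT_NPROC }}
    nofile:
      soft: {{ MYSQL_ULIMIT_NOFILE_SOFT }}
      hard: {{ MYSQL_ULIMIT_NOFILE_HARD }}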
After picture looks like so:

I need to investigate this further. A quick search has yielded the following results:
- https://github.com/docker-library/mysql/issues/579
- https://bugzilla.redhat.com/show_bug.cgi?id=1708115
I think it could be a bug between Docker and your distro/kernel.
Here, I use a Swarm cluster in production and I add resource limits in the docker-compose file.
Running Tutor locally, I've never had this problem.
Example:
services:
  service:
    image: nginx
    deploy:
      resources:
        limits:
          cpus: 0.50
          memory: 512M
        reservations:
          cpus: 0.25
          memory: 128M
You can do the same by customizing the deployed services (see the customizing-the-deployed-services section of the Tutor docs).
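For Tutor local, a rough sketch of the same idea in an override file (the values are only examples to adjust; this assumes the docker-compose.override.yml mechanism described in that docs section):
services:
  mysql:
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 512M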
Docker sometimes behaves strangely and is hard to debug when it does. For another customer, running Docker on an old Fedora installation (basically a thumbnailer with puppeteer and puppeteer-cluster), using Yelp's dumb-init, the whole Docker systemd process was nuked after 2-3 days, with no helpful entry in the logs. I switched to tini (thanks to the Tutor inspiration), and the container has now been up and running for more than a week without any problem!
I'm not a big fan of setting ulimits in stone for the Open edX containers. It might help us resolve this particular bug, but other issues are sure to appear later when some people need to exceed these limits. I suspect that there is an underlying bug not related to Tutor. Can you investigate this issue further @e0d? In particular, what version of Docker are you running?
Using overrides is an acceptable solution, but personally I prefer setting sane limits because the failure mode is better. If the container cannot start because it needs more memory than it is allowed to have, that is "fast failure" and the message is likely clear. When a container takes all of your free RAM such that the system becomes nearly unresponsive and the OOM killer is fighting Docker Compose, it's a bit of a mess.
I suspect that there's probably a system-wide limit on macOS and Ubuntu that prevents MySQL from using as much memory as it can. From what I can tell, this would be an issue on Red Hat variants and Arch.
I'm using Docker version 20.10.16.
If nobody else ever sees the issue, it may not be worth the effort.
Adding a +1 to this since I experienced it as well for the first time today. I fixed it by following these instructions.
Thanks for commenting @keithgg. Can you please revert your fix so that we can diagnose a little more precisely? What's your OS?
What's the output of the following commands?
$ cat /proc/$(pgrep dockerd)/limits
$ systemctl cat docker.service | grep LimitNOFILE
@regisb
Can you please revert your fix so that we can diagnose a little more precisely?
Sure, just let me know what you need. I took this screenshot before making the fix. The LMS/CMS and MySQL containers are all using as much memory as they can. FWIW, I don't think this is necessarily a Tutor issue. There's more discussion happening here.

What's your OS?
I'm running EndeavourOS (which is basically Arch). My limits have always been high, because Javascript. I didn't change any of them when trying to fix this issue. I just made the fix mentioned above.
$ cat /proc/$(pgrep dockerd)/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size unlimited unlimited bytes
Max resident set unlimited unlimited bytes
Max processes unlimited unlimited processes
Max open files 1073741816 1073741816 files
Max locked memory 8388608 8388608 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 192261 192261 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
$ systemctl cat docker.service | grep LimitNOFILE
LimitNOFILE=infinity
OK, I think I now have a better understanding of what this issue is.
For me, the fact that the issue affects multiple containers (LMS, CMS, mysql) is a confirmation that we should not hardcode ulimits for everyone.
Still, users on RedHat/Fedora/Arch are bound to face this issue. So at the very least we should add a section to the troubleshooting docs: https://docs.tutor.overhang.io/troubleshooting.html
@keithgg would you like to open a PR or should I do it?
I didn't have time to dig deeply into this, so my solution is rather ham-fisted.
In .local/share/tutor/env/local/docker-compose-override.yml I have added:
services:
  mysql:
    deploy:
      resources:
        limits:
          memory: 2G
    ulimits:
      nproc: 65535
      nofile:
        soft: 26677
        hard: 46677
  credentials:
    deploy:
      resources:
        limits:
          memory: 2G
  discovery:
    deploy:
      resources:
        limits:
          memory: 2G
  lms:
    deploy:
      resources:
        limits:
          memory: 2G
  cms:
    deploy:
      resources:
        limits:
          memory: 2G
  ecommerce:
    deploy:
      resources:
        limits:
          memory: 2G
  mongodb:
    deploy:
      resources:
        limits:
          memory: 2G
  redis:
    deploy:
      resources:
        limits:
          memory: 2G
  elasticsearch:
    deploy:
      resources:
        limits:
          memory: 2G
  lms-worker:
    deploy:
      resources:
        limits:
          memory: 2G
  cms-worker:
    deploy:
      resources:
        limits:
          memory: 2G
  ecommerce-worker:
    deploy:
      resources:
        limits:
          memory: 2G
@keithgg would you like to open a PR or should I do it?
@regisb I'll leave this one to you :slightly_smiling_face: . Just a note that in my daemon.json the limits are 256000 instead of the 64000 in the link. When they were lower I got golang errors of the form panic: runtime error: index out of range [159] with length 145 when building images for dev.
{
  "default-ulimits": {
    "nofile": {
      "Hard": 256000,
      "Name": "nofile",
      "Soft": 256000
    }
  }
}
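After editing daemon.json the Docker daemon needs to be restarted for the new defaults to apply. A quick way to check that containers pick them up (this assumes systemd and the alpine image; ulimit -n prints the soft nofile limit, so it should report 256000 here):
$ sudo systemctl restart docker
$ docker run --rm alpine sh -c "ulimit -n"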