morphodict
Memory and performance issues
This is really an organization-wide issue, but for now I'm keeping it in morphodict.
I've just checked the RAM consumption of our machines, and it is roughly as follows:
- speech-db: 4 GB
- itwewina.app: 11.9 GB
- non-itwewina:
  - sssttt.altlab.dev: 4.9 GB (backend+frontend)
  - refactorings: 3.6 GB
  - semantic-explorer: 1.2 GB
  - korp portions: 0.2 GB
This doesn't count itwewina.dev. When we spin up a parallel itwewina.dev instance with a full dictionary, we run close to the memory limits of our server, which could eventually lead to unexpected service drops. The itwewina.dev service has already stopped processing requests on its own, likely because of memory constraints, so I've decided to take it down at least until the production itwewina can be rebooted (at a time less likely to disrupt others), perhaps even until after CILLDI ends. Most importantly, we need to address the memory issue urgently. Some suggestions (the first is the most straightforward):
- [ ] Request a machine with more memory for the server from Compute Canada (say, 48GB at minimum, ideally 64GB of RAM or more).
- [ ] Limit the maximum memory per Docker container and the memory available to uWSGI (the Python server), perhaps even looking at alternatives to uWSGI, and ensure that hitting these limits triggers container restarts that keep services going, instead of services randomly ceasing to process requests. Note that memory restrictions may not be feasible if all the memory is required for computation rather than caching.
- [ ] Optimize the apps' use of memory (based on profiling, etc.), and document where most of the memory is currently going.
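As a starting point for the profiling suggestion, here is a minimal sketch using Python's standard `tracemalloc` module to list the largest allocation sites for a given workload. The `load_fake_dictionary` function is a hypothetical stand-in, not morphodict code; in practice one would pass in whatever loads the dictionary or handles a request. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native extensions would need a different tool.

```python
import tracemalloc


def top_allocations(workload, limit=5):
    """Run workload() and return the largest allocation sites as (location, KiB) pairs."""
    tracemalloc.start()
    try:
        # Keep a reference to the result so its memory is still live
        # when the snapshot is taken.
        result = workload()
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    report = []
    # statistics("lineno") groups by source line, largest first.
    for stat in snapshot.statistics("lineno")[:limit]:
        frame = stat.traceback[0]
        report.append((f"{frame.filename}:{frame.lineno}", stat.size / 1024))
    return report


def load_fake_dictionary():
    # Hypothetical stand-in for loading dictionary data into memory.
    return {f"word{i}": f"definition {i}" for i in range(50_000)}


if __name__ == "__main__":
    for location, kib in top_allocations(load_fake_dictionary):
        print(f"{kib:10.1f} KiB  {location}")
```

Running this against the real startup path of each app would give a first list of where memory is going, which could then be written up per service.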
In parallel, it seems that generating sound recordings for paradigms can impose a considerable load on speech-db's CPU (per-request bursts, which are not an issue currently but would become a problem if many people were using them). We should consider separating the service that provides recordings for dictionary purposes from the validation and recording services provided by speech-db, to avoid the former taking down the latter.
What we can receive from the Digital Research Alliance of Canada falls under the persistent option; cf. https://docs.alliancecan.ca/wiki/Cloud_RAS_Allocations
The available resources have historically increased, though slowly. In any case, we should have up to 50GB of RAM available to us.
I've reactivated https://itwewina.altlab.dev/, with the latest dictionary and with the updates I worked on so far.
Restarting VMs as part of upgrading seems to considerably reduce memory consumption. Better limits should be introduced in uWSGI.
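To make the uWSGI limits concrete, here is a rough sketch that derives per-worker settings from a container memory budget and renders them as a uWSGI ini fragment. The option names `processes`, `reload-on-rss` (in MB) and `max-requests` are existing uWSGI options; the 4096 MB budget, 4 workers, 20% headroom and 5000-request recycle threshold are illustrative assumptions, not measured values for our services.

```python
def uwsgi_memory_options(container_limit_mb, workers, headroom=0.2, max_requests=5000):
    """Derive per-worker uWSGI memory options from a container memory budget.

    headroom is the fraction of the budget left unassigned (assumed value),
    e.g. for the uWSGI master process and other overhead in the container.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_mb = container_limit_mb * (1 - headroom)
    per_worker_mb = int(usable_mb // workers)
    return {
        "processes": workers,
        # Recycle a worker once its resident memory exceeds this many MB.
        "reload-on-rss": per_worker_mb,
        # Also recycle workers periodically, which bounds slow leaks.
        "max-requests": max_requests,
    }


def render_ini(options):
    """Render the options as a [uwsgi] ini section."""
    lines = ["[uwsgi]"] + [f"{key} = {value}" for key, value in options.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_ini(uwsgi_memory_options(container_limit_mb=4096, workers=4)))
```

The container budget passed in here would presumably match whatever hard limit is set on the Docker side, so that uWSGI recycles workers before the container itself runs out of memory.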