Production Gorse - hardware requirements
Hi,
I haven't found any information about this, so will ask here. What are the expectations for the hardware that each type of node will run on? To be more specific - who does the "heavy" lifting in Gorse? Seems like the worker node does all the number-crunching.
So for the sake of the example:
- what would be reasonable hardware to run a single worker node (or multiple) for an application with 10k daily active users that will generate 10 to 20 feedback entries each, and request 100-200 recommendations each?
What should be the main focus? CPU, number of cores, amount of RAM, etc?
Thanks!
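For reference, the load in the example above works out as follows. This is only a back-of-the-envelope sketch: it assumes traffic is spread evenly over 24 hours (real peaks will be higher), and the helper name `daily_and_per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_and_per_second(users, per_user_low, per_user_high):
    """Return ((daily_low, daily_high), (per_sec_low, per_sec_high)).

    Assumption: events are spread uniformly over the day.
    """
    low = users * per_user_low
    high = users * per_user_high
    return (low, high), (low / SECONDS_PER_DAY, high / SECONDS_PER_DAY)


# 10k daily active users, 10 to 20 feedback entries each
feedback_daily, feedback_rate = daily_and_per_second(10_000, 10, 20)
# 10k daily active users, 100 to 200 recommendation requests each
recs_daily, recs_rate = daily_and_per_second(10_000, 100, 200)

print(f"feedback/day: {feedback_daily[0]:,} to {feedback_daily[1]:,}")
print(f"feedback/s (avg): {feedback_rate[0]:.1f} to {feedback_rate[1]:.1f}")
print(f"recommendation requests/day: {recs_daily[0]:,} to {recs_daily[1]:,}")
print(f"recommendation requests/s (avg): {recs_rate[0]:.1f} to {recs_rate[1]:.1f}")
```

So roughly 100k to 200k feedback entries and 1M to 2M recommendation requests per day, which is on the order of a couple of writes and a few tens of reads per second on average.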
Sadly, we haven't tested Gorse with different datasets. We plan to run benchmarks in the future.
Any ballpark numbers you can share based on running GitRec?
Hi @Drabuna, I can give you some hints about resource usage from my experience. :smile: Note that my setup has 150k items, 1.5M users, and 10M feedbacks.
In my case, the master node eats the most resources because it has to do the neighbor search (matching up items and users by labels). Gorse wasn't ready for a dataset that large, and RAM usage periodically peaked at up to 120GB from a ~30GB baseline (until Gorse died because it had no RAM left). AFAIK this problem is being worked on. The master used ~4 CPU.
I used 8 workers and each used ~5GB RAM and 1 CPU. I did not stress test the API. Just inserting feedback took very little resources. Something like 0.5 CPU and 0.5GB RAM for each of 4 instances.
Redis used ~10GB RAM and 1 CPU.
To summarize, the whole Gorse deployment with this dataset takes about 100GB RAM and 15 CPU. Again, these numbers should only serve as a hint for deciding how many resources you should have - you have to test it on your own dataset to get exact numbers.
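The per-component numbers above can be added up like this. It is just a sketch of the arithmetic: the figures are the approximate ones quoted in this post, the RAM unit is taken to be GB, and the label for the 4 feedback-inserting instances is a placeholder since their node type isn't named here.

```python
# (name, instances, cpu_per_instance, ram_gb_baseline, ram_gb_peak)
# All values are the rough figures quoted above, not measurements from code.
components = [
    ("master", 1, 4.0, 30.0, 120.0),              # RAM spiked from ~30GB to 120GB
    ("worker", 8, 1.0, 5.0, 5.0),
    ("feedback-inserting instance", 4, 0.5, 0.5, 0.5),  # placeholder label
    ("redis", 1, 1.0, 10.0, 10.0),
]

total_cpu = sum(n * cpu for _, n, cpu, _, _ in components)
total_ram_baseline = sum(n * base for _, n, _, base, _ in components)
total_ram_peak = sum(n * peak for _, n, _, _, peak in components)

print(f"CPU: {total_cpu:g}")
print(f"RAM baseline: {total_ram_baseline:g} GB")
print(f"RAM at master peak: {total_ram_peak:g} GB")
```

The CPU total matches the 15 CPU in the summary, and the summary's RAM figure of about 100 sits between the baseline total and the total at the master's peak.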
Hey @PetrBezousek, thanks a lot, that gives me a pretty good idea of how many resources I might need!