Can "worker" be scaled up & down in realtime?
Hi,
Thanks for developing and maintaining this awesome project.
I'm currently trying to make the judge handle a massive amount of submissions, so I'm trying multiple ways to scale it up.
I've checked #116, #118, #226 and #211, as well as the IEEE paper "Robust and Scalable Online Code Execution System".
This question may have already been answered; I'm sorry to ask again, but I can't confirm on my own whether what I'm trying to do is possible.
- I understood that the `count` variable inside `judge0.conf` cannot be changed dynamically without rebooting. Is that right?
- Without using docker-compose, I'm trying to run the `redis`, (API) `server`, and `workers` containers on different cloud instances, and scale the instances that the `workers` container runs on according to request load. Given that #221 adjusts the number of `workers` containers using the `--scale` option of docker-compose, it seems possible to increase the number of instances running the `worker` container without interruption. But when scaling down, can the `server` node (container) detect it and allocate jobs properly? To investigate, I used RedisInsight to watch in real time which items are stored in Redis depending on the containers' connection status. It looks like the `worker` node's information is stored in Redis and does not disappear even when the node is shut down, which is why I'm asking.
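For reference, the scale-up path mentioned above (the `--scale` option from #221) looks roughly like this on a single docker-compose host. The service name `workers` is an assumption based on judge0's docker-compose setup; the scale-down question raised above is exactly what happens to jobs in flight on the containers that get stopped.

```shell
# Scale the judge0 "workers" service to 4 replicas (service name assumed).
docker-compose up -d --scale workers=4

# Scaling back down stops the extra containers; whether their in-flight
# jobs are requeued is the open question in this issue.
docker-compose up -d --scale workers=2
```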
Sorry to bother you.
Have a good day!
As the author has not responded yet, could you please run a small test: with 2 worker instances and one API instance, kill one machine while bombarding the API with submissions, using an appropriate timeout. This way we can conclude whether there is requeue logic inside the API and Redis images, or whether it has to be added as another layer that requeues a submission if it has not received a result within x seconds, say 10 seconds. I also plan to try this, but my plate is full of tasks right now. Do inform us if you are able to test this out.
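If it turns out that requeue logic has to be added as an external layer, the timeout bookkeeping it needs could be sketched as below. This is a hypothetical sketch, not judge0 or Resque API: class and method names are invented, and the actual push/pop against the Redis queue is left out so the timing logic stands alone.

```python
import time


class RequeueWatchdog:
    """Hypothetical external requeue layer: track when each submission
    was dispatched to a worker and report any that has not received a
    result within a timeout, so the caller can requeue it."""

    def __init__(self, timeout_seconds=10.0):
        self.timeout = timeout_seconds
        self.in_flight = {}  # submission_id -> dispatch timestamp

    def dispatched(self, submission_id, now=None):
        # Record the moment a submission was handed to a worker.
        self.in_flight[submission_id] = time.time() if now is None else now

    def completed(self, submission_id):
        # A result arrived; stop tracking this submission.
        self.in_flight.pop(submission_id, None)

    def stale(self, now=None):
        # Return submissions past the timeout and reset their timers,
        # so each overdue submission is reported once per timeout window.
        now = time.time() if now is None else now
        overdue = [sid for sid, t in self.in_flight.items()
                   if now - t > self.timeout]
        for sid in overdue:
            self.in_flight[sid] = now
        return overdue
```

In the test proposed above, the layer would call `dispatched` when the API enqueues a submission, `completed` when Redis delivers a result, and periodically push everything returned by `stale` back onto the queue, which covers the case where a worker machine is killed mid-job.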