[RFC] Configure the number of vCPUs
The problem
Since NGO has shifted to an architecture that schedules threads by itself, the implementation inherently needs to determine the number of CPU cores on which these threads can be scheduled.
So there are a few design decisions regarding CPU cores that we need to make:
- How can the user specify the number of CPU cores?
- If the user does not give the number explicitly, what should be the default behavior?
Possible solutions
A bad answer: Occlum.json
Adding some config entries to Occlum.json is a natural choice. But the number of CPU cores that are available, and the number to use, are most likely determined at runtime rather than at enclave build time. Letting the untrusted runtime give the exact number of CPU cores to use also seems harmless. So we should not "hardcode" the number of CPU cores in Occlum.json.
Nevertheless, I think it makes sense to add two optional config entries, min_cpu_cores and max_cpu_cores, so that we can put some constraints on the untrusted, runtime-given CPU core count.
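The constraint check described above could look roughly like the sketch below. It is only an illustration: the CpuLimits type and the check_cpu_count function are hypothetical names, not existing NGO or Occlum code, and rejecting an out-of-range value (rather than silently clamping it) is an assumption about the desired behavior.

```rust
/// Hypothetical limits fixed at enclave build time, mirroring the optional
/// min/max config entries discussed above.
#[derive(Debug, Clone, Copy)]
pub struct CpuLimits {
    pub min: u32,
    pub max: u32,
}

/// Accepts the untrusted, runtime-given core count only if it lies within
/// the build-time limits; otherwise returns an error message.
/// Assumption: out-of-range input is rejected, not clamped.
pub fn check_cpu_count(requested: u32, limits: CpuLimits) -> Result<u32, String> {
    // The limits themselves must be sane before they can constrain anything.
    if limits.min == 0 || limits.min > limits.max {
        return Err(format!(
            "invalid limits: min={}, max={}",
            limits.min, limits.max
        ));
    }
    if requested < limits.min || requested > limits.max {
        return Err(format!(
            "requested {} CPU cores, allowed range is {}..={}",
            requested, limits.min, limits.max
        ));
    }
    Ok(requested)
}
```

Rejecting keeps the untrusted side from learning that a bad value was quietly adjusted; clamping would be the friendlier alternative if that is not a concern.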
Reference answers
Container runtimes like Docker and Kata face a similar problem. Let's see how they solve the problem.
Docker
By default, Docker gives a container all the CPU cores available in its cgroup. This is a rational choice, as processes inside a Docker container are no different from those outside from the perspective of scheduling.
Docker does allow users to decide the number of CPU cores. See the --cpus and --cpuset-cpus options of the docker run command.
Kata
Implementation-wise, Kata is more like a VM. As such, the number of vCPUs of a VM must be specified. There is a global config file where the default number of vCPUs can be specified. Also, like Docker, the number of vCPUs can be specified per container using command-line options.
The proposed solution
Add two config entries to Occlum.json:
{
  "resource_limits": {
    "min_num_of_cpus": 1,
    "max_num_of_cpus": 128
  }
}
And remove the max_num_of_threads config entry.
Also add a command-line option, --cpus, so that the user can run
occlum start --cpus 4
or
occlum run --cpus 4
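On the untrusted side, handling the --cpus option might be sketched as follows. This is not the actual occlum CLI code: parse_cpus_option, default_cpus, and requested_cpus are hypothetical helpers, and falling back to all cores reported by the host when --cpus is absent is an assumption borrowed from Docker's default, since the proposal above does not settle the default behavior.

```rust
use std::thread;

/// Extracts the value of `--cpus` from an argument list (hypothetical helper).
/// Returns Ok(None) when the option is absent and Err on a malformed value.
pub fn parse_cpus_option(args: &[&str]) -> Result<Option<u32>, String> {
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if *arg == "--cpus" {
            let value = iter.next().ok_or("--cpus requires a value")?;
            let n: u32 = value
                .parse()
                .map_err(|_| format!("invalid --cpus value: {}", value))?;
            if n == 0 {
                return Err("--cpus must be at least 1".to_string());
            }
            return Ok(Some(n));
        }
    }
    Ok(None)
}

/// Assumed default: every core the host reports, or 1 if that query fails.
pub fn default_cpus() -> u32 {
    thread::available_parallelism()
        .map(|n| n.get() as u32)
        .unwrap_or(1)
}

/// The core count the untrusted runtime would hand to the enclave, which
/// should still be checked there against the limits in Occlum.json.
pub fn requested_cpus(args: &[&str]) -> Result<u32, String> {
    Ok(parse_cpus_option(args)?.unwrap_or_else(default_cpus))
}
```

For example, requested_cpus(&["occlum", "run", "--cpus", "4"]) yields Ok(4), while omitting the option yields whatever the host reports.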