autoscaling
Move QMP handling to neonvm-runner
Problem description / Motivation
As discussed here.
There are a few reasons for this:
- neonvm-controller sleeps during reconcile while waiting for QEMU; we'd like to avoid sleeps there
- Subscribing to QEMU events would be hard (but not impossible) in neonvm-controller because of its execution & data model, but we can make it easy in neonvm-runner. See also: #327
- Exposing the QMP port cluster-wide is a potential security hole. See also: #414
Feature idea(s) / DoD
QMP is inaccessible outside the runner pod; neonvm-runner is exclusively responsible for making the CPU/memory changes that the controller requests (+ starting migration?).
Implementation ideas
Some care is needed w.r.t. #738, if we end up merging that PR. But the general idea is for neonvm-runner to expose an HTTP server that handles:
- Returning current CPU/memory
- Changing CPU/memory to desired values
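To make the two operations above concrete, here is a minimal sketch of what such a server could look like. Everything named here is an assumption for illustration only: the `/resources` path, the `Resources` JSON shape, the `Scaler` interface, and the in-memory `fakeScaler` that stands in for the code that would actually talk QMP to QEMU. It is not the existing runner API.

```go
package main

import (
	"encoding/json"
	"errors"
	"net/http"
	"sync"
)

// Resources is a hypothetical wire format; real field names and units
// would follow the VM spec.
type Resources struct {
	CPUs      uint32 `json:"cpus"`
	MemoryMiB uint64 `json:"memoryMiB"`
}

// Scaler abstracts whatever talks QMP to QEMU inside the runner pod.
// (Hypothetical interface, not an existing neonvm type.)
type Scaler interface {
	Current() (Resources, error)
	Scale(desired Resources) error
}

// fakeScaler is an in-memory stand-in so the sketch works without QEMU.
type fakeScaler struct {
	mu  sync.Mutex
	cur Resources
}

func (f *fakeScaler) Current() (Resources, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.cur, nil
}

func (f *fakeScaler) Scale(desired Resources) error {
	if desired.CPUs == 0 || desired.MemoryMiB == 0 {
		return errors.New("cpus and memoryMiB must be non-zero")
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	f.cur = desired
	return nil
}

// newHandler serves both operations on one hypothetical path:
// GET returns the current CPU/memory, PUT sets the desired values.
func newHandler(s Scaler) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/resources", func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodGet:
			cur, err := s.Current()
			if err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
				return
			}
			w.Header().Set("Content-Type", "application/json")
			_ = json.NewEncoder(w).Encode(cur)
		case http.MethodPut:
			var desired Resources
			if err := json.NewDecoder(r.Body).Decode(&desired); err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			// Simplification: every Scale failure maps to 422 here; a real
			// runner would distinguish bad input from QEMU-side errors.
			if err := s.Scale(desired); err != nil {
				http.Error(w, err.Error(), http.StatusUnprocessableEntity)
				return
			}
			w.WriteHeader(http.StatusNoContent)
		default:
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		}
	})
	return mux
}
```

In a runner, this would be wired up with something like `http.ListenAndServe(addr, newHandler(scaler))`, where `addr` and the concrete `scaler` are left open here. Keeping QMP behind an interface like `Scaler` means the controller only ever sees the HTTP surface, which matches the goal of QMP being inaccessible outside the runner pod.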
Adding this server requires bumping the "runner version".
We'd probably also end up getting rid of the QMP port from the VM spec. Special care needs to be taken to phase that out gradually (+ make sure cplane doesn't set it for new VMs).