Warm services watchdog

Open folex opened this issue 2 years ago • 0 comments

Motivation

The number of services a node can host can be scaled by dynamically unloading (pausing) services according to some criteria, and then loading them again (waking them up) on demand.

Implementation

Pausing is basically removing the running service while keeping its on-disk state.

Unpausing is starting the service the same way a new service is started.

  • Paused service queue: a paused service must retain the calls sent to it until it is woken up
  • get_interface should work on a paused service, but must not wake it up
  • Get service state: paused / awake / nonexistent
  • Some services must be marked as unpausable (e.g., builtins, services that are used by scheduled scripts?)
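
The requirements above could be modelled roughly as follows. This is only a minimal in-memory sketch, not nox code: `Watchdog`, `Entry`, `Call`, `Dispatch`, `WatchdogError` and all method names are hypothetical, and it assumes the service interface is cached at registration time so that it can be returned without a running instance.

```rust
use std::collections::{HashMap, VecDeque};

/// Result of the "get service state" query.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceState { Paused, Awake, NonExistent }

/// A call addressed to a service (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Call { pub function: String, pub args: Vec<String> }

/// What the caller should do with a submitted call.
#[derive(Debug, PartialEq, Eq)]
pub enum Dispatch { Deliver(Call), Queued }

#[derive(Debug, PartialEq, Eq)]
pub enum WatchdogError { NoSuchService, Unpausable }

struct Entry {
    interface: String,        // cached, so reading it needs no running instance
    unpausable: bool,         // e.g. builtins
    paused: bool,
    pending: VecDeque<Call>,  // calls received while paused
}

#[derive(Default)]
pub struct Watchdog { services: HashMap<String, Entry> }

impl Watchdog {
    pub fn new() -> Self { Self::default() }

    pub fn register(&mut self, id: &str, interface: &str, unpausable: bool) {
        let entry = Entry {
            interface: interface.to_string(),
            unpausable,
            paused: false,
            pending: VecDeque::new(),
        };
        self.services.insert(id.to_string(), entry);
    }

    pub fn state(&self, id: &str) -> ServiceState {
        match self.services.get(id) {
            None => ServiceState::NonExistent,
            Some(e) if e.paused => ServiceState::Paused,
            Some(_) => ServiceState::Awake,
        }
    }

    /// Takes `&self`, so it cannot change the paused flag or the queue.
    pub fn get_interface(&self, id: &str) -> Option<&str> {
        self.services.get(id).map(|e| e.interface.as_str())
    }

    pub fn pause(&mut self, id: &str) -> Result<(), WatchdogError> {
        let e = self.services.get_mut(id).ok_or(WatchdogError::NoSuchService)?;
        if e.unpausable {
            return Err(WatchdogError::Unpausable);
        }
        e.paused = true;
        Ok(())
    }

    /// Queues the call if the service is paused, otherwise hands it back for delivery.
    pub fn submit(&mut self, id: &str, call: Call) -> Result<Dispatch, WatchdogError> {
        let e = self.services.get_mut(id).ok_or(WatchdogError::NoSuchService)?;
        if e.paused {
            e.pending.push_back(call);
            Ok(Dispatch::Queued)
        } else {
            Ok(Dispatch::Deliver(call))
        }
    }

    /// Marks the service awake and returns the retained calls in arrival order.
    pub fn wake(&mut self, id: &str) -> Result<Vec<Call>, WatchdogError> {
        let e = self.services.get_mut(id).ok_or(WatchdogError::NoSuchService)?;
        e.paused = false;
        Ok(e.pending.drain(..).collect())
    }
}
```

In this sketch waking is an explicit step, so the policy for when to wake a service (on the first queued call, on a timer, and so on) stays outside the data structure.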

Notes

  • Services must already be able to survive a node reboot, so they must also be ready to be paused and unpaused.

TODOs

  • Explain in documentation that services should be ready to be restarted
  • We need to measure how long it takes to unpause/start a service to understand whether this approach is usable
    • Most likely we'll need to cache WASM code compiled by Cranelift
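
For the measurement TODO, a small timing helper may be enough to get first numbers. The sketch below uses only the Rust standard library; `timed`, `summarize` and `Stats` are hypothetical names, and the operation being timed (for example, a closure that starts a service) is left to the caller.

```rust
use std::time::{Duration, Instant};

/// Runs `op` once and returns its result together with the elapsed wall-clock time.
pub fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = op();
    (out, start.elapsed())
}

/// Summary of repeated measurements.
#[derive(Debug, PartialEq, Eq)]
pub struct Stats {
    pub min: Duration,
    pub max: Duration,
    pub mean: Duration,
}

/// Returns `None` for an empty sample set.
pub fn summarize(samples: &[Duration]) -> Option<Stats> {
    let min = *samples.iter().min()?;
    let max = *samples.iter().max()?;
    let total: Duration = samples.iter().sum();
    Some(Stats { min, max, mean: total / samples.len() as u32 })
}
```

Comparing the summary for a cold start against one where compiled WASM is reused would show whether caching is actually needed.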

folex · Mar 22 '22 12:03