
MemoryWatermark tune operation interferes with all apps.

Open lmxia opened this issue 3 years ago • 5 comments

What happened:

The sandbox is not fully isolated.

What you expected to happen:

If app A consumes a large amount of memory, so that available memory drops below the Critical MemoryWatermark, the runtime tunes the smr parameter "MAX_WORKERS_PER_APP". That is the current logic. But the tuning also scales in the running app B, so a scale-in of app B is triggered by app A.

That does not sound like good isolation behavior: apps interfere with each other.

lmxia avatar Jul 14 '22 09:07 lmxia

Memory watermark changes are shipped to a central system log through Kafka. The control plane is expected to monitor the log for High memory watermarks, and route new requests away from the affected instances, so most of the time the Critical watermark will not be triggered. If for some reason memory usage continues to grow, we rely on a few best-effort defenses to keep the system running in a degraded state:

  • MAX_WORKERS_PER_APP tuning
  • The OOM killer, which terminates the worker that uses the most memory

But indeed this is a bug in performance isolation. Currently Blueboat does not have a hard per-worker memory limit, so it is possible to trigger the Critical watermark from a single worker pretty quickly, before the control plane has time to respond.

The solution would be to add a per-worker resident set size (RSS) limit.

losfair avatar Jul 14 '22 12:07 losfair

Would you be interested to work on this? :)

losfair avatar Jul 14 '22 12:07 losfair

Yes, sure, I would like to do that.

lmxia avatar Jul 15 '22 01:07 lmxia

Happy to review your PR!

losfair avatar Jul 15 '22 02:07 losfair

There are several approaches for implementing a per-process RSS limit:

  1. cgroup: Put each worker into its own memory cgroup.

Pro: The limit is accurate. Con: May not play well with a sandboxed environment (seccomp/dropped privileges)

  2. rlimit: Use RLIMIT_AS to limit the address space (VSZ) of each worker process.

Pro: Simple and plays well with OS-level sandboxing. Con: Prevents us from enabling V8 virtual memory tricks for optimizing WebAssembly memory accesses.
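To make the two options concrete, here is a minimal Python sketch of each, assuming Linux. It is not Blueboat code: the function names `place_in_memory_cgroup` and `run_with_address_space_limit` are hypothetical, and the cgroup half assumes a cgroup v2 hierarchy where the caller has write access and the memory controller is enabled for the parent group.

```python
import resource
import subprocess
import sys
from pathlib import Path


def place_in_memory_cgroup(cgroup_root: Path, name: str, pid: int,
                           limit_bytes: int) -> Path:
    """Approach 1 (cgroup v2): give one worker its own memory cgroup.

    cgroup_root would normally be a directory under /sys/fs/cgroup.
    """
    group = cgroup_root / name
    group.mkdir(exist_ok=True)
    # memory.max is the hard limit on memory charged to this group.
    (group / "memory.max").write_text(f"{limit_bytes}\n")
    # Writing a PID to cgroup.procs moves that process into the group.
    (group / "cgroup.procs").write_text(f"{pid}\n")
    return group


def run_with_address_space_limit(argv, limit_bytes: int):
    """Approach 2 (rlimit): cap the virtual address space of a child."""
    def apply_limit():
        # Runs in the child after fork() and before exec(). The limit is
        # inherited by anything the worker spawns later.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit, capture_output=True)
```

With the rlimit variant, a worker that asks for more address space than the cap gets an allocation failure instead of pushing the whole instance toward the Critical watermark. Note that the cap applies to VSZ rather than RSS, which is why it conflicts with large virtual memory reservations.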

On the API side, memory limit should be passed to the runtime as a field in Metadata.
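As a rough illustration of what that could look like, the sketch below parses an optional limit out of a JSON metadata document. Everything here is an assumption for illustration only: the field names `package` and `memory_limit_bytes`, the JSON encoding, and the `parse_metadata` helper are hypothetical and do not reflect the actual Metadata schema.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metadata:
    # Hypothetical subset of the app metadata, not the real schema.
    package: str
    # None means "no per-worker limit", which keeps old metadata valid.
    memory_limit_bytes: Optional[int] = None


def parse_metadata(raw: str) -> Metadata:
    """Parse metadata and validate the optional per-worker memory limit."""
    doc = json.loads(raw)
    limit = doc.get("memory_limit_bytes")
    if limit is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
            raise ValueError("memory_limit_bytes must be a positive integer")
    return Metadata(package=doc["package"], memory_limit_bytes=limit)
```

Making the field optional means apps deployed before the change keep working without a limit, while new deployments can opt in.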

losfair avatar Jul 15 '22 03:07 losfair