sledge-serverless-framework RUNTIME_SIGALRM_HANDLER_TRIAGED causes missed epoll updates

Open bushidocodes opened this issue 3 years ago • 1 comment

RUNTIME_SIGALRM_HANDLER_TRIAGED is a variant of the EDF scheduler that attempts to triage which workers might actually be preempted, based on deadline, in order to reduce spurious SIGALRMs. However, now that the SIGALRM handler is also responsible for checking the thread-local epoll for data, this optimization might result in sandboxes blocking for unexpectedly long, because a worker that is skipped by the triage does not check its epoll during that quantum.

A possible solution is to add a runtime array that tracks, per worker, the number of sandboxes in the blocked state. If a worker has a non-zero value, it should always be forwarded the SIGALRM so that it checks its epoll handler each quantum.
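
A minimal sketch of that idea, to make the forwarding rule concrete. All names here (`blocked_sandbox_count`, `worker_blocked_count_increment`, `worker_blocked_count_decrement`, `sigalrm_should_forward`, `WORKER_COUNT_MAX`) are hypothetical and not taken from the sledge sources, and the deadline comparison is only an assumed stand-in for whatever the existing triage check does:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound on worker threads; not a sledge constant. */
#define WORKER_COUNT_MAX 64

/* Per-worker count of sandboxes currently in the blocked state.
 * Static storage is zero-initialized, which is a valid state for C11 atomics. */
static atomic_uint blocked_sandbox_count[WORKER_COUNT_MAX];

/* Called by a worker when one of its sandboxes blocks on I/O. */
static inline void
worker_blocked_count_increment(int worker_idx)
{
	atomic_fetch_add(&blocked_sandbox_count[worker_idx], 1);
}

/* Called by a worker when one of its blocked sandboxes is woken up. */
static inline void
worker_blocked_count_decrement(int worker_idx)
{
	unsigned previous = atomic_fetch_sub(&blocked_sandbox_count[worker_idx], 1);
	assert(previous > 0); /* a decrement must pair with an earlier increment */
	(void)previous;       /* silence unused warning when NDEBUG is defined */
}

/* Decide whether the SIGALRM should be forwarded to a given worker.
 * worker_deadline:      deadline of the sandbox the worker is running (assumed input)
 * global_head_deadline: deadline at the head of the global queue (assumed input) */
static inline bool
sigalrm_should_forward(int worker_idx, uint64_t worker_deadline, uint64_t global_head_deadline)
{
	/* A worker with blocked sandboxes always gets the signal so that it
	 * checks its thread-local epoll every quantum. */
	if (atomic_load(&blocked_sandbox_count[worker_idx]) > 0) return true;

	/* Otherwise fall back to the assumed triage rule: forward only if the
	 * global head would preempt what the worker is currently running. */
	return global_head_deadline < worker_deadline;
}
```

With this rule, a worker whose count is non-zero receives the signal even when its running sandbox has the earlier deadline, so the handler would need to treat "check epoll" and "preempt" as separate decisions.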

May 19 '21 01:05 bushidocodes

The proposed solution sounds promising, but what if a worker has a non-zero value (hence there is a blocked sandbox at that worker) and, in the meantime, the worker has picked another sandbox from its local queue (say, because the global head's deadline was further out) and begun executing it? Should the worker really be forwarded the SIGALRM in that case? Just thinking out loud...

May 25 '21 12:05 emil916
