sys_reading
Lachesis: A Middleware for Customizing OS Scheduling of Stream Processing Queries
http://pages.di.unipi.it/mencagli/publications/preprint-middleware-2021.pdf
https://github.com/dmpalyvos/lachesis
https://www.youtube.com/watch?v=YPMhcfSzG6A
Here is something you may find interesting:
- Lachesis Evaluation: https://github.com/dmpalyvos/lachesis-evaluation
- Lachesis 2.0, which consists of two improvements.
- Paper: Accelerating Stream Processing Queries with Congestion-aware Scheduling and Real-time Linux Threads.
- https://dl.acm.org/doi/abs/10.1145/3587135.3592202
- Code:
- https://github.com/FauFra/lachesis-mod
- https://github.com/ParaGroup/Lachesis-RT
- Master thesis by Fausto Francesco Frasca: Evaluating Linux Kernel Mechanisms for Scheduling Streaming Queries on Multicores
- Very detailed.
- Fausto Francesco Frasca is the first author of Lachesis 2.0.
- The best way to understand Lachesis and Lachesis 2.0.
summary
key problem
workload
single or multiple streaming queries.
optimization goal
xxxxx
configurations to tune
users define a scheduling policy (essentially, a rule that assigns a priority to each operator thread) through the Lachesis API
Lachesis uses OS mechanisms (cgroups, nice) to enforce the user-defined scheduling policies.
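A minimal sketch of what the nice-based enforcement step could look like on Linux. This is illustrative only: the function name and the tid-to-nice mapping are hypothetical, and Lachesis itself is a Java middleware, not this code.

```python
import os

def enforce_policy(priorities):
    """priorities: dict mapping a Linux thread id -> nice value.

    Lower nice = higher priority; the range is -20..19, and setting a
    negative value requires privileges (CAP_SYS_NICE).
    """
    for tid, nice in priorities.items():
        # On Linux, PRIO_PROCESS with a thread id adjusts that single thread.
        os.setpriority(os.PRIO_PROCESS, tid, nice)

# Example: deprioritize the calling thread itself (tid 0 = caller).
# A real middleware would discover the operator threads' tids at runtime.
enforce_policy({0: 5})
```

Raising one's own nice value (0 to 5 here) is always permitted; lowering it back would need privileges, which is one practical constraint such a middleware has to work around.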
scenario
xxxxx
technique
xxxxx
dynamic workload?
xxxxx
multi-tenant?
xxxxx
implementation
xxxxx
Problem and motivation
what is the problem this paper is solving?
why is it important?
why is it challenging?
Main ideas and insights
describe the paper gist in 1-2 sentences
what is important to remember? What did we learn?
Solution description
explain how the solution works
example of scheduling policies: [ch5.1]
- (1) Queue Size (QS) [18] prioritizes operators with more input tuples in their queues, balancing such queues’ size and, in turn, the operators’ effective utilization, to achieve higher throughput at the Egresses and to lower latency.
- (2) Highest Rate (HR) [50] prioritizes “operator paths” (branches of a DAG ending at a sink) that are both productive (i.e., with operators having high selectivity) and inexpensive (low cost) with the goal of minimizing the average processing latency of all the tuples in the system.
- (3) First-Come-First-Serve (FCFS) [7] prioritizes those operators whose input tuples have spent more time in the system, with the goal of minimizing the maximum latency.
- (4) RANDOM gives operators uniformly random priorities.
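A toy sketch of the Queue Size (QS) policy from the list above: operators with more queued input tuples get higher priority. Mapping queue sizes linearly onto nice values (lower nice = higher priority) is just one possible realization; the function name and the concrete mapping are my own, not the paper's.

```python
def qs_priorities(queue_sizes, nice_range=(0, 19)):
    """queue_sizes: dict operator -> number of queued input tuples.

    Returns a dict operator -> nice value, where the operator with the
    largest queue gets the lowest nice value (i.e., highest priority).
    """
    lo, hi = nice_range
    max_q = max(queue_sizes.values()) or 1  # avoid division by zero
    return {
        op: hi - round((hi - lo) * q / max_q)  # bigger queue -> lower nice
        for op, q in queue_sizes.items()
    }

# The fullest queue ("map") gets nice 0; the empty one ("sink") gets nice 19.
print(qs_priorities({"map": 100, "filter": 50, "sink": 0}))
```

The other policies (HR, FCFS) would follow the same shape, just replacing queue size with path productivity/cost or oldest-tuple age as the ranking metric.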
Important results
describe the experimental setup
summarize the main results
Limitations and opportunities for improvement
when doesn't it work?
what assumptions does the paper make and when are they valid?
Closely related work
list of main competitors and how they differ
Follow-up research ideas (Optional)
If you were to base your next research project on this paper, what would you do?
Propose concrete ways to achieve one or more of the following:
Build a better (faster, more efficient, more user-friendly...) system to solve the same problem
Solve a generalization of the problem
Address one of the work's limitations
Solve the same problem in a different context
Solve the problem in a much larger scale
Apply the paper's methods to a different (but similar) problem
Solve a new problem created by this work