c3c
Idea: Concurrency manager
Inspired by the article: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/
But if you have threads (green- or OS-level), you don’t need to do that. You can just suspend the entire thread and hop straight back to the OS or event loop without having to return from all of those functions.
Go is the language that does this most beautifully in my opinion. As soon as you do any IO operation, it just parks that goroutine and resumes any other ones that aren’t blocked on IO.
Inspired by scoped allocators:
DynamicArenaAllocator dynamic_arena;
dynamic_arena.init(1024);
mem::@scoped(&dynamic_arena)
{
    // This allocation uses the dynamic arena
    Foo* f = malloc(Foo);
};
// Release any dynamic arena memory.
dynamic_arena.destroy();
An opt-in scope with a job scheduler that can spin up and pause a "thread-like" on IO or other interrupts (?) would be really interesting.
- Similar to an allocator like the DynamicArenaAllocator, the job scheduler's behaviour could be configurable to suit an application**, and it would only exist within a scope.
- e.g. how aggressively it pauses jobs would trade latency against throughput; how many "thread-like"s to create.
Some pseudocode of network-based concurrency:
concurrency::AsyncIO asyncio;
concurrency::SchedulerWorkStealing work_stealing;
asyncio.init(thread_type: os::GreenThread, timeout: 500, clock_cycle_time: 22, scheduler: work_stealing, num_threads: 4 * NUM_CPU);
concurrency::@scoped(&asyncio)
{
    // Network requests here
};
asyncio.destroy();
Some pseudocode of long-running-job concurrency. The same abstraction should also support workloads of a very different nature, e.g. high-performance computing job scheduling:
concurrency::JobQueue job_queue;
concurrency::SchedulerFIFO fifo_scheduler;
job_queue.init(thread_type: os::Thread, timeout: 100_000, scheduler: fifo_scheduler, num_threads: 10_000);
concurrency::@scoped(&job_queue)
{
    // HPC compute tasks here
};
job_queue.destroy();
variables **
- latency
- throughput
- cross-job synchronisation
- number of jobs
- job length
- job scheduling

Platform constraints
- platform memory amount
- platform memory bandwidth
- platform memory access times if NUMA
- platform CPU/GPU
- platform network
Task constraints
- CPU time limit
- Memory usage limits
- Time before timeout
- Access to a particular subsystem, e.g. the network, may be restricted completely.
- Access to a particular subsystem may need to wait for other tasks to complete.
- Cross-task communication
Cross-task communication
- Barriers force tasks to sync and then exchange information, e.g. when a task needs a dependency before it can proceed.
- Accumulate work from faster tasks into an in-memory or disk-based buffer to be processed by the slower task when ready, e.g. emails to be sent get queued into a list by an API.
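The "accumulate work for a slower task" pattern maps directly onto a buffered queue. A minimal sketch with a fast producer (the API handler) and a slow email-sender-like consumer; the names and buffer size are illustrative:

```go
package main

import "fmt"

// processQueue is the slow consumer: it drains the buffer at its own
// pace and reports how many messages it handled.
func processQueue(queue chan string) int {
	sent := 0
	for msg := range queue {
		_ = msg // the actual email send would happen here
		sent++
	}
	return sent
}

func main() {
	// The fast task queues emails without waiting for them to be
	// sent; the buffer absorbs the rate mismatch between the two.
	queue := make(chan string, 100)
	for i := 1; i <= 5; i++ {
		queue <- fmt.Sprintf("email %d", i)
	}
	close(queue)
	fmt.Println("sent:", processQueue(queue)) // sent: 5
}
```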
Task groups (just a kind of task)
- Groups of tasks where one task depends on another, e.g. fetching details from a database to use in an API request. Expression blocks or functions might be a nice way to express this.
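A dependency between tasks ("fetch from the database, then use it in the API request") can be expressed as one task awaiting another's result. A hedged sketch — `fetchUser` and `callAPI` are stand-ins, not real IO:

```go
package main

import "fmt"

// fetchUser stands in for the database call.
func fetchUser(id int) string { return fmt.Sprintf("user-%d", id) }

// callAPI stands in for the dependent API request.
func callAPI(user string) string { return "details for " + user }

func main() {
	// The task group: the second task cannot start until the
	// first delivers its result over the channel.
	userCh := make(chan string)
	go func() { userCh <- fetchUser(42) }()

	resultCh := make(chan string)
	go func() { resultCh <- callAPI(<-userCh) }()

	fmt.Println(<-resultCh) // details for user-42
}
```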
Task priority
- Some tasks are critical within a deadline, others are optional within a deadline; e.g. cancel the optional ones to get the critical ones done in time.
- Cancellable and non-cancellable tasks: some tasks should not be cancelled while others are OK to cancel — based on a message sent to the task, or decided centrally?
- Policy on cancel: restart or do not restart, postpone for X time, exponential backoff, etc.
- Interrupts
Fault tolerance - how to handle failed tasks
- Each task must have Excuse handling code.
- Common strategies: report the Excuse, repeat the task, fail the task, reach consensus in distributed systems.
Scheduler types
- Work stealing (levelling out the workload across threads)
- Realtime deterministic scheduler (deadline focused)
- FIFO
- LIFO
- Distributed systems
- NUMA systems
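A toy illustration of the work-stealing idea from the list above: every worker drains its own queue first, then steals from the others to level out the load. This is deliberately simplified — real work-stealing schedulers use per-worker deques with lock-free steals, not shared channels:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runStealing gives every worker its own queue but piles all the tasks
// onto worker 0, so the other workers must steal to level the load.
func runStealing(workers, tasks int) int64 {
	queues := make([]chan int, workers)
	for i := range queues {
		queues[i] = make(chan int, tasks)
	}
	for t := 0; t < tasks; t++ {
		queues[0] <- t // worker 0 is overloaded on purpose
	}
	for i := range queues {
		close(queues[i])
	}

	var done int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Drain our own queue first, then steal from the others.
			for victim := 0; victim < workers; victim++ {
				for range queues[(w+victim)%workers] {
					atomic.AddInt64(&done, 1) // "execute" the task
				}
			}
		}(w)
	}
	wg.Wait()
	return done
}

func main() {
	fmt.Println("tasks done:", runStealing(3, 30)) // tasks done: 30
}
```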
So this abstraction should work across the whole range of use cases: from an embedded system, to a video game, a webserver, a NUMA system, and on to a distributed system.