
Trying to understand Libdill

Open GuacheSuede opened this issue 5 years ago • 12 comments

Does libdill pause a function after a certain amount of time has passed and come back to it later (e.g. functions with infinite loops)?

GuacheSuede avatar Jan 08 '19 10:01 GuacheSuede

looking at internals

GuacheSuede avatar Jan 08 '19 12:01 GuacheSuede

The explanation can be found at http://libdill.org/structured-concurrency.html

Coroutines are scheduled cooperatively. What that means is that a coroutine has to explicitly yield control of the CPU to allow a different coroutine to run. In a typical scenario, this is done transparently to the user: When a coroutine invokes a function that would block (such as msleep or chrecv), the CPU is automatically yielded. However, if a coroutine runs without calling any blocking functions, it may hold the CPU forever. For these cases, the yield function can be used to manually relinquish the CPU to other coroutines.
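To make the quoted passage concrete, here is a minimal sketch of cooperative scheduling in plain C. It does not use libdill at all: it assumes the POSIX ucontext functions are available, and the names demo_yield, worker and run_demo are made up for illustration. Each worker gives up the CPU only at the point where it calls demo_yield itself.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;        /* the "scheduler" context */
static ucontext_t co_ctx[2];       /* one context per worker */
static char stacks[2][STACK_SIZE]; /* one private stack per worker */
static char trace[16];             /* records which worker ran, in order */
static int trace_len = 0;
static int current = -1;           /* index of the worker now running */

/* Hypothetical helper (not libdill's yield): hand the CPU back explicitly. */
static void demo_yield(void) {
    swapcontext(&co_ctx[current], &main_ctx);
}

/* A worker runs until it decides to yield; nothing can interrupt it. */
static void worker(int id) {
    for (int i = 0; i < 3; i++) {
        assert(trace_len < (int)sizeof trace - 1);
        trace[trace_len++] = (char)('A' + id);
        demo_yield();
    }
}

/* Runs two workers round-robin and returns the order they ran in. */
const char *run_demo(void) {
    trace_len = 0;
    for (int i = 0; i < 2; i++) {
        getcontext(&co_ctx[i]);
        co_ctx[i].uc_stack.ss_sp = stacks[i];
        co_ctx[i].uc_stack.ss_size = STACK_SIZE;
        co_ctx[i].uc_link = &main_ctx; /* where to go when worker returns */
        makecontext(&co_ctx[i], (void (*)(void))worker, 1, i);
    }
    /* Three rounds for the three yields, one more to let each worker end. */
    for (int round = 0; round < 4; round++) {
        for (int i = 0; i < 2; i++) {
            current = i;
            swapcontext(&main_ctx, &co_ctx[i]);
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Calling run_demo() returns the string "ABABAB": the workers alternate only because each one calls demo_yield after every step. A worker that never called it would keep the CPU until its loop ended, and with an infinite loop that means forever.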

The documentation and tutorials are well written, I suggest you read through them.

rokf avatar Jan 08 '19 17:01 rokf

@rokf Yup, I read that. I'm looking more at the code side of it: would you happen to know which part of the code makes libdill automatically yield in blocking functions?

GuacheSuede avatar Jan 08 '19 20:01 GuacheSuede

Not exactly, I did not dive in that deep yet. However, after quickly skimming over the code it looks like that functionality is implemented inside the cr.c and cr.h files. Waiting, resuming and checking for blocks is all there.

rokf avatar Jan 08 '19 21:01 rokf

Look here for example: https://github.com/sustrik/libdill/blob/master/libdill.c#L39

It's the implementation of msleep. It registers a timer and calls dill_wait() to wait until it expires. In the meantime other coroutines are given a chance to run.

sustrik avatar Jan 09 '19 04:01 sustrik

Is it not possible to do it implicitly without dill_wait() ?

GuacheSuede avatar Jan 12 '19 17:01 GuacheSuede

What do you mean?

sustrik avatar Jan 12 '19 20:01 sustrik

Would it be possible to, say, run multiple blocking functions as coroutines and still have them yield without actually writing dill_wait()? E.g., yield automatically based on a timer, every 1 min.

GuacheSuede avatar Jan 13 '19 05:01 GuacheSuede

Nope. It's cooperative multitasking. If a coroutine doesn't want to yield, there's no way to force it to do so.

sustrik avatar Jan 13 '19 09:01 sustrik

I see. I come from C++, where most libraries are provided in the std without the need to code your own, unlike C. Forgive my ignorance, but would it be theoretically possible to code preemptive scheduling in C at the user-thread level, not pthreads? A Google search does not bring up many results.

GuacheSuede avatar Jan 13 '19 12:01 GuacheSuede

Generally, the OS provides that. Look at the pthreads library.

sustrik avatar Jan 13 '19 13:01 sustrik

@GuacheSuede It is almost theoretically possible, but it will be hard and implementation-specific: you'll have to roll your own code to handle the details, and you must make absolutely sure you know what you are doing before you try. It will probably require code specific to each CPU architecture, ABI and OS, and it might be fragile to changes in each OS and in your compiler optimization settings.

Before you proceed trying to actually do it, I strongly recommend you reconsider, because:

  1. humans have to understand and verify code (for now anyway), and context switches being limited to a few explicit points in the code are far easier on human reasoning. (For a more in-depth exploration of this, read the relatively famous "unyielding" post.)

  2. given how low-level C is (there isn't any "magic" in between statements, so what you write is what you get, and you can't just inject context switches everywhere without basically implementing your own compiler), all preemption has to be initiated "from the outside" - by the CPU or the OS - so you'll lose the main benefit of user-space threads: avoiding the overhead of context switches bouncing between kernel-space and user-space.

Also, libdill can't help you do it, and if you want to combine the two you'll need to double-check libdill's internals really carefully, because it might do something internally that you'll have to account for.

Now. That said.

So if you want to implement preemptive scheduling in user-space, you have to tell the system to interrupt your code somehow, and give you the opportunity to save and change your program state internally, so that when it resumes normal execution, it resumes in a different spot.

The only readily available interface for this is signals, and since you want to be preempted arbitrarily, you probably want the alarm system call. You could also run a separate OS thread or process which does not participate in running your user-space threads and only handles "meta" tasks, like hitting the worker thread(s) running your user-space threads with signals. But note that the moment you have multiple threads, everything becomes harder and less portable: there is no portable interface for making sure the signal is delivered to a specific thread. On Linux you would use the tkill and tgkill system calls, which are not exposed in the C library, so you'd have to invoke them "raw".

So that gets you as far as the spontaneous preemption at any arbitrary spot on the code, which is the easiest part of this whole thing.

So you install a signal handler for SIGALRM (or whichever other signal you are using), using the sigaction call with the SA_SIGINFO flag and the three-argument sa_sigaction parameter instead of the one-argument sa_handler parameter. The third, barely-documented argument to the signal handler contains the CPU-specific and OS-specific information that the kernel uses to resume execution at the exact original spot that it was interrupted.
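As a rough sketch of that setup, the code below installs a three-argument SIGALRM handler with sigaction and SA_SIGINFO, and arms a repeating timer with setitimer (used here in place of alarm to get sub-second ticks). The function names are hypothetical, and the handler only counts ticks and checks that it was handed a context pointer. It does not inspect or modify the context, since that part is CPU-specific and OS-specific.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <ucontext.h>

static volatile sig_atomic_t ticks = 0;
static volatile sig_atomic_t saw_context = 0;

/* Three-argument handler: the third argument points to the saved
 * execution state of the interrupted code (a ucontext_t on POSIX). */
static void on_tick(int signo, siginfo_t *info, void *raw_ctx) {
    (void)signo;
    (void)info;
    ucontext_t *uc = (ucontext_t *)raw_ctx;
    if (uc != NULL)
        saw_context = 1; /* a real scheduler would save or swap state here */
    ticks++;
}

/* Install the handler via sa_sigaction rather than sa_handler. */
int install_tick_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_tick;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Deliver SIGALRM every interval_usec microseconds of real time. */
int start_ticks(long interval_usec) {
    struct itimerval tv;
    assert(interval_usec > 0);
    tv.it_interval.tv_sec = interval_usec / 1000000;
    tv.it_interval.tv_usec = interval_usec % 1000000;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Disarm the timer. */
int stop_ticks(void) {
    struct itimerval tv;
    memset(&tv, 0, sizeof tv);
    return setitimer(ITIMER_REAL, &tv, NULL);
}

int tick_count(void) { return (int)ticks; }
int context_seen(void) { return (int)saw_context; }
```

A caller would run install_tick_handler() once, then start_ticks(10000) for a tick roughly every 10 ms. The handler only touches sig_atomic_t variables, which keeps it within what a signal handler can safely do.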

If you're lucky, this is a pointer to memory that your thread can write to, and is exactly the information that the kernel actually uses (a reasonable argument could be made for a security-conscious implementation disabling or ignoring overwrites of this thread execution state).

Now, on a per-CPU, per-OS basis, you can save this information (thereby saving and switching out of your "user-space thread"), bearing in mind all the constraints on what signal handlers can safely do. Then you overwrite it with the information for another user-space thread, and return normally from the signal handler.

If you did everything right and the kernel allows it (I think Linux does, others I'm not sure about), your thread will then resume execution from a different location, with different program state.

Now you still have to manage some list (or other data structure) somewhere with the state of all currently running user-space threads which must be accessible both from your signal handler for switching, and from outside the signal handler for safe re-entrant and race-free manipulation. You'll need to manage stack space for each user-space thread, or whatever else your implementation allows or requires you to manage for your C code to run in a well-defined way.
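For the bookkeeping part, one possible shape is a fixed-size table of user-space threads, with SIGALRM blocked while normal code modifies it so the handler never sees a half-updated entry. This is only a sketch under assumptions: the names, the table size and the stack size are all made up, ucontext_t stands in for whatever state your platform needs, and the actual switch inside the signal handler is left out because it is platform-specific.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_UTHREADS 8
#define UTHREAD_STACK (64 * 1024)

struct uthread {
    ucontext_t ctx; /* saved execution state */
    void *stack;    /* stack memory owned by this entry */
    int in_use;
};

static struct uthread table[MAX_UTHREADS];
static volatile sig_atomic_t current_slot = 0;

/* Placeholder entry point for illustration; never run in this sketch. */
void uthread_idle(void) {}

/* Block the preemption signal so the handler cannot run mid-update. */
static void block_preempt(sigset_t *old) {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    sigprocmask(SIG_BLOCK, &set, old);
}

static void restore_preempt(const sigset_t *old) {
    sigprocmask(SIG_SETMASK, old, NULL);
}

/* Register a new user-space thread; returns its slot, or -1 on failure. */
int uthread_add(void (*entry)(void)) {
    sigset_t old;
    int slot = -1;
    assert(entry != NULL);
    block_preempt(&old);
    /* malloc is called with the signal blocked, so a preemption cannot
     * land in the middle of the allocator. */
    void *stack = malloc(UTHREAD_STACK);
    for (int i = 0; stack != NULL && i < MAX_UTHREADS; i++) {
        if (!table[i].in_use) { slot = i; break; }
    }
    if (slot >= 0) {
        getcontext(&table[slot].ctx);
        table[slot].ctx.uc_stack.ss_sp = stack;
        table[slot].ctx.uc_stack.ss_size = UTHREAD_STACK;
        table[slot].ctx.uc_link = NULL;
        makecontext(&table[slot].ctx, entry, 0);
        table[slot].stack = stack;
        table[slot].in_use = 1;
    } else {
        free(stack); /* free(NULL) is a no-op */
    }
    restore_preempt(&old);
    return slot;
}

/* Round-robin choice of the next thread to run; -1 if the table is empty.
 * It uses no locks and no allocation, so a handler could call it. */
int uthread_next(void) {
    for (int step = 1; step <= MAX_UTHREADS; step++) {
        int i = (current_slot + step) % MAX_UTHREADS;
        if (table[i].in_use) { current_slot = i; return i; }
    }
    return -1;
}
```

Blocking the signal around every table update is the simplest way to get race-free manipulation from outside the handler. The cost is that preemption is briefly delayed whenever a thread is added.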

So hopefully that is clarifying and educational. I get the desire to have preemptive scheduling in user-space threads, but C is meant to operate at a level of abstraction too raw and too close to the realities of the system to actually implement it with all three of these properties:

  1. "transparent" or "invisible"
  2. without the overhead of OS threads
  3. still C, as opposed to your own thing that turns C code into something subtly different

One of those must give.

mentalisttraceur avatar Feb 15 '19 23:02 mentalisttraceur