truffleruby icon indicating copy to clipboard operation
truffleruby copied to clipboard

A way to mark C extensions as thread-safe, so they can be executed in parallel

Open eregon opened this issue 5 years ago • 10 comments

TruffleRuby supports C extensions, but for scalability it is important to run at least some of them in parallel (e.g., HTTP parsing in Puma). This was notably mentioned in my RubyKaigi talk. So we need a way to mark a given C extension as thread-safe, so we know we do not need to acquire a global lock when executing its methods. More details on the MRI tracker: https://bugs.ruby-lang.org/issues/17307

Currently there is the experimental option --cexts-lock=false but that's global so safe only if we know all the extensions used are thread-safe.

eregon avatar Nov 05 '20 11:11 eregon

There are two main places where we would need to fix things, when creating the stubs that wrap C methods, and when making calls into the Ruby runtime which would release and reacquire the lock. I think we can do this through a macro (call it __TRUFFLERUBY_NO_MUTEX__) which can be set to 1 before including our standard headers to indicate that this extension should not use locks, but we would then have to rename a large number of standard functions based on this as they would have to behave slightly differently. The other option would be to use some thread local to indicate whether a mutex was required, but this is unlikely to be as performant as we would really like.

aardvark179 avatar Nov 10 '20 14:11 aardvark179

I was thinking the main place where we take the lock is rb_define_method and friends, and those we can simply read from a thread-local when defining functions to know whether we define a variant acquiring the lock or not. But it seems there are quite a few other usages of Primitive.call_with_c_mutex*, we should review them.

eregon avatar Nov 10 '20 16:11 eregon

Given https://bugs.ruby-lang.org/issues/17307#note-21 The way to mark C extensions as thread-safe should be:

rb_define_method(mod, "need C ext lock", func1, 0);

#ifdef HAVE_RB_EXT_THREAD_SAFE
  rb_ext_thread_safe(true);
#endif

rb_define_method(mod, "does not need C ext lock and can run in parallel", func2, 0);

#ifdef HAVE_RB_EXT_THREAD_SAFE
  rb_ext_thread_safe(false);
#endif

rb_define_method(mod, "need C ext lock", func3, 0);

eregon avatar Jul 19 '21 12:07 eregon

@ioquatix says a way to define it explicitly per method would be nice, like:

rb_define_method(mod, "does not need C ext lock and can run in parallel", func2, RB_NO_EXT_LOCK(arity));

It's more verbose if many methods, but explicit vs implicit thread-local state. Maybe we can have both.

eregon avatar Jul 19 '21 13:07 eregon

Here is a way to efficiently pack flags into args which gives us plenty of head room:

#include <limits.h>

const int MASK = 15;
const int NOGVL = 16; // space for ~26 more bit flags.

int combine(int count, int flags) {
    if (count >= 0) return count | flags;
    else return count & ~flags;
}

int count(int arity) {
    if (arity & INT_MIN) {
        return (arity | ~MASK); 
    } else {
        return (arity & MASK);
    }
}

int flags(int arity) {
    if (arity & INT_MIN) {
        return (~arity & ~MASK); 
    } else {
        return (arity & ~MASK);
    }
}

ioquatix avatar Jul 19 '21 13:07 ioquatix

The plan here is to not use the C extensions lock if RB_EXT_RACTOR_SAFE is used for an extension (already used in some extensions), or if RB_EXT_THREAD_SAFE is used (would be new and less restrictive than RB_EXT_RACTOR_SAFE)

eregon avatar Jun 16 '23 09:06 eregon

Is there any progress on this issue?

loid345 avatar Mar 19 '24 09:03 loid345