crossbeam-utils
Issue #26: Scope stats on running, completed and panicked threads
I implemented the three method signatures suggested in issue #26 using three atomic counters, which are updated each time a thread is spawned or exits via a guard/sentinel struct. I also implemented two very basic tests to give at least some confidence in their functionality.
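Roughly, the sentinel works like this (a simplified sketch with made-up names, not the exact code in the commits): a guard is created right before the user closure runs, and its Drop impl records a completion or a panic depending on whether the thread is unwinding.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counters; the actual scope keeps something similar.
struct ScopeStats {
    running: AtomicUsize,
    completed: AtomicUsize,
    panicked: AtomicUsize,
}

// Guard constructed right before the user closure runs.
struct Sentinel<'a> {
    stats: &'a ScopeStats,
}

impl<'a> Sentinel<'a> {
    fn new(stats: &'a ScopeStats) -> Self {
        stats.running.fetch_add(1, Ordering::SeqCst);
        Sentinel { stats }
    }
}

impl<'a> Drop for Sentinel<'a> {
    // Runs whether the closure returned normally or panicked.
    fn drop(&mut self) {
        self.stats.running.fetch_sub(1, Ordering::SeqCst);
        if std::thread::panicking() {
            self.stats.panicked.fetch_add(1, Ordering::SeqCst);
        } else {
            self.stats.completed.fetch_add(1, Ordering::SeqCst);
        }
    }
}
```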
I just wonder -- do the stats need to be very precise? I think the use case in #26 only requires the stats to be eventually consistent, i.e. they will eventually settle at the correct values.
Maybe not - perhaps I'm just being overly pedantic. :) But the solution proposed above would be fairly simple and consistent.
@oliver-giersch First of all, thanks for the PR! It's very fast :)
I just want to say I hope stats are implemented as an add-on rather than by default. Maintaining stats has a cost, however small, so I'd like users to be able to opt out.
@jeehoonkang I don't think stats have any measurable cost. :)
Spawning a thread is a very costly operation by itself (we also allocate some memory for the destructor chain and so on...) so the cost of three atomic operations per spawned thread is basically zero.
@jeehoonkang coincidentally, I had been doing some reading of the scoped thread code for the last two days to get a better understanding, so when I saw the issue I just went ahead and tried my hand at it ;)
@stjepang Thanks for the reply; I am not too experienced in this subject and the subtleties of parallelism still often elude me.
I could go ahead and drop the sentinel struct in favor of a catch_unwind() approach, not sure what the benefits are.
I also had the following solution for truly consistent atomic stats in mind (not tested yet):
use std::sync::atomic::{AtomicUsize, Ordering};

/// Packs the running, completed and panicked counts into one atomic word.
#[derive(Default)]
struct AtomicScopeStats(AtomicUsize);

#[cfg(target_pointer_width = "64")]
impl AtomicScopeStats {
    // Each counter gets 21 bits: running in bits 0..21, completed in
    // bits 21..42, panicked in bits 42..63.
    const COUNTER_SIZE: usize = 21;
    const RUNNING_MASK: usize = 0x00000000001FFFFF;
    const COMPLETED_MASK: usize = 0x000003FFFFE00000;
    const PANICKED_MASK: usize = 0x7FFFFC0000000000;
    const INC_RUNNING: usize = 0x0000000000000001;
    const INC_COMPLETED: usize = 0x0000000000200000;
    const INC_PANICKED: usize = 0x0000040000000000;
}

impl AtomicScopeStats {
    #[inline]
    pub fn running_count(&self) -> usize {
        self.0.load(Ordering::SeqCst) & Self::RUNNING_MASK
    }

    #[inline]
    pub fn completed_count(&self) -> usize {
        self.0.load(Ordering::SeqCst) & Self::COMPLETED_MASK
    }

    #[inline]
    pub fn panicked_count(&self) -> usize {
        self.0.load(Ordering::SeqCst) & Self::PANICKED_MASK
    }

    // Spawning a thread bumps the running count.
    #[inline]
    pub fn inc_running_count(&self) {
        self.0.fetch_add(Self::INC_RUNNING, Ordering::SeqCst);
    }

    // A normal exit moves one thread from running to completed in a
    // single atomic add.
    #[inline]
    pub fn update_completed_count(&self) {
        self.0.fetch_add(Self::INC_COMPLETED - Self::INC_RUNNING, Ordering::SeqCst);
    }

    // A panicked exit moves one thread from running to panicked.
    #[inline]
    pub fn update_panicked_count(&self) {
        self.0.fetch_add(Self::INC_PANICKED - Self::INC_RUNNING, Ordering::SeqCst);
    }
}
This has the obvious issue of not being portable, and the limit of pointer-width / 3 bits per counter might be too restrictive on non-64-bit architectures.
In that case I would assume a RwLock would be preferable over the solution with the loop, no?
Also, I do have some questions regarding a bunch of details in the code for the scoped thread; would it be OK if I sent you an email or something?
I could go ahead and drop the sentinel struct in favor of a catch_unwind() approach, not sure what the benefits are.
The benefit is somewhat simpler code, but in the end it's the same thing.
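For comparison, a catch_unwind-based version could look roughly like this (just a sketch; the helper name and the counter struct are made up, and the real wrapper would also forward the closure's return value):

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

struct ScopeStats {
    running: AtomicUsize,
    completed: AtomicUsize,
    panicked: AtomicUsize,
}

// Runs the user closure and records its outcome, with no sentinel struct.
fn run_counted<F: FnOnce()>(stats: &ScopeStats, f: F) {
    stats.running.fetch_add(1, Ordering::SeqCst);
    let result = panic::catch_unwind(AssertUnwindSafe(f));
    stats.running.fetch_sub(1, Ordering::SeqCst);
    match result {
        Ok(()) => {
            stats.completed.fetch_add(1, Ordering::SeqCst);
        }
        Err(payload) => {
            stats.panicked.fetch_add(1, Ordering::SeqCst);
            // Propagate the panic so the join handle still observes it.
            panic::resume_unwind(payload);
        }
    }
}
```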
I also had the following solution for truly consistent atomic stats in mind (not tested yet):
This will work, except the result needs to be shifted right in fn completed_count and fn panicked_count. But as you say, this isn't portable and there might not be enough bits to squeeze three numbers into one atomic usize. Also it's kind of messy.
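Concretely, the two accessors from the snippet above would have to mask and then shift back down, roughly like this (sketch only, relying on COUNTER_SIZE = 21):

```rust
// Inside `impl AtomicScopeStats`, replacing the two accessors above.
#[inline]
pub fn completed_count(&self) -> usize {
    (self.0.load(Ordering::SeqCst) & Self::COMPLETED_MASK) >> Self::COUNTER_SIZE
}

#[inline]
pub fn panicked_count(&self) -> usize {
    (self.0.load(Ordering::SeqCst) & Self::PANICKED_MASK) >> (2 * Self::COUNTER_SIZE)
}
```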
Mutexes are fine, but polling counters (using the three public methods) might be a bit slower than necessary due to locking. If we want consistency, I think it's best to just implement fn running_threads like in this comment.
Also, I do have some questions regarding a bunch of details in the code for the scoped thread; would it be OK if I sent you an email or something?
You can open an issue here, hop into #crossbeam at irc.mozilla.org, or send me an email. Whatever works for you. :)
OK, I will change these things accordingly, probably tomorrow (and yeah, I forgot the right shift).
I'll also open a ticket for my questions/suggestions so others may chime in.
Why close this PR? I'm OK with merging the current changes. :)
Huh, I must have closed it by accident
I see. Can you restore the deleted branch and reopen the PR?
I'll see what I can do tomorrow.
I'm concerned about the generality of the stats. Different people have different requirements, and I think it's a little bit hard to justify a particular choice of stats. For example, I think it's enough to provide a querying method that checks whether there are any panicked threads, and based on it, you can implement the feature #26 wants (and a part of what this PR wants, too). In that case, I'd like to keep the scoped thread library kinda minimal and provide other functionality (such as stats) as add-ons.
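For instance, the minimal version could be nothing more than a flag plus a query method, along these lines (a sketch with made-up names, not the actual crossbeam Scope):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Sketch only -- not the real crossbeam Scope type.
struct Scope {
    any_panicked: AtomicBool,
}

impl Scope {
    /// Returns true if at least one scoped thread has panicked so far.
    fn has_panicked(&self) -> bool {
        self.any_panicked.load(Ordering::SeqCst)
    }

    /// Called from the panic path of a finishing scoped thread.
    fn record_panic(&self) {
        self.any_panicked.store(true, Ordering::SeqCst);
    }
}
```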
Also, I'd just like to remind that currently it isn't even possible to access &Scope from inside a scoped thread (the borrow checker rejects such code). Maybe we should solve that problem first. :)
I think I'll try to rewrite the code to allow for opt-in stats keeping. I have something in mind like iterator adapters, so you can turn the regular scope into a stat-keeping scope that exposes the same functionality as a regular scope, but with stats.
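The rough shape I have in mind is something like the following (a self-contained sketch with made-up names; it uses std's scoped threads only so the snippet compiles on its own, while the real adapter would wrap crossbeam's Scope):

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

#[derive(Default)]
struct ScopeStats {
    completed: AtomicUsize,
    panicked: AtomicUsize,
}

// Adapter in the spirit of iterator adapters: wraps a plain scope and
// counts the outcome of every closure spawned through it.
struct StatScope<'scope, 'env> {
    scope: &'scope thread::Scope<'scope, 'env>,
    stats: &'scope ScopeStats,
}

impl<'scope, 'env> StatScope<'scope, 'env> {
    fn spawn<F>(&self, f: F)
    where
        F: FnOnce() + Send + 'scope,
    {
        let stats = self.stats;
        self.scope.spawn(move || {
            match panic::catch_unwind(AssertUnwindSafe(f)) {
                Ok(()) => {
                    stats.completed.fetch_add(1, Ordering::SeqCst);
                }
                Err(payload) => {
                    stats.panicked.fetch_add(1, Ordering::SeqCst);
                    panic::resume_unwind(payload);
                }
            }
        });
    }
}

fn main() {
    let stats = ScopeStats::default();
    thread::scope(|s| {
        let stat_scope = StatScope { scope: s, stats: &stats };
        stat_scope.spawn(|| println!("hello from a counted thread"));
    });
    println!("completed: {}", stats.completed.load(Ordering::SeqCst));
}
```

The plain scope stays untouched; the wrapper just intercepts spawn and forwards to it.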
I also have some concerns about the current running_threads() implementation with the loop, because I am not sure whether there might be any (unlikely/rare) circumstances where the loop doesn't terminate. It also seems a little icky. Maybe this stat could be discarded altogether, as I have no real idea what a use case for it would be.
@stjepang Also, I'd just like to remind that currently it isn't even possible to access &Scope from inside a scoped thread (the borrow checker rejects such code). Maybe we should solve that problem first. :)
This is probably another issue altogether and would require Scope to implement Send + Sync, but it should be worth considering, though I am not entirely sure why you would want to use a scope in a spawned scoped thread instead of creating a new (inner) scope.
Just for comparison, the threadpool crate has similar methods for statistics: queued_count, active_count, panic_count.
The only two uses of panic_count I could find are:
- https://github.com/GaiaWorld/pi_base/blob/60583c268d18b63a53f913fdc35456758c12296e/src/worker_pool.rs#L23
- https://github.com/LianYun/tendor/blob/ad554d37709643af562fca81ec2ef67c3f6c1cd7/spider/src/f6ot/spider.rs#L209
though I am not entirely sure why you would want to use a scope in a spawned scoped thread instead of creating a new (inner) scope.
If you create a new inner scope, then all threads spawned inside it will be joined before the inner scope ends. Some users would like to spawn such inner threads without this restriction.
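To illustrate (using std's scoped threads just to keep the example self-contained; the same join semantics apply to crossbeam's scope):

```rust
use std::thread;

fn main() {
    thread::scope(|outer| {
        outer.spawn(|| {
            // Every thread spawned in this inner scope is joined before the
            // inner `scope` call returns, even though the outer scope is
            // still running.
            thread::scope(|inner| {
                inner.spawn(|| println!("inner thread"));
            });
            println!("the inner scope has fully joined");
        });
    });
}
```

A thread spawned directly on the outer scope, in contrast, keeps running until the outer scope itself ends, which is what those users are after.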
Summary of the latest commits:
- stat counting is now entirely opt-in and separate in its own submodule
- the code is entirely separate from the rest of the scoped thread module as well
- the scope can keep track of spawned, running, completed and panicked threads
I have decided against the loop-based solution for the running count for now, because I didn't see what purpose it is meant to achieve. If the goal is to return the running count at the exact moment the function is called, a mutex around the entire ScopeStats struct seems to be the only really valid solution; the loop does not achieve a more accurate count than a simple subtraction, IMHO.
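For reference, the mutex variant I mean is simply something like this (a sketch with made-up names, not what the commits actually contain):

```rust
use std::sync::Mutex;

#[derive(Default, Clone, Copy)]
struct ScopeStats {
    spawned: usize,
    running: usize,
    completed: usize,
    panicked: usize,
}

// A mutex yields a consistent snapshot of all counters at once, at the
// cost of locking on every update and every query.
struct LockedScopeStats(Mutex<ScopeStats>);

impl LockedScopeStats {
    fn snapshot(&self) -> ScopeStats {
        *self.0.lock().unwrap()
    }

    fn thread_spawned(&self) {
        let mut stats = self.0.lock().unwrap();
        stats.spawned += 1;
        stats.running += 1;
    }

    fn thread_finished(&self, panicked: bool) {
        let mut stats = self.0.lock().unwrap();
        stats.running -= 1;
        if panicked {
            stats.panicked += 1;
        } else {
            stats.completed += 1;
        }
    }
}
```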
I'm willing to discuss this more, however.