Modifies converse scheduler to prioritize NodeGroup messages
Modifies the converse scheduler's getNextMessage so nodeGroup messages can run with higher priority than local messages. Closes #3674. As it is, nodeGroup messages are not checked until all local and regular Charm queue (prio Q) messages have been checked, which causes issues when the application uses nodeGroup messages in the hope that some PE will attend to them quickly. This change makes getNextMessage check the nodeGroup queue every 2^nodeGrpFreq iterations with high priority, in addition to its usual check after exhausting the local queues (except the task Q). This commit has not been tested at all, but I am pushing it so others can help me test/fix it.
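For illustration, a minimal sketch of the intended logic, not the actual Converse code: `dequeue_local()`, `dequeue_sched()`, `dequeue_nodegroup()`, and `NODE_GRP_FREQ` are hypothetical stand-ins for the real queue accessors and the nodeGrpFreq knob.

```cpp
/* Hedged sketch of the idea behind the getNextMessage change (not the
 * actual Converse implementation). The queue accessors below are
 * placeholders for the real per-PE and node-level queues. */
#include <cstddef>

#define NODE_GRP_FREQ 4   /* assumed value: check the nodegroup Q every 2^4 = 16 calls */

/* Placeholder accessors: each would pop from the corresponding Converse
 * queue and return NULL when that queue is empty. */
static void *dequeue_local(void)     { return NULL; }  /* PE-local FIFO        */
static void *dequeue_sched(void)     { return NULL; }  /* prioritized (prio) Q */
static void *dequeue_nodegroup(void) { return NULL; }  /* node-level queue     */

void *getNextMessage(void)
{
    static unsigned long iteration = 0;   /* per-scheduler-loop counter */
    void *msg;

    /* New behavior: every 2^nodeGrpFreq iterations, look at the nodegroup
     * queue first, so some PE on the node picks those messages up quickly. */
    if ((++iteration & ((1UL << NODE_GRP_FREQ) - 1)) == 0) {
        if ((msg = dequeue_nodegroup()) != NULL)
            return msg;
    }

    /* Usual order: local FIFO, then the prioritized scheduler queue. */
    if ((msg = dequeue_local()) != NULL)
        return msg;
    if ((msg = dequeue_sched()) != NULL)
        return msg;

    /* Existing behavior is kept: the nodegroup queue is still checked once
     * the local queues (task Q excluded) are exhausted. */
    return dequeue_nodegroup();
}
```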
One thing to worry about is whether this change causes performance degradation by making the scheduler check the node queue too often (depending on whether the check is expensive, even for an empty queue, because of locking). It'd be nice if someone were to run a performance regression test.
@ericjbohm @ericmikida @ZwFink review please.
I thought this was already merged. @ZwFink @ericjbohm .. will you please take a look?
@lvkale this is the code we discussed during the meeting today
Sorry for the spam, but where is the meeting usually announced?
Should I consider all the older unresolved comments here as acceptably resolved?
In order for this to be mergeable it should be modified to no longer be a draft and the reviewer comments should be addressed and resolved. @lvkale
> One thing to worry about is whether this change causes performance degradation by making the scheduler check the node queue too often (depending on whether the check is expensive, even for an empty queue, because of locking). It'd be nice if someone were to run a performance regression test.
Was that performance analysis done? If so, what were the results?
Results of a benchmark that has one chare sending 10 messages to itself 500,000 times (SMP mode, higher is better). The entry method just increments a counter to track whether it should send another 10 messages to itself, so it should stress the queueing system. This result shows an overhead of 0.44% averaged over 50 runs, which translates to an overhead of a few nanoseconds per message. Any realistic small task is on the order of a few microseconds per message. I think we should pay more attention to the fact that the scheduler can only churn through ~1.2 million local entry methods per second than to the addition of a few nanoseconds of overhead.
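For reference, a hedged sketch of what such a self-send micro-benchmark might look like; the module, class names, and constants below are made up and are not the actual test code used for these numbers.

```cpp
/*
 * selfsend.ci (hypothetical interface file):
 *
 *   mainmodule selfsend {
 *     mainchare Main {
 *       entry Main(CkArgMsg *m);
 *     };
 *     chare Pinger {
 *       entry Pinger();
 *       entry void ping();
 *     };
 *   };
 */
#include "selfsend.decl.h"

static const int BATCH  = 10;      /* messages sent per round               */
static const int ROUNDS = 500000;  /* rounds, i.e. 5M messages in total     */

class Main : public CBase_Main {
 public:
  Main(CkArgMsg *m) {
    delete m;
    CProxy_Pinger::ckNew();          /* create the single benchmark chare   */
  }
};

class Pinger : public CBase_Pinger {
  int received, round;
  double start;
 public:
  Pinger() : received(0), round(0), start(CkWallTimer()) {
    sendBatch();
  }
  void sendBatch() {
    for (int i = 0; i < BATCH; ++i) thisProxy.ping();
  }
  void ping() {
    /* The entry method only bumps a counter; once a full batch has been
     * delivered it either starts the next round or reports and exits. */
    if (++received == BATCH) {
      received = 0;
      if (++round < ROUNDS) {
        sendBatch();
      } else {
        double rate = (double)BATCH * ROUNDS / (CkWallTimer() - start);
        CkPrintf("delivered %.0f msgs/s\n", rate);
        CkExit();
      }
    }
  }
};

#include "selfsend.def.h"
```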
Another possibly interesting point is that we run with 6 virtual nodes per physical node on our 192-cores-per-node machine because it gives better communication performance. This seems at odds with the model I think Sanjay described, in which intranode communication shouldn't be affected by the comm thread. Maybe there's a lot of locking going on?