Barrier for PSSparseTask poll threads

Open andrewmzhang opened this issue 6 years ago • 1 comments

The 0th poll thread needs to start last. We need a barrier to ensure this; otherwise, with a large number of poll threads, the parameter server is likely to fail.

andrewmzhang avatar Sep 12 '18 04:09 andrewmzhang

Relevant Line

We currently have barriers, but they're not implemented quite correctly. Right now all threads wait just before the loop and then enter the loop together. Instead, we need to wait for all other threads to enter the loop before thread 0 is allowed to enter it.

If a thread enters the loop after the thread with id 0, that thread will be unable to notice socket connections (client code can open a socket and the socket will connect, but the thread will not realize there is a socket with data).
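
For reference, here is a rough sketch of the ordering I have in mind. It is not the actual PSSparseTask code: `LoopEntryGate`, `enter`, and `start_poll_threads` are hypothetical names made up for illustration, and "entering the loop" is approximated by the call to `enter`. Each non-zero thread announces that it has entered, and thread 0 blocks until all of them have.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Cirrus): thread 0 may pass the gate only
// after every other poll thread has passed it.
class LoopEntryGate {
 public:
  explicit LoopEntryGate(std::size_t num_threads)
      : others_(num_threads > 0 ? num_threads - 1 : 0) {}

  // Called by each poll thread right as it enters its poll loop.
  void enter(std::size_t thread_id) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (thread_id == 0) {
      // Thread 0 waits until all other threads have announced entry.
      cv_.wait(lock, [this] { return entered_ == others_; });
    } else {
      ++entered_;
      cv_.notify_all();
    }
    // Recorded under the lock, so the order reflects gate passage.
    order_.push_back(thread_id);
  }

  // Only safe to call after all threads have been joined.
  const std::vector<std::size_t>& order() const { return order_; }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::size_t others_;       // number of threads other than thread 0
  std::size_t entered_ = 0;  // non-zero threads that have entered so far
  std::vector<std::size_t> order_;
};

// Starts num_threads stand-in poll threads and returns the order in which
// they passed the gate.
std::vector<std::size_t> start_poll_threads(std::size_t num_threads) {
  LoopEntryGate gate(num_threads);
  std::vector<std::thread> threads;
  for (std::size_t id = 0; id < num_threads; ++id) {
    threads.emplace_back([&gate, id] {
      gate.enter(id);
      // The real poll loop would run here.
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  return gate.order();
}
```

With this shape, the order returned by `start_poll_threads(n)` always ends with thread 0, no matter how the scheduler interleaves the other threads. A plain "everyone waits, then everyone goes" barrier cannot guarantee that, which is the problem with what we have now.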

I'll try to bundle this into the multiple PS PR.

andrewmzhang avatar Oct 18 '18 13:10 andrewmzhang