
Implement priority queue on inbound messages

Open autholykos opened this issue 3 years ago • 4 comments

Describe what you want implemented
A queue that sidesteps the non-determinism of the Go scheduling system: https://github.com/dusk-network/dusk-blockchain/blob/f44e1055887cf2a71fd05ec346fc3461cf223d44/pkg/p2p/peer/peer.go#L340

Describe "Why" this is needed There is a strong suspicion that some consensus messages might be unlawfully delayed by the golang scheduler, potentially creating problems.

Describe alternatives you've considered
Splitting the inbound messages between a priority queue and the existing scheduling path as it is now.

Additional context
This might be related to the root cause of #1156

autholykos commented on Nov 04 '21

After considering the issue ad nauseam, I came up with this final, simple-as-possible way to implement it:

P = priority, N = non-priority

For this formula, each count is offset by 1, because a zero would destroy the proportion (division by zero); so the counts behave somewhat like ordinals.

(P + 1) / (N + 1) >= 2

P = 3 and N = 1 gives a result that satisfies this. So do these (P, N) pairs:

(1, 0), (2, 0), (3, 0), (4, 1), (5, 2), (6, 2), (6, 3)

When this condition is satisfied, the queue consumes a priority item. Otherwise, it consumes a non-priority item.
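A minimal sketch of that selection rule, written with integer arithmetic so the division never rounds away the threshold (the function name and signature are illustrative, not the actual dusk-blockchain code):

```go
// pickPriority reports whether the next item should come from the priority
// lane. p and n are the current lengths of the priority and non-priority
// queues; both are offset by 1 so an empty queue does not zero out the ratio.
func pickPriority(p, n int) bool {
	// Equivalent to (P+1)/(N+1) >= 2, without integer-division rounding.
	return p+1 >= 2*(n+1)
}
```

With this form, pickPriority(3, 1), pickPriority(5, 2) and pickPriority(6, 2) all return true, matching the pairs above.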

l0k18 commented on Nov 29 '21

A further refinement adds a modulus-based forced selection, to ensure the less-filled queue still gets serviced (a sketch follows below).

closes #1206
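A hedged sketch of what such a forced selection could look like, building on the pickPriority helper sketched above (the interval and names here are illustrative, not taken from the PR):

```go
// forceEvery is a hypothetical interval: every forceEvery-th selection the
// non-priority lane is serviced regardless of the ratio, so it cannot be
// starved while priority messages flood in.
const forceEvery = 4

// pick decides which lane to consume from, given the lane lengths p and n
// and a running selection counter.
func pick(p, n, counter int) bool {
	if counter%forceEvery == 0 && n > 0 {
		return false // force a non-priority item through
	}
	return pickPriority(p, n)
}
```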

l0k18 commented on Dec 01 '21

Current status is a simple test that demonstrates and validates that the queue does not neglect non-priority items when they are overwhelmed by priority messages.

The previous loop, which spawned an unbounded number of goroutines to process messages, has been changed into two goroutines: one receives messages, and the other picks messages off the queue and processes them in a single thread.

To keep everything asynchronous while avoiding spinlocks and polling, a multi-lane semaphore was added that makes the consumer block until a producer delivers a message. The semaphore channel is loaded at the same time as the message channel, and the reader waits on the semaphore before reading from the queue. Until there is a token on the semaphore channel there cannot be a message, so any call to Pop waits until Push has provided a new token.
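A minimal sketch of that producer/consumer split, assuming two buffered lanes and a shared semaphore channel (Message, Queue, Push and Pop are illustrative stand-ins, not the exact types in the PR):

```go
// Message stands in for the node's wire message type.
type Message struct {
	Priority bool
	Payload  []byte
}

// Queue holds one buffered channel per lane plus a semaphore channel that
// carries one token per queued message.
type Queue struct {
	priority    chan Message
	nonPriority chan Message
	sem         chan struct{}
}

func NewQueue(capacity int) *Queue {
	return &Queue{
		priority:    make(chan Message, capacity),
		nonPriority: make(chan Message, capacity),
		sem:         make(chan struct{}, 2*capacity),
	}
}

// Push loads the message channel and the semaphore channel together, so a
// token on sem always corresponds to a message already sitting in a lane.
func (q *Queue) Push(m Message) {
	if m.Priority {
		q.priority <- m
	} else {
		q.nonPriority <- m
	}
	q.sem <- struct{}{}
}

// Pop blocks on the semaphore until Push has delivered something, so no
// spinning or polling is needed. The ratio-based lane selection from the
// earlier comment would slot in where the select statement is.
func (q *Queue) Pop() Message {
	<-q.sem
	select {
	case m := <-q.priority:
		return m
	default:
		return <-q.nonPriority
	}
}
```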

The only open question is whether the work queue should be parallelised. Currently it is designed as a single thread; making it multi-threaded is just a matter of adding a loop that spawns the queue consumer goroutine multiple times, as sketched below.
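For illustration, the parallel variant would look roughly like this, where workers and process are hypothetical names for the worker count and the message handler:

```go
// Spawn several identical consumers over the same queue; Pop blocks on the
// shared semaphore, so idle workers simply wait for the next token.
for i := 0; i < workers; i++ {
	go func() {
		for {
			process(q.Pop())
		}
	}()
}
```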

l0k18 commented on Dec 10 '21

The dangling element blocking closure of this issue is that, in the test, the callback fails to execute; for this reason a timeout failure mode was added.

l0k18 commented on Dec 20 '21