optimize how we submit to witnesses
Feature request description/rationale
As I understand it, when we submit events to witnesses, we use a round-robin approach that, for N=5 witnesses, works like this:
```
# first loop - get initial receipts
receipts = []
for each witness:
    call witness with existing receipts, and ask for the new receipt
    add new receipt to receipts array

# second loop - share other receipts
for each witness except last:
    submit receipts from all other witnesses
```
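For concreteness, here is roughly that flow as plain Python. The `query_witness` helper is hypothetical (a stand-in for keripy's actual witness client, which I'm not reproducing here); the point is just the serial structure:

```python
# Minimal sketch of the current sequential flow. query_witness is a
# hypothetical helper that delivers the event plus the given receipts
# to one witness and returns that witness's own receipt.
def receipt_sequentially(event, witnesses, query_witness):
    receipts = []
    # first loop: each witness sees the receipts gathered so far
    for wit in witnesses:
        receipts.append(query_witness(wit, event, receipts))
    # second loop: share the full receipt set with every witness except
    # the last one, which already saw the others during the first loop
    for wit in witnesses[:-1]:
        query_witness(wit, event, receipts)
    return receipts
```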
This results in the following:
metric | count | explanation |
---|---|---|
requests to witnesses | 9 | every witness is called twice except the last witness in the first loop, which sees everybody else's receipts on its first (and only) call |
bandwidth, requests to witnesses | 9 * request_overhead + 20 * receipt_size | each witness ends up seeing 4 other receipts |
bandwidth, responses from witnesses | 9 * response_overhead + 5 * receipt_size | each witness returns one receipt; every witness except the last responds twice |
latency | 9 * request_response_roundtrip | 9 calls made, one after the other |
timeout | 9 * reasonable_timeout_per_call | cumulative timeout |
I suggest we change to the following algorithm:
```
call all witnesses in parallel and ask for a receipt
wait until toad witnesses have responded, or until timeout
call all witnesses with all other receipts
```
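Here is a minimal asyncio sketch of what I have in mind, again assuming a hypothetical async `query_witness(wit, event, receipts)` coroutine rather than keripy's actual API (keripy's real concurrency machinery is hio, so asyncio here is purely illustrative):

```python
import asyncio

# Sketch only: query_witness is a hypothetical coroutine that delivers
# the event plus the given receipts to one witness and returns that
# witness's own receipt; it is not keripy's actual API.
async def receipt_in_parallel(event, witnesses, query_witness,
                              toad, timeout=5.0):
    # phase 1: ask every witness for a receipt at the same time
    tasks = [asyncio.create_task(query_witness(wit, event, []))
             for wit in witnesses]
    receipts = []
    for fut in asyncio.as_completed(tasks, timeout=timeout):
        try:
            receipts.append(await fut)
        except asyncio.TimeoutError:
            break                   # overall timeout hit; go with what we have
        except Exception:
            continue                # a failed witness just yields no receipt
        if len(receipts) >= toad:
            break                   # threshold met; no need to keep waiting
    for task in tasks:
        task.cancel()               # abandon any stragglers

    # phase 2: share the gathered receipts with every witness at once
    await asyncio.gather(
        *(query_witness(wit, event, receipts) for wit in witnesses),
        return_exceptions=True)     # tolerate individual failures
    return receipts
```

(Waiting for all witnesses in phase 1 instead of stopping at `toad` would avoid some of the escrow concern discussed below, at the cost of worst-case latency.)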
If I am analyzing correctly, this would give us the following metrics:
metric | count | explanation |
---|---|---|
requests to witnesses | 10 | every witness is called twice |
bandwidth, requests to witnesses | 10 * request_overhead + 20 * receipt_size | each witness ends up seeing 4 other receipts |
bandwidth, responses from witnesses | 10 * response_overhead + 5 * receipt_size | each witness returns one receipt but responds twice |
latency | 2 * request_response_roundtrip | 2 phases, each fully parallel |
timeout | 2 * reasonable_timeout_per_call | cumulative timeout |
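As a sanity check on both tables, the counts generalize to N witnesses as plain arithmetic (nothing keripy-specific here):

```python
# Totals for N witnesses under each scheme, matching the tables above.
def sequential_metrics(n):
    return {"requests": 2 * n - 1,           # last witness skips the second loop
            "receipts_sent": n * (n - 1),    # each witness sees the n - 1 others
            "serial_roundtrips": 2 * n - 1}  # every call waits for the previous one

def parallel_metrics(n):
    return {"requests": 2 * n,               # every witness called in both phases
            "receipts_sent": n * (n - 1),
            "serial_roundtrips": 2}          # two phases, each fully parallel

print(sequential_metrics(5))  # {'requests': 9, 'receipts_sent': 20, 'serial_roundtrips': 9}
print(parallel_metrics(5))    # {'requests': 10, 'receipts_sent': 20, 'serial_roundtrips': 2}
```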
Is my analysis accurate? If so, we consume almost identical bandwidth (10 calls vs 9, with the same amount of data transferred), but latency drops by almost a factor of 5 (9 sequential round trips down to 2). This would speed up many KERI operations as users perceive them.
A possible downside to this approach is that error handling gets more complicated, and the chances of events sitting in escrow go up (unless we don't begin phase 2 until all witnesses respond or time out). And of course, running things in parallel is a bit more complicated. But making it noticeably faster to talk to witnesses still seems to me like it might be worth the tradeoff.
What do you think?