
kafka: protocol: 2phase+trigger - use in read side optimization

Open emaxerrno opened this issue 4 years ago • 9 comments

emaxerrno avatar Jun 18 '21 04:06 emaxerrno

Going over the 2phase+trigger for writes with Noah, and I asked about reads.

emaxerrno avatar Jun 18 '21 04:06 emaxerrno

What's the ask here?

twmb avatar Jun 24 '21 00:06 twmb

Sorry, this was meant for Noah, from a code review.

emaxerrno avatar Jun 24 '21 08:06 emaxerrno

The gist is that we can do a 2phase read like we do for the write.

emaxerrno avatar Jun 24 '21 08:06 emaxerrno

@mmaslankaprv did we find out about client behavior wrt multiple outstanding fetch requests?

@twmb for context: we are processing fetch requests off a connection one-at-a-time. for produce requests we have specific optimizations that take advantage of clients that keep many outstanding requests queued up on the wire. do you know what we should expect from clients regarding fetches?

dotnwat avatar Jul 12 '21 23:07 dotnwat

@dotnwat A client can't issue more than one fetch request at a time, because a client has to process the entire response so as to know what offset to ask for next.

twmb avatar Jul 13 '21 00:07 twmb

> @dotnwat A client can't issue more than one fetch request at a time, because a client has to process the entire response so as to know what offset to ask for next.

Makes total sense. What about two fetch requests for two different partitions? Seems as though these could be dispatched in parallel from the client. Whether this actually happens in practice is another question.

dotnwat avatar Jul 13 '21 00:07 dotnwat

That is theoretically possible but I don't think any client would have a reason to do that. Clients have knobs to control how much is fetched at once from a single request, so there's not much of a reason to issue two simultaneous fetch requests, because that'd basically double those knobs.

It would be possible, but it would need to be on a new, dedicated connection, so that one request does not block the reply for the other (req A hanging for 5s while req B could have returned immediately). The same connection-splitting concept applies to producing, too.

twmb avatar Jul 13 '21 00:07 twmb

Excellent. I think we don't need to worry about this then, but it's something to keep in the back of our minds, either as a possible later optimization or as a source of a performance issue that might, in theory, exist.

dotnwat avatar Jul 13 '21 00:07 dotnwat