
Support "competing" readers

Open kjnilsson opened this issue 4 years ago • 10 comments

Currently all readers read the entire log; there is no mechanism for "competing" reads, where multiple readers consume entries/chunks in a round-robin fashion to increase the speed at which entries for a given stream are processed.

A possible design for doing competing reads:

  • A competing read coordinator (CRC) process is co-hosted with the osiris writer (leader)
  • The CRC acts like a reader although it only ever scans the chunk headers.
  • Readers that want to "compete" (collaborate may be a better term) over a log register with the CRC and wait for the CRC to inform them of the next chunk id (offset) to read from.
  • The CRC will thus allocate chunk ids to the competing readers and maintain state of the current readers and which chunk ids each has been allocated.
  • Readers will ack back when they have finished processing a chunk id so that they can be allocated another chunk id (some degree of pipelining should be allowed). A minimal sketch of this allocation loop follows this list.
  • The CRC persists the current read state in the log as a special entry type so that it can be replicated and recovered from anywhere.
  • Thus chunk ids are allocated to available readers in a round-robin-ish manner.
  • Although allocation of chunk ids happens on the writer node, readers can do the actual reads on replica nodes, which will further scale out reads across the cluster.
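
A minimal sketch of the allocation loop, assuming hypothetical module, function and message names (none of this is the osiris API):

```erlang
-module(osiris_crc_sketch).
-behaviour(gen_server).

-export([start_link/0, register_reader/2, ack/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {next_chunk = 0 :: non_neg_integer(),
                %% readers waiting for an allocation, in round-robin order
                waiting = queue:new() :: queue:queue(pid()),
                %% chunk ids currently allocated, per reader
                in_flight = #{} :: #{pid() => [non_neg_integer()]}}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% a reader joins the competing (collaborating) group
register_reader(Crc, ReaderPid) ->
    gen_server:cast(Crc, {register, ReaderPid}).

%% a reader acks a finished chunk and becomes eligible for another
ack(Crc, ReaderPid, ChunkId) ->
    gen_server:cast(Crc, {ack, ReaderPid, ChunkId}).

init([]) ->
    {ok, #state{}}.

handle_call(_Msg, _From, State) ->
    {reply, ok, State}.

handle_cast({register, Pid}, State) ->
    {noreply, allocate(enqueue(Pid, State))};
handle_cast({ack, Pid, ChunkId}, #state{in_flight = InFlight} = State) ->
    Remaining = lists:delete(ChunkId, maps:get(Pid, InFlight, [])),
    State1 = State#state{in_flight = InFlight#{Pid => Remaining}},
    %% after an ack the reader rejoins the back of the round-robin queue
    {noreply, allocate(enqueue(Pid, State1))}.

handle_info(_Msg, State) ->
    {noreply, State}.

enqueue(Pid, #state{waiting = W} = State) ->
    State#state{waiting = queue:in(Pid, W)}.

%% hand the next chunk id to the reader at the head of the queue; the
%% real CRC would only allocate ids it has already seen while scanning
%% chunk headers, and would allow a few un-acked ids per reader
%% (pipelining) rather than exactly one as here
allocate(#state{waiting = W, next_chunk = Next, in_flight = InFlight} = State) ->
    case queue:out(W) of
        {empty, _} ->
            State;
        {{value, Pid}, W1} ->
            Pid ! {chunk_allocated, self(), Next},
            Allocated = [Next | maps:get(Pid, InFlight, [])],
            State#state{waiting = W1,
                        next_chunk = Next + 1,
                        in_flight = InFlight#{Pid => Allocated}}
    end.
```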

Downsides:

  • Ordering is gone: if a reader fails whilst allocated a chunk id, that chunk id needs to be given to an existing reader, which may already have processed a higher chunk id, resulting in that reader reading chunks out of order. That said, for competing consumers this is always the case.

kjnilsson avatar Apr 23 '20 12:04 kjnilsson

The above looks like an elegant design.

This leaves open the question of consumer offset tracking.

Without competing consumers, manually managed consumer offsets are easy: just periodically write the last offset to some kind of persistent store. With competing consumers this becomes a trickier problem to get right, as there is no single offset that represents a high watermark of where consumption has reached.
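
To make that concrete: each reader acks chunk ids independently, so the set of acked ids can have gaps, and the only safe value to persist is the highest contiguously acked id. A toy illustration (not osiris code):

```erlang
-module(watermark_sketch).
-export([watermark/1]).

%% Highest chunk id N such that every id =< N has been acked.
%% With acks {0,1,3,4} (chunk 2 still in flight) this is 1, not 4:
%% persisting 4 would skip chunk 2 on recovery.
watermark(AckedSet) ->
    watermark(0, AckedSet).

watermark(N, AckedSet) ->
    case sets:is_element(N, AckedSet) of
        true  -> watermark(N + 1, AckedSet);
        false -> N - 1   %% -1 when nothing has been contiguously acked
    end.
```

E.g. `watermark_sketch:watermark(sets:from_list([0,1,3,4]))` returns `1`, so recovery would resume at chunk 2.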

When we start offering built-in offset tracking, again it becomes more complex.

Vanlightly avatar Apr 23 '20 13:04 Vanlightly

I don't think a competing consumer can do offset tracking; the offset tracking is done by the read coordinator, so new consumers just join the round-robin queue and are advised of the next chunk id to read.
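
The reader side under that model could be as simple as the following (a sketch with hypothetical names, reusing the coordinator module sketched above): the reader never chooses an offset itself, it just processes whatever it is handed and acks.

```erlang
-module(competing_reader_sketch).
-export([compete/2]).

%% Join the group, then process whatever chunk ids the coordinator
%% hands out; the reader never tracks offsets itself.
compete(Crc, ProcessChunk) ->
    osiris_crc_sketch:register_reader(Crc, self()),
    loop(Crc, ProcessChunk).

loop(Crc, ProcessChunk) ->
    receive
        {chunk_allocated, Crc, ChunkId} ->
            ok = ProcessChunk(ChunkId),
            osiris_crc_sketch:ack(Crc, self(), ChunkId),
            loop(Crc, ProcessChunk)
    end.
```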

kjnilsson avatar Apr 23 '20 14:04 kjnilsson

It all sounds reasonable 👍 from me.

gerhard avatar Apr 23 '20 16:04 gerhard

👍 to "collaborate"

lukebakken avatar Apr 23 '20 20:04 lukebakken

  • Readers will ack back when they have finished processing a chunk id so that they can be allocated another chunk id (some degree of pipelining should be allowed).

How will this translate in terms of API for the reading clients (e.g. the stream plugin)? Right now they send_file or register_offset_listener to get notified when there's something new. I would expect this to be transparent for them, as the reader/CRC would control which chunk they are supposed to send.

acogoluegnes avatar Apr 30 '20 13:04 acogoluegnes

  • The CRC persists the current read state in the log as a special entry type so that it can be replicated and recovered from anywhere.

OK, so competing readers get offset tracking for free? This is not supported yet for traditional readers, but when it is, they will have to issue a command (commit?) and will expect the broker to keep the offset where they left off. Are we on the same page about this?

acogoluegnes avatar Apr 30 '20 13:04 acogoluegnes

What about replay semantics? Can a group of competing consumers start over? This implies some kind of deletion concept for the group, or at least an offset reset for the group.

acogoluegnes avatar Apr 30 '20 13:04 acogoluegnes

  • Readers will ack back when they have finished processing a chunk id so that they can be allocated another chunk id (some degree of pipelining should be allowed).

How will this translate in terms of API for the reading clients (e.g. the stream plugin)? Right now they send_file or register_offset_listener to get notified when there's something new. I would expect this to be transparent for them, as the reader/CRC would control which chunk they are supposed to send.

Yes, we should hide this complexity (choosing which process to send to) behind a common API.
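
Something like this facade could keep the choice out of client code (a sketch only; the function names and shapes are hypothetical, not the osiris API):

```erlang
-module(reader_api_sketch).
-export([init/2, next_chunk/1]).

%% Callers ask for the next chunk id without caring whether it comes
%% from their own position or from a CRC allocation.
init(Stream, normal) ->
    {normal, Stream, 0};
init(Stream, {competing, Crc}) ->
    ok = osiris_crc_sketch:register_reader(Crc, self()),
    {competing, Stream, Crc}.

%% a normal reader just advances its own position
next_chunk({normal, Stream, ChunkId}) ->
    {ChunkId, {normal, Stream, ChunkId + 1}};
%% a competing reader waits to be told what to read next (acking elided)
next_chunk({competing, _Stream, Crc} = State) ->
    receive
        {chunk_allocated, Crc, ChunkId} ->
            {ChunkId, State}
    end.
```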

kjnilsson avatar Apr 30 '20 13:04 kjnilsson

OK, so competing readers get offset tracking for free? This is not supported yet for traditional readers, but when it is, they will have to issue a command (commit?) and will expect the broker to keep the offset where they left off. Are we on the same page about this?

Yes, but I could imagine using the same process for keeping consumer offsets for this stream. Might need to ponder that a bit. :)
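
For instance (shape purely illustrative), the special entry the CRC writes could carry the allocator state and, if offsets live in the same process, the tracked offsets too. Pids are not stable across restarts, so a persisted entry would key on durable reader names:

```erlang
%% Illustrative shape of the persisted CRC state entry: enough to
%% rebuild the round-robin allocator on any node that replays the log.
#{type       => crc_state,
  next_chunk => 17,
  in_flight  => #{<<"reader-a">> => [15], <<"reader-b">> => [16]}}
```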

kjnilsson avatar Apr 30 '20 13:04 kjnilsson

What about replay semantics? Can a group of competing consumers start over? This implies some kind of deletion concept for the group, or at least an offset reset for the group.

No, I don't think replay would be supported for a competing reader group. You can of course replay if you like as a "normal" reader, i.e. not competing.

kjnilsson avatar Apr 30 '20 14:04 kjnilsson