Tracking: PubSub performance and UX improvements

Open creachadair opened this issue 2 years ago • 1 comments

☂️ This issue tracks a handful of improvements to the Tendermint pubsub library targeting the v0.36 release.

See also RFC 006 Event Subscription and ADR 075 RPC Event Subscription.

In Scope

The overall goal of this work is to address the following performance, API, and usability concerns:

  • The API supports "unbuffered" (blocking) subscriptions, which stall the entire publisher with no timeout until serviced. This is a special case to support event indexing, but it means that indexing can stall subscriber service, and vice versa, and that feedback can stall or slow consensus.

  • Ordinary ("buffered") subscriptions use a fixed-length Go channel as a queue, and if a client does not service its subscriptions fast enough (i.e., the buffer fills), the publisher will terminate the subscription. However, events do not arrive at an even pace, and a large bolus of events may overwhelm the channel in a very short period of time, even if a client is servicing its events optimally (see for example #6729). Enqueues take nanoseconds or microseconds, whereas network delivery takes milliseconds, even for fast local connections.

  • The publish/subscribe plumbing is very complicated, and tightly coupled with indexing. This is mainly a maintenance issue, but also adds overhead that interacts negatively with the stall-pushback on the rest of consensus.
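The buffered-subscription concern above can be illustrated with a minimal sketch. This is not the Tendermint pubsub code: `publishBurst` and its parameters are hypothetical names, and the sketch assumes the consumer drains nothing during the burst, which approximates a subscriber that delivers over the network in milliseconds while the publisher enqueues in nanoseconds.

```go
package main

import "fmt"

// publishBurst is a hypothetical helper (not part of Tendermint). It tries to
// enqueue n events into a fixed-length channel of the given capacity without
// blocking, and stops at the first event that does not fit. A publisher that
// terminates slow subscriptions would drop the subscriber at that point.
func publishBurst(capacity, n int) (accepted int, overflowed bool) {
	buf := make(chan int, capacity)
	for i := 0; i < n; i++ {
		select {
		case buf <- i:
			accepted++
		default:
			// Buffer is full: the burst outran the fixed-length queue.
			return accepted, true
		}
	}
	return accepted, false
}

func main() {
	accepted, overflowed := publishBurst(4, 10)
	fmt.Println(accepted, overflowed) // prints "4 true"
}
```

With a capacity of 4 and a burst of 10, the fifth event overflows even though the subscriber has done nothing wrong, which is the failure mode described in the second bullet.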

Out of Scope

  • This issue does not address broader design questions. For example, as part of the pluggable indexing work (see #7135), it could make sense to offload indexing and event subscription from the node process entirely. Such questions should be addressed via the ADR process.

  • This issue also does not address changes to the RPC subscription interface. That topic is covered by #7157.

Related Changes

  • Simplify and improve server concurrency handling.
  • Use a dynamic queue for subscribers to allow graceful recovery from bursts.
  • Move indexing out of the publish-subscribe pathway, and disallow unbuffered subscriptions.
  • Clean up the API to encapsulate channels and consistently plumb contexts for cancellation.
  • Performance improvements for the event query API.

creachadair, Oct 26 '21 05:10

Another issue to follow up on here:

  • https://github.com/tendermint/tendermint/issues/3380

creachadair, Jan 04 '22 17:01