contour icon indicating copy to clipboard operation
contour copied to clipboard

Rethink the way notifications are sent to gRPC stream watchers

Open davecheney opened this issue 7 years ago • 3 comments

Currently notifying cache watchers is a manual process, we instrument the code path to defer a notification unconditionally. This is conservative and makes reasoning about notifications straight forward (we don't have e2e test coverage about version counters on caches, which reflects how many times notify has been called).

However there are downsides

  • Each call to update will trigger a stream to envoy, parse and reload on its end, this is not cheap for large cache sizes
  • This scales poorly in load testing and high update rate scenarios where lots of data is being changed in bulk.

This is the sequences of changes I propose

  1. We instrument the caches so they know if there has been a change to their contents; initially this is just a sequence counter that is ticked when there is an Add or Remove.
  2. We replace calls to notify in the per object code paths with a more general call; maybeNotify?, that is called unconditionally on the way out of the ResourceEventHandler code path. maybeNotify, as its name suggests will not notify if the internal sequence counter has not updated since the last maybeNotify.
  3. When caches can choose to not notify, we can wrap the 4 caches in a single meta maybeNotify call and move that higher up the stack to the buffer layer so the sequence is; receive notification off channel, process then call maybe notify which will notify watchers for all affected caches.
  4. We can then implement delayed notifications by running timer that starts when an update arrives at the buffer and is reset when another notification arrives. In that way we can have a small update window to batch updates if there is data in the buffer. This could probably be simulated with a call to len(buffer) after the update has been processed to see if there is another update immediately following the one we just updated.

davecheney avatar Apr 05 '18 02:04 davecheney

I plan to work on this for 0.6

davecheney avatar Apr 05 '18 02:04 davecheney

Not for 0.6, moving to the backlog.

davecheney avatar Jun 07 '18 05:06 davecheney

Related to #499. Scheduling to 0.14 to resolve by 1.0.0.beta.1

davecheney avatar Jun 18 '19 08:06 davecheney