Zero Downtime Projection Rebuilds

Open jeremydmiller opened this issue 3 years ago • 3 comments

This is a big request, and there's plenty to talk about.

Some thoughts:

  • I think we're gonna have to introduce some sort of projection versioning and tracking projection versions in the database
  • Needs to be triggered as part of the normal deployment. For example, Marten could detect on startup that there's a new version of a projection and trigger a rebuild if that version hasn't already been rebuilt.
  • If we can specify that the Marten user has rights to create new schema objects, we could rebuild the projection in parallel with the current app in a different schema, then copy the data over to the main schema when the rebuilding projection gets close.
  • Probably dependent upon some of the fancier leader election work in the async daemon
  • For aggregate projections, maybe this could be sped up by building aggregate by aggregate to minimize the number of database reads and writes compared to just reading the events in the order they come in.
  • For aggregations across streams, maybe you try to label events upfront as to what aggregate they belong to, then rebuild by aggregation
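The "aggregate by aggregate" idea from the list above can be illustrated with a small, language-neutral sketch. It is not Marten code: `Event`, `apply`, and the dict standing in for the projection table are all hypothetical, and the sketch assumes every event can be labeled with the aggregate it belongs to (here, `stream_id`). The in-order rebuild touches storage for every event, while the grouped rebuild folds each aggregate in memory and writes it once.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event shape; `stream_id` labels the owning aggregate."""
    sequence: int
    stream_id: str
    kind: str
    amount: int


def apply(state, event):
    """Hypothetical fold: keeps a running balance per aggregate."""
    balance = state or 0
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    return balance


def rebuild_in_event_order(events, store):
    """Naive rebuild: one read and one write against `store` per event."""
    io_operations = 0
    for event in sorted(events, key=lambda e: e.sequence):
        state = store.get(event.stream_id)  # read current document
        io_operations += 1
        store[event.stream_id] = apply(state, event)  # write it back
        io_operations += 1
    return io_operations


def rebuild_by_aggregate(events, store):
    """Grouped rebuild: fold each aggregate in memory, then write it once."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e.sequence):
        grouped[event.stream_id].append(event)

    writes = 0
    for stream_id, stream_events in grouped.items():
        state = None
        for event in stream_events:
            state = apply(state, event)
        store[stream_id] = state  # single write per aggregate
        writes += 1
    return writes
```

Both functions produce the same projected state; the difference is only in how often storage is touched. A real implementation would also have to page through events rather than hold them all in memory.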

jeremydmiller avatar Jan 23 '21 19:01 jeremydmiller

From my perspective and field experience with mission critical enterprise applications, the following should be possible:

  • Rebuild all projections at once (helpful for non-prod environments)
  • Rebuild individual projections
  • Rebuild of a projection should happen in a background process whose resource consumption can be controlled/limited
  • If using the technique of creating a temporary new projection in parallel to the old existing one (which seems reasonable), then this should have no side effects on the existing (old) projection
  • Switching out the old projection for the new one should be a simple "atomic" operation causing no downtime. E.g. think of a rolling update of the service consuming the projection: while the old instances of the service are still consuming from the old projection, the new instances should already consume from the new projection
  • Account for the possibility of massive amounts of events in the event store
  • Maybe some filtering of the events used for the rebuild should be possible (e.g. do not consume archived streams, etc.)
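As a rough illustration of the "atomic" switch-over requested above, here is a minimal sketch that rebuilds into a side table and then swaps it in with renames inside a single transaction. It uses SQLite from the Python standard library purely as a stand-in for the real database; the function name `swap_projection` and the table names are hypothetical, and nothing here reflects how Marten itself performs the swap.

```python
import sqlite3


def swap_projection(conn, live, rebuilt):
    """Replace table `live` with table `rebuilt` in one transaction.

    Assumes `conn` was opened with isolation_level=None so that
    transactions are controlled explicitly. Table names are trusted
    identifiers in this sketch, not user input.
    """
    retired = f"{live}_retired"
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Move the old projection out of the way, promote the rebuilt one,
        # then discard the old data. Readers see either the old table or
        # the new one, never a missing table.
        conn.execute(f'ALTER TABLE "{live}" RENAME TO "{retired}"')
        conn.execute(f'ALTER TABLE "{rebuilt}" RENAME TO "{live}"')
        conn.execute(f'DROP TABLE "{retired}"')
        conn.execute("COMMIT")
    except Exception:
        # Any failure leaves the old projection untouched.
        conn.execute("ROLLBACK")
        raise
```

This variant drops the old table immediately. For the rolling-update case described above, where old service instances must keep reading the old projection for a while, both versions would need to stay available under distinct names until the last old instance is gone.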

gnschenker avatar Oct 26 '22 06:10 gnschenker

I just ran into this existing issue while pondering what happens when you start up an application while projections aren't fully rebuilt yet. For inline projections it would be great if Marten could check during startup whether a projection needs to be rebuilt and, if so, delay startup until the rebuild has finished. Manually tracking which projections require a rebuild is sometimes difficult. My current solution is to keep a versioning table in the database and manually apply versioning to my projections.
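The manual approach described here, a version table compared against the versions declared in code, might look roughly like the sketch below. Everything in it is hypothetical: the two dicts stand in for "versions declared in code" and "versions recorded in the database", and `rebuild` is whatever callback actually performs the rebuild. It is not Marten API.

```python
def projections_needing_rebuild(code_versions, stored_versions):
    """Return the names of projections whose declared version differs
    from the recorded one, including projections never recorded."""
    return sorted(
        name
        for name, version in code_versions.items()
        if stored_versions.get(name) != version
    )


def ensure_projections_current(code_versions, stored_versions, rebuild):
    """Blocking startup check: rebuild each stale projection, then record
    its new version so the next startup skips it."""
    rebuilt = []
    for name in projections_needing_rebuild(code_versions, stored_versions):
        rebuild(name)  # hypothetical callback doing the real work
        stored_versions[name] = code_versions[name]
        rebuilt.append(name)
    return rebuilt
```

Recording the version only after `rebuild` returns means a crash mid-rebuild causes the rebuild to be retried on the next startup rather than silently skipped.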

Blackclaws avatar Dec 12 '23 11:12 Blackclaws

I'm declaring success for now

jeremydmiller avatar Feb 20 '24 15:02 jeremydmiller