Neos.EventSourcing WIP: Snapshot create and restore support for projections

[x] interface for createSnapshot() and restoreSnapshot()
[x] CLI commands for create snapshot and restore snapshot
[x] trait for Doctrine based Projectors helping with creating and restoring of snapshots
[ ] persist Snapshot objects so that ProjectionManager and others can look up which snapshots (should) exist and retrieve metadata
[ ] implemente deleting of snapshots
[ ] provide unit tests for all important parts

Ideas for later:

automatically create snapshots after each x events, configurable per projection
automatic clean up of old / not needed snapshots
create / restore snapshots for multiple projections at once
dump snapshots to and restore from file system / cloud storage, for quick disaster recovery

Aug 21 '20 12:08 robertlemke

Instructions:

choose a Doctrine-based projector
add implements SnapshotAwareProjectorInterface
implement the getTableNames() method
adjust the hardcoded Snapshot object in ProjectionManager->restoreSnapshot()
./flow projection:replay --maximum-sequence-number xxx yourprojection
./flow projection:createsnapshot yourprojection
see the new table(s) in the database, prefixed with "snap_"
./flow projection:catchup
check that your projection is up to date. See also the applied events table
./flow projection:restoresnapshot uuid-of-your-snapshot
your projection should be back to the xx sequence number

Aug 21 '20 12:08 robertlemke

Sorry for being the sceptic voice.. I probably over-react a little but I removed a lot of code from this package in order to keep it focused and scaleable so please bear with me and hear me out:

In event sourcing snapshotting is usually applied to aggregates as performance optimization, i.e. in order to speed up reconstitution that is needed for every mutation. It adds a layer of complexity and potential bugs though, so it is generally advised to only be used if there are no better alternatives. From http://cqrs.nu/Faq/event-sourcing:

An optimization where a snapshot of the aggregate's state is also saved (conceptually) in the event queue every so often, so that event application can start from the snapshot instead of from scratch. This can speed things up. [...]

Snapshotting has a number of drawbacks related to re-introducing current state in the database. Rather than assume you will need it, start without snapshotting, and add it only after profiling shows you that it will help.

Having said that I still think that it can be quite a useful tool but I wonder if the framework is a good place for the implementation. I.e. it's totally possible to implement aggregate snapshots today like:

<?php
class SomeAggregateRepository {

  public function load(SomeAggregateId $id): SomeAggregate
  {
    $snapshot = $this->loadSnapshot($id);
    $eventStream = $this->eventStore->load($this->streamName($id), $snapshot->sequenceNumber());
    return SomeAggregate::reconstituteFromSnapshotAndEventStream($snapshot, $eventStream);
  }

  public function save(SomeAggregate $aggregate): void
  {
    $this->eventStore->commit($this->streamName($id), $aggregate->pullUncommittedEvents(), $aggregate->getReconstitutionVersion());
    if ($this->requiresSnapshot($aggregate)) {
      $this->saveSnapshot($aggregate);
    }
  }

  // ...
}

We should add a proper example to the documentation and/or provide a base implementation, but IMO this could be done in a separate package because most of the code will be specific to the aggregate implementation.

Now this PR is about snapshots for projections and – if I get it right – it serves a different scenario. Maybe you thought it all through but I got the feeling that we might need to discuss the scenario a bit more before we can agree on a solution.

A snapshot as described above doesn't make sense for projections IMO because the events are usually only applied once and the projection is only replayed if its implementation changed (which renders any previous snapshot invalid most probably). From looking at the code (and from what we discussed the other day) I think, this one is more about a backup mechanism that allows to restore a previous projection state for debugging purposes.

With #269 this should already be possible via

./flow projection:replay projection --maximumSequenceNumber 123

and this PR is "merely" there to speed up that process, right?

Aug 21 '20 14:08 bwaidelich

Sorry if the above didn't sound constructive, I was mainly trying to turn my initial thoughts into writing.

TL;DR: I'm not sure about the new interfaces and traits and would prefer a first implementation in the affected model. E.g. for the event-sourced Content Repository two new CLI commands

./flow contentsubgraph:backup
./flow contentsubgraph:restore

..at least until we learned more about it :)

Aug 21 '20 15:08 bwaidelich

@bwaidelich that were exactly my thoughts when I saw the PR and thought about it :)

Aug 23 '20 12:08 albe

@bwaidelich thanks for your comments. I guess it's mostly a misunderstanding.

The motivation for this feature is mostly an optimization for debugging projections. It can be implemented as a separate package, if you like. It has nothing to do and does not aim for providing something like snapshots for aggregates.

Consider having an event store full of 1.000.000 events and trying to debug / improve a specific projection. You'll end up in a replay - check database - catch up - adjust code - replay -loop. In my actual case a replay of the respective projection takes a few minutes. Now, if I could create a snapshot at event 990.000 and reset to / catch from there, this would be a big time-saver.

If that is a too exotic feature, just let me know. @skurfuerst asked me to implement it for the ESCR development.

Aug 24 '20 07:08 robertlemke

diff --git a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
index 57b5196..80b6429 100644
--- a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
+++ b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Projection/GraphProjector.php
@@ -37,6 +37,8 @@ use Neos\EventSourcedContentRepository\Domain\Context\NodeAggregate\Event\RootNo
 use Neos\EventSourcedContentRepository\Domain\Context\NodeAggregate\NodeAggregateClassification;
 use Neos\EventSourcedContentRepository\Domain\ValueObject\SerializedPropertyValues;
 use Neos\EventSourcedContentRepository\Infrastructure\Projection\AbstractProcessedEventsAwareProjector;
+use Neos\EventSourcing\Projection\DoctrineProjectorSnapshotTrait;
+use Neos\EventSourcing\Projection\SnapshotAwareProjectorInterface;
 use Neos\Flow\Annotations as Flow;

 /**
@@ -44,11 +46,12 @@ use Neos\Flow\Annotations as Flow;
  *
  * @Flow\Scope("singleton")
  */
-class GraphProjector extends AbstractProcessedEventsAwareProjector
+class GraphProjector extends AbstractProcessedEventsAwareProjector implements SnapshotAwareProjectorInterface
 {
     use RestrictionRelations;
     use NodeRemoval;
     use NodeMove;
+    use DoctrineProjectorSnapshotTrait;

     const RELATION_DEFAULT_OFFSET = 128;

@@ -58,6 +61,16 @@ class GraphProjector extends AbstractProcessedEventsAwareProjector
      */
     protected $projectionContentGraph;

+    protected static function getTableNames(): array
+    {
+        return [
+            'neos_contentgraph_node',
+            'neos_contentgraph_hierarchyrelation',
+            'neos_contentgraph_referencerelation',
+            'neos_contentgraph_restrictionrelation'
+        ];
+    }
+
     /**
      * @throws \Throwable
      */
diff --git a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
index 1b247b1..b61dbd1 100644
--- a/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
+++ b/Neos.ContentGraph.DoctrineDbalAdapter/Classes/Domain/Repository/ContentGraph.php
@@ -169,7 +169,7 @@ final class ContentGraph implements ContentGraphInterface
      */
     public function findNodeAggregateByIdentifier(
         ContentStreamIdentifier $contentStreamIdentifier,
-        NodeAggregateIdentifier $nodeAggregateIdentifier
+            NodeAggregateIdentifier $nodeAggregateIdentifier
     ): ?NodeAggregate {
         $connection = $this->client->getConnection();

Jan 19 '21 12:01 skurfuerst

This one got stuck after I shared my concerns. Now, almost one year later, I still think that "snapshots" should be avoided if possible (and apparently I'm not the only one, see https://twitter.com/mat_mcloughlin/status/1418191418309451780) and certainly don't be a part of the core functionality.

Having said that, of course that's just my personal opinion and – more importantly – it doesn't mean that we should not make debugging large projections easier and faster! I'm just convinced that it is much easier to solve this for a concrete scenario rather than with a generic pattern.

btw: I shared the twitter link above not to "prove my point" (and it doesn't for all cases) but it contains an interesting idea: How can we prevent (too) long living streams?

Jul 23 '21 14:07 bwaidelich