Minimalistic production deployment setup / config

Open nickvcb opened this issue 5 years ago • 4 comments

What is the recommended Akka cluster setup for deploying Baker in production if scalability is not an immediate requirement?

From discussion with @nikolakasev, my understanding is:

  1. Journal used to store the process events: the choice is between in-memory and a backing store like Cassandra. An in-memory journal implies that the process events are not persisted between runs.

"A single node is never a good idea, and you’d want at least three (majority rule)."

  2. Cluster node/JVM count: at least three (majority rule)
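
The arithmetic behind the "majority rule" figure can be sketched as follows. This is only an illustration, assuming "majority" means a strict majority of the cluster members; the function names are hypothetical and are not part of Baker or Akka.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can be lost while a majority still remains."""
    return cluster_size - majority(cluster_size)


# Compare a few cluster sizes: size, majority needed, failures tolerated.
for n in (1, 2, 3, 5):
    print(n, majority(n), tolerated_failures(n))
```

Under this assumption, one or two nodes tolerate no failures, and three is the smallest size that can lose a node and still keep a majority.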

Please advise if correct and if there are more considerations.

/cc @SemanticBeeng

nickvcb avatar Jul 16 '19 08:07 nickvcb

In my opinion, the two main reasons to run a cluster are scalability and resilience. If you don't require scalability (though you get it almost by default, since you only need to add new nodes) but you still require resilience (if nodes go down, the recipe instances self-heal by respawning on another node), then you still need a backing store like Cassandra; otherwise recipe instance "rehydration" cannot happen.

If you don't even need that kind of resilience, then I would recommend just running on one machine with the default local configuration: no Akka cluster and only local storage, the same configuration that you probably have for your tests.

VledicFranco avatar Jul 16 '19 09:07 VledicFranco

The Baker 3.0 documentation page (which we are working on right now) will have more about this :)

VledicFranco avatar Jul 16 '19 09:07 VledicFranco

"no Akka cluster and only local storage"

In the platform I am currently building, we rely on Baker (Akka persistence) for distributed process state management:

  • resuming execution of a process from any node in the deployment
  • reading the Akka persistence journal from any node with "persistence query", etc.

For that we need a cluster setup, correct? In my understanding, the "single writer principle" is necessary for (and used by) Baker and needs a cluster-based setup.

If I am wrong, please advise when a cluster becomes necessary from the standpoint of distributed state management.

SemanticBeeng avatar Jul 16 '19 12:07 SemanticBeeng

You are completely correct. Hence the recommendation is to configure journaling to use Cassandra. It is the store we've used for production systems, and it has worked great.

VledicFranco avatar Jul 24 '19 10:07 VledicFranco