Workflow Snapshotting

Open stuartc opened this issue 1 year ago • 2 comments

Goals

Whenever a workflow is changed, those changes are recorded as a snapshot. At the point the workflow is run, the snapshot that is current at that time is associated with the run.

We want the Workflow and its dependants (Job, Trigger, etc.) to always be accessible via their original UUIDs, defaulting to the most recent version of the Workflow.

However, when looking at a Run that was executed a while ago, any artefact it used must be preserved as it was at the time. For example, if a Job was later deleted or renamed, the Job as it existed for that Run is what should be presented in lists/tables/history.
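The goals so far can be sketched in a few lines. This is a language-neutral illustration in Python, not the Lightning implementation: `Snapshot`, `SnapshotStore` and `Run` are hypothetical names, and an in-memory dictionary stands in for the database. It shows a lookup by original UUID defaulting to the latest version, while an older run keeps seeing the job as it was when the run executed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Snapshot:
    """Immutable copy of a workflow at one point in time (hypothetical shape)."""
    workflow_id: str
    version: int
    jobs: Dict[str, str]  # job UUID -> job name


@dataclass
class SnapshotStore:
    """In-memory stand-in for the real persistence layer."""
    history: Dict[str, List[Snapshot]] = field(default_factory=dict)

    def record_change(self, workflow_id: str, jobs: Dict[str, str]) -> Snapshot:
        # Every change appends a new snapshot; older ones are never mutated.
        versions = self.history.setdefault(workflow_id, [])
        snapshot = Snapshot(workflow_id, len(versions) + 1, dict(jobs))
        versions.append(snapshot)
        return snapshot

    def latest(self, workflow_id: str) -> Snapshot:
        # Looking a workflow up by its original UUID defaults to the newest version.
        return self.history[workflow_id][-1]


@dataclass(frozen=True)
class Run:
    """A run is pinned to whichever snapshot was current when it started."""
    snapshot: Snapshot


store = SnapshotStore()
store.record_change("wf-1", {"job-a": "Fetch patients"})
old_run = Run(snapshot=store.latest("wf-1"))

# The job is renamed after the run has already happened.
store.record_change("wf-1", {"job-a": "Fetch records"})

print(old_run.snapshot.jobs["job-a"])      # name as it was when the run executed
print(store.latest("wf-1").jobs["job-a"])  # name as it is now
```

The run holds a reference to a snapshot rather than to the live workflow, which is what keeps history stable when jobs are renamed or deleted later.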

Special attention must be given to the relationship between Jobs and Credentials as Credentials are changeable outside of Jobs - and may be associated with more than one Job (or even different workflows).

Design 1

Using database triggers and transaction tracking, we capture all operations against any workflow-related table. Once the transaction has been committed, we locate all changes recorded against any models during that transaction.

Using the captured changes we construct a map (or struct) resembling the WorkflowParams (if it differs that's not an issue), and persist that to a new model called WorkflowVersion.
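As a rough sketch of that construction step, assuming the triggers write one change record per operation tagged with a transaction id: the record shape (`txid`, `table`, `op`, `row`) and the function `apply_transaction` are made up for illustration, and the body layout only loosely resembles WorkflowParams. The changes from one transaction are applied on top of the previous version body so the result is a complete version rather than just a diff.

```python
from typing import Any, Dict, List

# Hypothetical rows written by the database triggers, one per operation.
captured_changes: List[Dict[str, Any]] = [
    {"txid": 7, "table": "workflows", "op": "update",
     "row": {"id": "wf-1", "name": "Sync v2"}},
    {"txid": 7, "table": "jobs", "op": "insert",
     "row": {"id": "job-a", "name": "Fetch"}},
    {"txid": 7, "table": "jobs", "op": "delete",
     "row": {"id": "job-b", "name": "Old step"}},
    {"txid": 8, "table": "jobs", "op": "insert",
     "row": {"id": "job-z", "name": "Other transaction"}},
]


def apply_transaction(previous_body: Dict[str, Any],
                      changes: List[Dict[str, Any]],
                      txid: int) -> Dict[str, Any]:
    """Build the next version body from the previous one plus one transaction."""
    body = {
        "workflow": dict(previous_body.get("workflow", {})),
        "jobs": dict(previous_body.get("jobs", {})),
        "triggers": dict(previous_body.get("triggers", {})),
    }
    for change in changes:
        if change["txid"] != txid:
            continue  # belongs to a different transaction
        row = change["row"]
        if change["table"] == "workflows":
            body["workflow"] = row
        elif change["op"] == "delete":
            body[change["table"]].pop(row["id"], None)
        else:  # insert or update
            body[change["table"]][row["id"]] = row
    return body


previous = {
    "workflow": {"id": "wf-1", "name": "Sync"},
    "jobs": {"job-b": {"id": "job-b", "name": "Old step"},
             "job-c": {"id": "job-c", "name": "Load"}},
    "triggers": {"trg-1": {"id": "trg-1", "type": "webhook"}},
}

version_body = apply_transaction(previous, captured_changes, txid=7)
print(sorted(version_body["jobs"]))
```

Unchanged records (here `job-c` and `trg-1`) are carried over from the previous body, and the previous body itself is left untouched.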

WorkOrders are then associated with this new model instead of the Workflow:

  • workflow_id -> workflow_version_id
  • trigger_id -> points at a trigger inside the version body

Attempts/Runs have similar changes:

  • starting_trigger_id -> points at a trigger inside the version body
  • starting_job_id -> points at a job inside the version body

Changes for Steps:

  • job_id -> points at a job inside the version body
  • credential_id -> points at a versioned credential

image

By persisting the Workflow as a 'complete' version (i.e. a map/embedded schema), we lose the ability to maintain referential integrity between items inside a workflow and their dependants, such as a Job belonging to an Attempt/Run. If this particular association is broken, it does not break the system in ways that affect users immediately, but it may still be a problem that should be dealt with carefully.
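To make the trade-off concrete, here is a minimal sketch of what "points at a job inside the version body" could look like without a foreign key. The data shapes and the helper `check_step` are assumptions for illustration only. Because the database can no longer reject a dangling reference, the check has to live in application code.

```python
from typing import Any, Dict, List

# Hypothetical persisted versions, keyed by workflow_version_id.
versions: Dict[str, Dict[str, Any]] = {
    "wv-1": {
        "jobs": {"job-a": {"name": "Fetch"}},
        "triggers": {"trg-1": {"type": "webhook"}},
    },
}


def check_step(step: Dict[str, str],
               versions: Dict[str, Dict[str, Any]]) -> List[str]:
    """Application-level stand-in for the foreign key the database no longer enforces."""
    version = versions.get(step["workflow_version_id"])
    if version is None:
        return ["unknown workflow version " + step["workflow_version_id"]]
    problems = []
    if step["job_id"] not in version["jobs"]:
        problems.append("job " + step["job_id"] + " not found in version body")
    return problems


good_step = {"workflow_version_id": "wv-1", "job_id": "job-a"}
dangling_step = {"workflow_version_id": "wv-1", "job_id": "job-missing"}

print(check_step(good_step, versions))
print(check_step(dangling_step, versions))
```

A dangling reference like the second one would go unnoticed by the database, which is why the integration points need mapping before committing to this design.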

This design still needs to have its integration points mapped out; we need to find out how many parts of the application need to change in order to facilitate this.

Design 2

image

Using a concept of Snapshots and Revisions, where a Revision is a 'version' of a given object like a Job or a Workflow and a Snapshot is a grouping of a given set of Revisions, we:

  • Alleviate concerns about duplication, since a snapshot only creates a new revision for the object that changed.
  • Maintain the upside of Design 1, where each snapshot is self-contained.
  • Keep the ability to use foreign keys for referential integrity.
  • It fits well with Ecto.
  • There will be more effort, however, in querying and inserts because we will rely on join tables for every object.
  • It is (potentially unnecessarily) more featureful, in that all revisions to any object can be traced back to their original use.
    • i.e. if a Job was changed on snapshot 1, and we're on snapshot 45 now, we can tell that the job has been used/not modified since then.
    • this level of granularity is a side-effect of this design that may not be necessary.
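A minimal sketch of the Snapshot/Revision idea, with an in-memory list standing in for the join table. `RevisionStore` and its methods are hypothetical, and deletion of objects is left out for brevity. A new snapshot reuses the previous snapshot's revisions and only creates a new revision for the object that changed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Revision:
    id: int
    object_id: str  # original UUID of the Job/Trigger/Workflow
    data: dict


class RevisionStore:
    def __init__(self) -> None:
        self.revisions: Dict[int, Revision] = {}
        # Join table rows: (snapshot_id, revision_id).
        self.snapshot_revisions: List[Tuple[int, int]] = []
        self._snapshot_count = 0

    def revisions_in(self, snapshot_id: int) -> Dict[str, Revision]:
        """All revisions linked to a snapshot, keyed by original object UUID."""
        return {
            self.revisions[rev_id].object_id: self.revisions[rev_id]
            for snap_id, rev_id in self.snapshot_revisions
            if snap_id == snapshot_id
        }

    def create_snapshot(self, changed: Dict[str, dict],
                        previous: Optional[int] = None) -> int:
        self._snapshot_count += 1
        snapshot_id = self._snapshot_count
        carried = self.revisions_in(previous) if previous is not None else {}
        # Unchanged objects: link the existing revision, no duplication.
        for object_id, revision in carried.items():
            if object_id not in changed:
                self.snapshot_revisions.append((snapshot_id, revision.id))
        # Changed objects: create a new revision and link it.
        for object_id, data in changed.items():
            revision = Revision(len(self.revisions) + 1, object_id, data)
            self.revisions[revision.id] = revision
            self.snapshot_revisions.append((snapshot_id, revision.id))
        return snapshot_id

    def snapshots_using(self, revision_id: int) -> List[int]:
        """Trace a revision back to every snapshot that uses it."""
        return sorted(s for s, r in self.snapshot_revisions if r == revision_id)


rev_store = RevisionStore()
first = rev_store.create_snapshot(
    {"job-a": {"name": "Fetch"}, "job-b": {"name": "Load"}})
second = rev_store.create_snapshot({"job-b": {"name": "Load v2"}}, previous=first)

job_a = rev_store.revisions_in(second)["job-a"]
print(job_a.id, rev_store.snapshots_using(job_a.id))
```

Only three revisions exist after two snapshots, and `snapshots_using` gives the traceability described in the last bullet. The cost is visible too: every read of a snapshot goes through the join rows.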

This design still needs more thought put into the database schema.

Issues

  • [x] #1772
  • [x] #1821
  • [x] #1822
  • [x] #1823
  • [x] #1826
  • [x] #1824
  • [x] #1825
  • [x] #1843
  • [ ] #1827
  • [ ] #1832
  • [ ] #2239

stuartc avatar Jan 29 '24 10:01 stuartc

@aleksa-krolls I noticed you completed this issue but there are a few issues pending. Was it a mistake?

christad92 avatar May 08 '24 11:05 christad92

@christad92 sorry! didn't mean to do that... I was cleaning up old epics on the Zenhub board to clear my view there, and must have inadvertently closed some issues by trying to remove the epics from my Zenhub view. Will be really careful not to do that

aleksa-krolls avatar May 08 '24 11:05 aleksa-krolls