Decompose lineage ingestion and search index update processes
Problem
- At the moment, the Consumer API relies on certain redundant information that is stored in the Progress and DataSource nodes. This makes the Producer services logically dependent on the Consumer services, which is undesirable.
- Also, the info stored in the DataSource node is transient and is updated every time a new adjacent edge is created. This update operation is not covered by the application-level transaction and must be avoided.
Proposal
Decouple the Consumer-related parts of the lineage persistence process into a separate service, following the CQRS pattern. That is, the second service would listen for and react to events produced by the Producer service, such as "execution plan inserted", "execution event inserted", etc. The events would be sent at the very last stage of the Producer transaction and delivered in an "at-least-once" manner.
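With "at-least-once" delivery the same event may arrive more than once, so the consuming service would likely need to handle events idempotently. Below is a minimal in-memory sketch of that flow, not an actual Spline implementation. All names (`LineageEvent`, `InMemoryQueue`, `SearchIndexUpdater`, `insert_execution_plan`) are hypothetical, and the queue only stands in for whatever durable transport ends up being chosen.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEvent:
    event_id: str    # unique per produced event; used for de-duplication
    kind: str        # e.g. "execution-plan-inserted"
    entity_key: str  # key of the inserted plan or execution event


class InMemoryQueue:
    """Hypothetical stand-in for a durable queue with at-least-once delivery."""

    def __init__(self):
        self._pending = deque()

    def publish(self, event):
        self._pending.append(event)

    def poll(self):
        # The event stays in the queue until it is acknowledged,
        # so a consumer that crashes before ack() sees it again.
        return self._pending[0] if self._pending else None

    def ack(self, event):
        if self._pending and self._pending[0] == event:
            self._pending.popleft()


class SearchIndexUpdater:
    """Hypothetical Consumer-side service that reacts to Producer events."""

    def __init__(self):
        self.processed_ids = set()
        self.index = {}

    def handle(self, event):
        # Idempotent handling: a redelivered event is ignored.
        if event.event_id in self.processed_ids:
            return False
        self.index.setdefault(event.kind, []).append(event.entity_key)
        self.processed_ids.add(event.event_id)
        return True


def insert_execution_plan(store, queue, plan_key):
    # Stands in for the body of the Producer transaction.
    store.append(plan_key)
    # The event is emitted as the very last step of the transaction.
    queue.publish(
        LineageEvent(f"plan-{plan_key}", "execution-plan-inserted", plan_key)
    )
```

In this sketch the Producer knows nothing about the search index: it only publishes an event, and the Consumer-side updater owns its own derived data.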
Possible implementation
ArangoDB queues