
Detect recurring pipelines

Open • pnadolny13 opened this issue 2 years ago • 1 comment

As part of the PxP deep dive I explored the churned projects and whether they had recurring pipelines. Detecting this would let us differentiate users who were only exploring locally from churned users who had been running in prod on a recurring basis.

Attempt to tag a pipeline as recurring or scheduled based on these criteria, and potentially others:

  • The pipeline is executed more than x times a week for at least 2 weeks.
  • The pipeline is part of a schedule detected by a schedule run event. This only applies to Airflow DAG generator users.
  • If there's a gap of more than 7(?) days between executions, the pipeline would stop being recurring. Do we want this? We might want a pipeline to be tagged as recurring and then never untagged. Otherwise, if a project churns, we would have untagged all of its recurring pipelines, which would make it hard to analyze the project's profile. Maybe we add a recurring flag and an is_active flag to solve that.
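To make the criteria above concrete, here is a rough Python sketch of the two-flag idea. It is only an illustration under several assumptions that are not settled in this issue: the function and class names are hypothetical, `min_runs_per_week=3` is a placeholder for the undecided "x", weeks are Monday-based calendar weeks, and "at least 2 weeks" is read as 2 consecutive weeks. The `recurring` flag is sticky, while `is_active` only reflects whether the last run falls within the gap threshold.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass
class PipelineFlags:
    recurring: bool  # sticky: depends on history only, never on recency
    is_active: bool  # recurring AND last run within the gap threshold


def week_start(day: date) -> date:
    """Return the Monday of the calendar week containing `day`."""
    return day - timedelta(days=day.weekday())


def tag_pipeline(
    executions: list[datetime],
    as_of: datetime,
    min_runs_per_week: int = 3,  # the issue's "x"; placeholder value
    min_weeks: int = 2,
    max_gap_days: int = 7,       # the issue's ">7?" gap; still an open question
    in_schedule: bool = False,   # True if seen in a schedule run event
) -> PipelineFlags:
    # Count runs per calendar week, keeping weeks with MORE than x runs.
    runs_per_week = Counter(week_start(ts.date()) for ts in executions)
    busy_weeks = sorted(
        week for week, runs in runs_per_week.items() if runs > min_runs_per_week
    )

    # Find the longest streak of consecutive busy weeks.
    longest = streak = 0
    previous = None
    for week in busy_weeks:
        consecutive = previous is not None and week - previous == timedelta(weeks=1)
        streak = streak + 1 if consecutive else 1
        longest = max(longest, streak)
        previous = week

    recurring = in_schedule or longest >= min_weeks

    # Activity is judged only by the gap between the last run and `as_of`.
    last_run = max(executions, default=None)
    is_active = (
        recurring
        and last_run is not None
        and as_of - last_run <= timedelta(days=max_gap_days)
    )
    return PipelineFlags(recurring=recurring, is_active=is_active)


if __name__ == "__main__":
    # Four runs in each of two consecutive weeks (Aug 1 2022 is a Monday).
    runs = [datetime(2022, 8, d, 6) for d in (1, 2, 3, 4, 8, 9, 10, 11)]
    print(tag_pipeline(runs, as_of=datetime(2022, 8, 12)))
    print(tag_pipeline(runs, as_of=datetime(2022, 9, 30)))
```

With this split, a churned project keeps `recurring=True` on its pipelines and only `is_active` flips to false, so its profile stays analyzable. In the actual project this logic would presumably be expressed in the warehouse models rather than in Python.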

pnadolny13 • Aug 10 '22 14:08

Related to https://github.com/meltano/squared/issues/296

pnadolny13 • Aug 23 '22 14:08