kedro-airflow: Extend grouping strategies

Open ankatiyar opened this issue 9 months ago • 1 comments

Description

https://github.com/kedro-org/kedro/issues/3094 lists a number of pain points users experience when deploying their Kedro projects to MLOps platforms. By default, kedro-airflow maps each Kedro node 1:1 to an Airflow task.

#241 added the --group-by-memory flag to make it possible to group nodes that share MemoryDatasets between them into one airflow task.

This ticket proposes extending the grouping strategies offered by kedro-airflow. Some strategies we can consider:

  • by pipeline
  • by tags (https://getindata.com/blog/deploying-kedro-pipelines-gcp-composer-airflow-node-grouping-mlflow/ written by @Lasica)
  • by namespace(?)

Suggestion

  • Change the design of --group-by-memory to an option that takes a value, something like --grouping-strategy=<nodes/pipeline/memory> or --group-by=<>. This will make it easy for us to add grouping strategies in the future, depending on what users actually want or need.
  • Gather user input on what grouping strategies would be useful
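
To make the first suggestion concrete, here is a minimal sketch of how a single strategy option could dispatch to pluggable grouping functions. It does not use the kedro or kedro-airflow APIs: `Node`, `STRATEGIES`, `group_nodes` and the three strategy functions are hypothetical names invented for illustration, and using the alphabetically first tag as the group key is an assumption rather than a proposed rule.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kedro node (not kedro.pipeline.node.Node)."""
    name: str
    tags: frozenset = frozenset()
    namespace: Optional[str] = None


Groups = Dict[str, List[str]]  # task name -> names of the nodes it runs


def group_by_nodes(nodes: List[Node]) -> Groups:
    # One task per node, i.e. the 1:1 mapping described above.
    return {n.name: [n.name] for n in nodes}


def group_by_namespace(nodes: List[Node]) -> Groups:
    # Nodes sharing a namespace share a task; others keep their own task.
    groups: Groups = defaultdict(list)
    for n in nodes:
        groups[n.namespace or n.name].append(n.name)
    return dict(groups)


def group_by_tags(nodes: List[Node]) -> Groups:
    # Assumption: a node with several tags is grouped under the
    # alphabetically first one; untagged nodes keep their own task.
    groups: Groups = defaultdict(list)
    for n in nodes:
        groups[min(n.tags) if n.tags else n.name].append(n.name)
    return dict(groups)


# Registry: adding a strategy later means adding one entry here.
STRATEGIES: Dict[str, Callable[[List[Node]], Groups]] = {
    "nodes": group_by_nodes,
    "namespace": group_by_namespace,
    "tags": group_by_tags,
}


def group_nodes(nodes: List[Node], strategy: str = "nodes") -> Groups:
    """Resolve the value of a hypothetical --grouping-strategy option."""
    try:
        return STRATEGIES[strategy](nodes)
    except KeyError:
        raise ValueError(
            f"Unknown strategy {strategy!r}; choose from {sorted(STRATEGIES)}"
        ) from None


if __name__ == "__main__":
    example_nodes = [
        Node("split", namespace="prep"),
        Node("clean", namespace="prep"),
        Node("train", tags=frozenset({"ml"})),
    ]
    for name in STRATEGIES:
        print(name, group_nodes(example_nodes, name))
```

With a registry like this, the existing memory-based grouping and a pipeline-based one would each be a single additional entry, and the CLI would only need to validate the option value against the registry keys.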

ankatiyar, May 09 '24 15:05