
feature request: foreach

Open yottahmd opened this issue 1 year ago • 2 comments

I'd like to add a feature that splits workflow execution into file-level units. When a path is specified in the foreach field of a step, a subworkflow will be generated for each file matching that path. This will enable re-execution at the file level and simplify workflow management.

Example:

steps:
  - name: clean
    command: rm ./intermediate/*
  - name: create files
    command: create.py -o ./intermediate
    depends:
      - clean
  - name: process file
    foreach:
      files: ./intermediate/*
    command: process.py $FILE
    depends:
      - create files

Example that calls a subworkflow:

steps:
  - name: clean
    command: rm ./intermediate/*
  - name: create files
    command: create.py -o ./intermediate
    depends:
      - clean
  - name: process file
    foreach:
      files: ./intermediate/*
    command: run foo-workflow --params="$FILE"
    depends:
      - create files
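
The intended semantics of the two examples above can be sketched in Python. This is only an illustration of the proposal, not dagu's implementation: the helper names `expand_foreach` and `run_foreach` are hypothetical, and the sketch assumes that each matching file produces exactly one execution unit, with the file's path exposed to the command as the `FILE` environment variable.

```python
import glob
import os
import subprocess


def expand_foreach(pattern, command):
    """Return one (command, env) unit per file matching `pattern`.

    Hypothetical helper: it mirrors the proposed `foreach.files` field
    and is not part of dagu.
    """
    # sorted() gives a deterministic order; glob itself guarantees none.
    return [(command, {"FILE": path}) for path in sorted(glob.glob(pattern))]


def run_foreach(pattern, command):
    """Run `command` once per matching file and map each path to its exit code."""
    results = {}
    for cmd, env in expand_foreach(pattern, command):
        # The shell expands $FILE from the per-unit environment.
        proc = subprocess.run(cmd, shell=True, env={**os.environ, **env})
        results[env["FILE"]] = proc.returncode
    return results
```

Under these assumptions, `run_foreach("./intermediate/*", "process.py $FILE")` would invoke the command once per file. Keeping a separate exit code per path is what would make file-level re-execution possible: only the paths with a non-zero code need to run again.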

yottahmd avatar Apr 08 '24 11:04 yottahmd

Hi @yohamta! Can I try to work on this issue? Q: I want to know whether the $FILE value in command: run foo-workflow --params="$FILE" refers to the paths of all files that match files: ./intermediate/*. Q: Does the whole feature mean that, for each file that meets the conditions specified in:

foreach:
  files: ./intermediate/*
command: run foo-workflow --params="$FILE"
depends:
  - create files

the command will be executed once per file?

liooooo29 avatar Aug 20 '24 08:08 liooooo29

Hi @halalala222, thank you very much for your interest again! Sorry, but this might be another half-baked issue, as there are more things to consider, such as visualization, how to implement retries (for a subset of the foreach items), persistence, and parallel execution of the subworkflows. I think we would have to make extensive changes to implement this feature. Thank you again for your interest and contributions!

yottahmd avatar Aug 20 '24 11:08 yottahmd