Add a command that runs a list of commands in parallel and aggregates results
Input should be a CSV file with columns:
- name
- bash command to run
- output files to save
Run it with something like:
./kubeface-run-commands commands.csv --out-dir /path/to/results
Results should be written to the output directory as one directory per command (named after the command's name), with each directory holding the output files for that command.
Consider:
- Is there an existing standard interface for this kind of thing that should be supported? Perhaps GNU parallel?
- Other output formats that may be more convenient for common cases, like a flat directory of files if there is only one output file to collect per command
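To make the flat-directory idea concrete, here is a rough sketch of how a destination path could be chosen. The function name `destination` and the rule of reusing the output file's extension are assumptions for illustration, not a settled design.

```python
from pathlib import Path


def destination(out_dir, name, output_file, n_outputs):
    """Pick where one collected output file should be stored (hypothetical helper)."""
    source = Path(output_file)
    if n_outputs == 1:
        # Flat layout: out_dir/<name><ext> instead of out_dir/<name>/<file>
        return Path(out_dir) / f"{name}{source.suffix}"
    # Default layout: one directory per command, named after the command
    return Path(out_dir) / name / source.name
```

With a single output per command, everything lands directly in the output directory, which is easier to glob over afterwards.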
What would an example commands.csv look like, @timodonnell?
Thinking of something like this:
name,command,input_files,output_files
run1,wc -l $1 > result.txt,/path/to/some/text/file.txt,result.txt
run2,wc -l $1 > result.txt,/path/to/another/text/file.txt,result.txt
Can discuss in person
The result from running the above would be a directory with run1/result.txt and run2/result.txt
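For discussion, a minimal local sketch of that behavior in Python. It assumes the four columns above, that multiple files in one cell are separated by whitespace, and that each command runs in its own scratch directory; `run_one` and `run_commands` are made-up names, and nothing here touches Kubernetes.

```python
import csv
import shutil
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(row, out_dir):
    """Run one CSV row in a scratch directory and collect its output files."""
    dest = Path(out_dir) / row["name"]
    dest.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as work_dir:
        # Assumption: several files in one cell are separated by whitespace.
        inputs = (row.get("input_files") or "").split()
        # The extra "bash" fills $0, so the input files arrive as $1, $2, ...
        proc = subprocess.run(
            ["bash", "-c", row["command"], "bash", *inputs],
            cwd=work_dir,
        )
        if proc.returncode == 0:
            for rel_path in (row.get("output_files") or "").split():
                shutil.copy(Path(work_dir) / rel_path, dest / Path(rel_path).name)
    return row["name"], proc.returncode


def run_commands(csv_path, out_dir, max_workers=4):
    """Run every row of the CSV in parallel; return {name: exit code}."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(lambda row: run_one(row, out_dir), rows))
```

Threads are enough here because the real work happens in the child processes. Note that the csv module would require any command containing a comma to be quoted in the file.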
GNU parallel can do that for you:
https://www.gnu.org/software/parallel/parallel_tutorial.html#Remote-execution
I wouldn't try to implement this from scratch. I have been trying out different options to do this in a nice and tidy way, and I can't stress enough how many edge cases a ground-up solution has to deal with.
Assuming that you don't want to turn this into an advanced task scheduler project, I would just wrap parallel and call it a day ;)
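A rough sketch of what such a wrapper could look like, assuming the CSV layout proposed earlier in the thread. It relies on GNU parallel running each line of stdin as a command when no command is given, plus its `--jobs` and `--joblog` options; the function names are hypothetical. As a simplification, each command runs directly inside its result directory, so everything it writes is kept, not just the listed output files.

```python
import csv
import shlex
import subprocess
from pathlib import Path


def parallel_argv(jobs, joblog):
    """Build the GNU parallel invocation; commands are fed on stdin, one per line."""
    return ["parallel", "--jobs", str(jobs), "--joblog", str(joblog)]


def command_lines(csv_path, out_dir):
    """Turn each CSV row into one shell line that runs inside out_dir/<name>."""
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            target = shlex.quote(f"{out_dir}/{row['name']}")
            inputs = " ".join(
                shlex.quote(p) for p in (row.get("input_files") or "").split()
            )
            # bash -c '<command>' bash <inputs> makes the inputs available as $1, $2, ...
            inner = f"bash -c {shlex.quote(row['command'])} bash {inputs}".rstrip()
            yield f"mkdir -p {target} && cd {target} && {inner}"


def run_with_parallel(csv_path, out_dir, jobs=4):
    """Hand all command lines to GNU parallel and return its exit status."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    lines = "\n".join(command_lines(csv_path, out_dir)) + "\n"
    argv = parallel_argv(jobs, Path(out_dir) / "joblog.tsv")
    return subprocess.run(argv, input=lines, text=True).returncode
```

The job log gives a per-command record of exit codes and runtimes, which covers a good part of the "aggregates results" requirement without extra code.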