Parallel execution for ExecForEach and EachLine

Open gty929 opened this issue 3 years ago • 9 comments

Hi, this issue follows up on the discussion in #86. At present, script runs in a fully synchronous manner. In #34 and #59 people came up with some brilliant ideas for asynchronous pipeline streaming, but the designs and implementations seem too complicated, as @bitfield mentioned.

Here, I want to suggest a compromise -- adding methods EachLineConc() and ExecForEachConc(), which should have the same input and output interface as EachLine() and ExecForEach(), but enable parallel execution within the method. (E.g., script.Slice(make([]string, 5)).ExecForEachConc("sleep 1") should return in about 1 second, rather than 5 seconds.)

The ExecForEachConc() method shares its use cases with GNU parallel and the & operator in bash, for example:
https://blogs.sas.com/content/sgf/2021/04/14/using-shell-scripts-for-massively-parallel-processing/
https://unix.stackexchange.com/questions/103920/parallelize-a-bash-for-loop/103922
https://askubuntu.com/questions/431478/decompressing-multiple-files-at-once
https://superuser.com/questions/538164/how-many-instances-of-ffmpeg-commands-can-i-run-in-parallel/547340#547340

Apart from improving efficiency, I sometimes want to run programs concurrently for testing purposes. For example, two weeks ago I wrote a file transmission service, and I would like to test whether the receiver application behaves correctly when multiple files are sent to it at the same time. It would be nice if I could run something like: script.ListFiles("FILE*.in").ExecForEachConc("./wSender --ip 10.0.0.1 --port 8888 --file {{.}}").Stdout()

Note: it seems that we cannot directly append & to the commands in ExecForEach() for this purpose. script.Slice(make([]string, 5)).ExecForEach("sleep 1 &") returns instantly, because there is no 'wait'. Also, if I change the argument to "bash -c 'sleep 1' &" or "bash -c 'sleep 1 &'", the program will still run for 5 seconds.

gty929, Nov 20 '21 07:11
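
The bash idiom referred to in the note above is "start everything with &, then wait". As a point of comparison, here is a minimal sketch of that pattern using only the standard os/exec package rather than script. The helper name runAll is hypothetical, and the example assumes a Unix-like system with a sleep binary on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// runAll is a hypothetical helper, not part of script. It starts one
// process per argument list without waiting in between (the role "&"
// plays in bash), then waits for all of them (the role of "wait").
// It returns one error slot per command; nil means success.
func runAll(name string, argLists [][]string) []error {
	errs := make([]error, len(argLists))
	cmds := make([]*exec.Cmd, len(argLists))
	for i, args := range argLists {
		cmd := exec.Command(name, args...)
		if err := cmd.Start(); err != nil {
			errs[i] = err // could not start; nothing to wait for
			continue
		}
		cmds[i] = cmd
	}
	for i, cmd := range cmds {
		if cmd == nil {
			continue // failed to start; error already recorded
		}
		errs[i] = cmd.Wait()
	}
	return errs
}

func main() {
	// Five invocations of "sleep 1", all started before any is waited on.
	argLists := make([][]string, 5)
	for i := range argLists {
		argLists[i] = []string{"1"}
	}
	start := time.Now()
	errs := runAll("sleep", argLists)
	fmt.Println("errors:", errs)
	fmt.Println("elapsed:", time.Since(start).Round(100*time.Millisecond))
}
```

With five one-second sleeps, the elapsed time should be close to one second rather than five, which is the behaviour the proposed ExecForEachConc() is meant to provide.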

The implementation shouldn't be too hard. We just need to write EachLineConc() and let ExecForEachConc() call this new method. Here's the first implementation I thought of (which can surely be optimized further):

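// EachLineConc is meant to live in package script next to EachLine; it
// needs the "bufio", "strings", and "sync" imports.
func (p *Pipe) EachLineConc(process func(string, *strings.Builder)) *Pipe {
	if p == nil || p.Error() != nil {
		return p
	}
	// Read every input line up front so that the number of goroutines
	// is known before any of them starts.
	scanner := bufio.NewScanner(p.Reader)
	inputs := []string{}
	for scanner.Scan() {
		inputs = append(inputs, scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		p.SetError(err)
		return p
	}
	// Each goroutine writes only to its own slot in outputs, so the
	// results keep the input order without extra locking.
	outputs := make([]string, len(inputs))
	var latch sync.WaitGroup
	latch.Add(len(inputs))
	for index, input := range inputs {
		go func(index int, input string) {
			defer latch.Done()
			var output strings.Builder
			process(input, &output)
			outputs[index] = output.String()
		}(index, input)
	}
	latch.Wait()
	// process may have recorded an error on p (for example, through a
	// closure that calls p.SetError), so check again before continuing.
	if p.Error() != nil {
		return p
	}
	return Echo(strings.Join(outputs, ""))
}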

gty929, Nov 20 '21 07:11
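
The implementation above starts one goroutine per input line, which for ExecForEachConc() would mean one process per line. One possible refinement, offered only as a sketch and not as part of the proposal, is to cap the number of lines processed at once, much as GNU parallel does with its -j option. The helper eachLineBounded, the limit parameter, and the upperLine processor below are all hypothetical names, and the sketch works on a plain slice of lines rather than a script Pipe.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// eachLineBounded is a hypothetical helper, not part of script. It applies
// process to every line, running at most limit calls at the same time, and
// returns the concatenated outputs in input order.
func eachLineBounded(lines []string, limit int, process func(string, *strings.Builder)) string {
	if limit < 1 {
		limit = 1 // fall back to sequential processing
	}
	outputs := make([]string, len(lines))
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i, line := range lines {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are already busy
		go func(i int, line string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when finished
			var out strings.Builder
			process(line, &out)
			outputs[i] = out.String() // each goroutine owns exactly one slot
		}(i, line)
	}
	wg.Wait()
	return strings.Join(outputs, "")
}

// upperLine is an example processor with the same signature that
// EachLine-style callbacks use in the proposal.
func upperLine(line string, out *strings.Builder) {
	out.WriteString(strings.ToUpper(line))
	out.WriteString("\n")
}

func main() {
	lines := []string{"a", "b", "c", "d"}
	// At most two lines are processed concurrently; output order is
	// still a, b, c, d because results are stored by index.
	fmt.Print(eachLineBounded(lines, 2, upperLine))
}
```

Because results are stored by index, the output order does not depend on which goroutine finishes first; the limit only controls how many calls to process are in flight at a time.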

Great! Let's see if this issue gets some traction with others who want to write concurrent scripts, and see what they think of the proposed API.

bitfield, Nov 20 '21 11:11

Hi, I just found this project today and very much love the idea! I immediately thought of this feature while going through the README. Plus one for this.

ghetto-ch, Nov 29 '21 20:11

Great! Do you want to try and come up with a real-world program that would use these constructs?

bitfield, Nov 30 '21 11:11

Love the ability this provides.

gedw99, May 11 '22 20:05

Plus one for this idea.

tjayrush, Sep 23 '22 00:09

Can you suggest an example where this might be useful, @tjayrush? Ideally, write a script program using this construct that solves a user problem.

bitfield, Sep 23 '22 09:09

It's obviously useful. Every application I can think of where something can be done concurrently is useful even if only because it's way faster. Use your imagination.

tjayrush, Sep 24 '22 01:09

I understand perhaps you're feeling frustrated, @tjayrush, and for all I know you're having a bad day for reasons unrelated to this issue or this project. But the tone of your comment is ill-judged, I think. I invite you to reflect on it and consider whether it's the sort of comment you'd like to receive from a contributor to one of your own projects, or even from a co-worker.

bitfield, Sep 24 '22 10:09