imageproc icon indicating copy to clipboard operation
imageproc copied to clipboard

Interest for a parallel version of match_template?

Open stephanemagnenat opened this issue 2 years ago • 6 comments

Hello,

For a personal project, I wrote a parallel version of match_template, keeping essentially the same algorithm but using a Rayon parallel iterator for the y coordinate and collecting the result.

If there is interest here, I'm happy to do a PR with my changes. It would be good to know whether this should be an addition or a replacement, and if some tests are needed.

kind regards

stephanemagnenat avatar Jun 01 '23 14:06 stephanemagnenat

How fast is it in comparison? (If you benchmarked it)

zuixalias avatar Jun 04 '23 17:06 zuixalias

I'll do a benchmark and report the results tomorrow, but I would say, close to the number of cores.

stephanemagnenat avatar Jun 04 '23 19:06 stephanemagnenat

On an AMD Ryzen 9 7950X 16-Core Processor (running Ubuntu 22.04.2 LTS), looking for a 259 x 415 template in a 750 x 750 image took 10.4s with the current algorithm, and 974ms with my new version, so a speed-up of more than a factor of 10. This is lower than the core count, but not unexpected due to the thermal limitations of the CPU preventing all cores running at the same speed as a single core.

Funnily, running the parallel version but restricted to one cpu with cpulimit takes 9.98s, so less that the non-parallel version. This is reproducible. I do not know whether this is due to the usage of iterators in the parallel version or somehow an artifact of cpulimit.

In any case, I would say that the speed-up is worth considering integration, do you agree? If so, do you want the new function as a replacement or as an additional function? I can then prepare a PR.

stephanemagnenat avatar Jun 05 '23 12:06 stephanemagnenat

Okay, sounds good.

Funnily, running the parallel version but restricted to one cpu with cpulimit takes 9.98s, so less that the non-parallel version

I see

In any case, I would say that the speed-up is worth considering integration

For sure

do you want the new function as a replacement or as an additional function?

Additional is what i would prefer..

Thanks a lot for writing this parallel version btw...i was looking for it and found this issue in just the right time. I hope your PR gets merged asap, otherwise i would like to ask you to share the code with me..let's see, thanks again!

zuixalias avatar Jun 05 '23 17:06 zuixalias

The PR is ready: #527.

I went for simplicity, if we are all happy about this new feature, it might be possible to de-duplicate some code between the original and the parallel version, or even, only use the parallel version by just changing the loop type depending on the availability of rayon.

stephanemagnenat avatar Jun 06 '23 07:06 stephanemagnenat

Any maintainer to approve the PR?

stephanemagnenat avatar Jun 14 '23 07:06 stephanemagnenat

Implemented in https://github.com/image-rs/imageproc/pull/527

theotherphil avatar May 05 '24 19:05 theotherphil