ProgressMeter.jl

Overhead scales non-monotonically with `pmap` `batch_size`

Open • Socob opened this issue 10 months ago • 2 comments

I’ve noticed that for large values of pmap’s batch_size parameter, the progress meter overhead becomes large. This matters most in exactly the cases where you’d want to increase batch_size, namely when individual items are cheap to compute, because the overhead then dominates the actual computation.

Example:

using Distributed
addprocs(6)
@everywhere using ProgressMeter
for batch_size in round.(Int, 10 .^ range(0, 4, length=20))
    @show batch_size
    @showprogress pmap(i -> nothing, 1:200_000; batch_size);
end

[Image: plot of execution time vs. batch_size]

So it looks like (in this case, at least) there’s a “sweet spot” around batch_size=20. I found this surprising, and I’m not sure what the reason is. It would be good if this behaved better for large batch_size.
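To separate the progress meter's cost from pmap's own batching behavior, one could time the same call with and without @showprogress. The following is only a minimal sketch: the helper name compare_overhead, the worker count, and the item count are assumptions made for illustration, not part of the original report.

```julia
using Distributed
# Assumption: two local workers are enough for a quick check (the report used addprocs(6)).
nprocs() == 1 && addprocs(2)
@everywhere using ProgressMeter

# Hypothetical helper: time the same pmap call with and without the progress meter.
function compare_overhead(n, batch_size)
    work = i -> nothing
    # Warm-up runs so that compilation time is not counted in the measurements.
    pmap(work, 1:10; batch_size)
    @showprogress pmap(work, 1:10; batch_size)
    t_plain = @elapsed pmap(work, 1:n; batch_size)
    t_meter = @elapsed @showprogress pmap(work, 1:n; batch_size)
    return (; batch_size, t_plain, t_meter, overhead = t_meter - t_plain)
end

for batch_size in (1, 20, 1_000)
    println(compare_overhead(20_000, batch_size))
end
```

If the overhead column grows with batch_size while t_plain keeps shrinking, that would be consistent with the behavior described above. Timings will vary by machine and Julia version.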

Socob • Apr 26 '24 14:04

All that happens in the loop is put!(channel, true), and I see the same behavior when doing only that (I only tested with 20_000 items because I don't have that much patience):

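using Distributed  # assumes workers were already added, as in the example above
channel = RemoteChannel(() -> Channel{Bool}(20_000))
# batch_size is the loop variable from the example above
@elapsed pmap(i -> put!(channel, true), 1:20_000; batch_size);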

So I don't see how we can improve this.
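For anyone who wants to reproduce this without ProgressMeter, a self-contained version of the measurement, swept over a few batch sizes, might look like the sketch below. The helper name time_puts and the worker count are assumptions for illustration. The channel buffer is sized to the item count so that put! never blocks on a full channel.

```julia
using Distributed
# Assumption: two local workers; the original report used addprocs(6).
nprocs() == 1 && addprocs(2)

# Hypothetical helper: time a pmap in which each item only does put! on a RemoteChannel.
function time_puts(n, batch_size)
    # Buffer large enough to hold every item, so put! never waits for a consumer.
    channel = RemoteChannel(() -> Channel{Bool}(n))
    return @elapsed pmap(i -> put!(channel, true), 1:n; batch_size)
end

time_puts(100, 1)  # warm-up so that compilation is not counted
for batch_size in round.(Int, 10 .^ range(0, 4, length=5))
    println("batch_size = ", batch_size, ": ", time_puts(20_000, batch_size), " s")
end
```

Since this uses only Distributed, any non-monotonic timing it shows cannot be attributed to ProgressMeter itself.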

MarcMush • Jul 13 '24 11:07

Hm, I see. So it seems to be an issue with pmap/asyncmap rather than with ProgressMeter?

Socob • Jul 15 '24 12:07