LoadBalancedView bloats memory - bug or wrong settings?
This issue may be related to https://github.com/ipython/ipyparallel/issues/207, which is also not marked as solved yet. I also posted this problem on Stack Overflow (https://stackoverflow.com/questions/45781545/ipyparallels-loadbalancedview-bloats-memory-how-can-i-avoid-that).
I want to execute multiple tasks in parallel with Python and ipyparallel in a Jupyter notebook, using 4 local engines started with ipcluster start in a local console.
Although a DirectView would also work, I use a LoadBalancedView to map a set of tasks. Each task takes around 0.2 seconds (though this can vary) and runs a MySQL query that loads some data and then processes it.
With ~45000 tasks this works fine, but memory usage grows very high. That is a problem because I want to run another experiment with over 660000 tasks, which I can no longer run: it exceeds my 16 GB memory limit and the system starts swapping to my local drive. With a DirectView, by contrast, memory usage stays relatively small and never fills up. But I actually need the LoadBalancedView.
This happens even with a minimal working example that does no database query at all (see below).
I am not perfectly familiar with the ipyparallel library, but I have read something about logs and caches kept by the ipcontroller that may cause this. I am still not sure whether this is a bug or whether I can change some settings to avoid the problem.
Running a MWE
For my Python 3.5.3 environment running on Windows 10 I use the following (recent) packages:
- ipython 6.1.0
- ipython_genutils 6.1.0
- ipyparallel 6.0.2
- jupyter 1.0.0
- jupyter_client 4.4.0
- jupyter_console 5.0.0
- jupyter_core 4.2.0
I would like the following example to work for LoadBalancedView without the immense memory growth (if possible at all):
- Run ipcluster start in a console.
- Run a Jupyter notebook with the following three cells:

1st cell:
import ipyparallel as ipp
rc = ipp.Client()
lview = rc.load_balanced_view()

2nd cell:
%%px --local
import time

3rd cell:
def sleep_here(i):
    time.sleep(0.2)
    return 42

amr = lview.map_async(sleep_here, range(660000))
amr.wait_interactive()
Can anyone confirm this? IMO it is not a huge example to reproduce (unless I am missing something).
Yes, I can confirm this. I've been investigating the cause, but haven't nailed it down, yet.
The load-balanced view also seems to struggle, performance-wise, with large input sequences:
import ipyparallel
client = ipyparallel.Client()
lview = client.load_balanced_view()
dview = client[:]
def execute(view, n):
    # submit the map and return as soon as the first result arrives;
    # submission of all n tasks happens inside view.map() itself
    for item in view.map(lambda x: x, range(n), block=False):
        return item
%timeit execute(dview, 10) # 17 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit execute(dview, 100) # 14.3 ms ± 322 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit execute(dview, 1000) # 14.8 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit execute(lview, 10) # 45.1 ms ± 7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit execute(lview, 100) # 358 ms ± 19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit execute(lview, 1000) # 3.34 s ± 375 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
I came across this as well with very long async queues, running on macOS 10.12.5 with Python 2.7.14.
I've been investigating this again and still haven't identified the exact cause of the memory growth, but one mechanism to mitigate it is the chunksize argument. I believe the memory growth is related to the number of tasks (messages sent through the scheduler), which doesn't have to match the number of items in the mapped sequence. Setting chunksize=10 means that each message includes 10 elements of the map. This changes your task duration from 200ms to 2s and reduces the task count from 660k to 66k. The larger the chunksize, the lower the overhead; at the same time, coarser tasks mean that making it too big results in less smooth load balancing.
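For the MWE above, that would look roughly like this (a minimal sketch: chunksize is the map/map_async keyword discussed here, and the value 10 is just the example from this comment; time is imported inside the function so the sketch stands alone without the %%px cell):

import ipyparallel as ipp

rc = ipp.Client()
lview = rc.load_balanced_view()

def sleep_here(i):
    import time  # imported locally so each engine has it without %%px --local
    time.sleep(0.2)
    return 42

# chunksize=10 packs 10 map elements into each task message,
# so ~660000 elements become ~66000 scheduler messages.
amr = lview.map_async(sleep_here, range(660000), chunksize=10)
amr.wait_interactive()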
The new LoadBalancedView.imap (added in 7.0a2), which limits the number of outstanding tasks, should also greatly improve memory usage, since not all messages will be in flight at once.
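A rough sketch of how that could look (assuming ipyparallel >= 7.0a2; the max_outstanding value shown here is an assumed tuning knob of imap and may be left at its default):

import ipyparallel as ipp

rc = ipp.Client()
lview = rc.load_balanced_view()

def sleep_here(i):
    import time  # imported locally so the sketch is self-contained
    time.sleep(0.2)
    return 42

# imap submits tasks lazily and caps how many are outstanding at once,
# so the full 660000 messages never sit in the scheduler simultaneously.
# max_outstanding=100 is an assumption; the default may already be fine.
for result in lview.imap(sleep_here, range(660000), max_outstanding=100):
    pass  # consume results as they arrive instead of collecting them all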