ipyparallel
Why does load_balanced_view use so much memory?

The IPython client is spending all its time serializing 100k messages here. The main thing is that a load balanced view creates one message per item by default, which is 100k tasks in this case. That's a lot! The second is that IPython has special handling of numpy arrays, preserving the arguments as numpy arrays through serialization, and reconstructing the result as a numpy array. That means that IPython is creating 100,000 single-element numpy arrays to send and then serializing them. This saves a lot when sending large arrays, but costs a lot when sending a very large number of tiny arrays. It's not sending 1 100k times, it's sending np.array([1]) 100k times. That's where ~all the time and memory is being spent.
You'll see much better behavior if you don't use a numpy array for this very simple case, or if you use e.g. chunksize=1000 to reduce the number of messages.
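As a rough sketch of the two suggestions above: the helper below converts the input to plain Python objects and passes chunksize through to map_sync, and a small function shows how the message count shrinks. The names message_count and chunked_map are illustrative, not part of ipyparallel, and view is assumed to be a LoadBalancedView. Either change alone should already help; the helper simply applies both.

```python
import math


def message_count(n_items, chunksize=1):
    """Number of task messages a load-balanced map needs for n_items."""
    return math.ceil(n_items / chunksize)


def chunked_map(view, func, data, chunksize=1000):
    """Map func over data using fewer, larger messages.

    `view` is assumed to be an ipyparallel LoadBalancedView.
    """
    # Turn a numpy array into plain Python objects so that no tiny
    # arrays are serialized; any other iterable becomes a list.
    items = data.tolist() if hasattr(data, "tolist") else list(data)
    return view.map_sync(func, items, chunksize=chunksize)


print(message_count(100_000))        # 100000: one message per item
print(message_count(100_000, 1000))  # 100
```

With a real cluster this would be called as chunked_map(view, my_func, my_array), where my_func and my_array stand for your own task and data.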
You might also consider the new LoadBalancedView.imap which sends messages more efficiently when you have a very large input stream by only submitting a limited number of messages and waiting for results before preparing and serializing more.
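To illustrate the idea of submitting only a limited number of tasks and waiting for results before preparing more, here is a minimal sketch that uses a standard-library thread pool in place of a cluster. It is not ipyparallel's implementation; bounded_imap and its parameters are made up for this example.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def bounded_imap(func, iterable, max_outstanding=4, workers=2):
    """Yield func(x) in input order with a bounded number of pending tasks."""
    items = iter(iterable)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Prime the window with only the first few tasks.
        pending = deque(
            pool.submit(func, x) for x in islice(items, max_outstanding)
        )
        while pending:
            # Wait for the oldest task to finish.
            result = pending.popleft().result()
            # Submit one replacement so the window stays full.
            for x in islice(items, 1):
                pending.append(pool.submit(func, x))
            yield result


print(list(bounded_imap(lambda x: x * x, range(6))))  # [0, 1, 4, 9, 16, 25]
```

Because the input is consumed lazily, a very large source never has all of its tasks prepared at once, which is the property that keeps memory use flat.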
How to use wait_interactive for an imap result?
imap is a generator, unlike other AsyncResult objects, so AsyncResult methods are not available. You can use tqdm directly, if you like:
import tqdm

source = range(1024)
gen = view.imap(lambda x: x, source)  # view: a LoadBalancedView
for result in tqdm.tqdm(gen, total=len(source)):
    ...
I don't know why it takes a long time with no progress.

You can see it only takes 30+ ms locally per combination.

After you cancel, what do you get for c.queue_status()?
It has been more than 1 minute so far and it is still stuck.

I took a look at the Windows Task Manager: python shows no CPU usage, but it is still stuck.
Do simple executions work? rc[:].apply_sync(os.getpid) and view.apply_sync(os.getpid)?
It is still stuck. I think I should restart the kernel.

After cancelling, it is still stuck, perhaps because the cluster has been released by with ... as ...?

Can you call getpid before the code that's causing the problem?
Seems to work well.

Does map work with a simpler operation (start with echo, maybe return the same data type as your real task)? I'm trying to isolate the issue. A hang is certainly weird unless the processes are actually stuck on something. It's very strange that queue_status() would hang, since that doesn't involve the engines at all. Make sure you are calling that within the context manager if you are using it, or otherwise while the client's connection is still open and the controller still running.
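The checks suggested in this thread can be bundled into one helper that runs them from cheapest to most involved, so the first call that hangs points at the failing layer. This is only a sketch: smoke_test and echo are illustrative names, and rc is assumed to be a connected ipyparallel Client whose connection is still open.

```python
import os


def echo(x):
    """Trivial task: return the input unchanged."""
    return x


def smoke_test(rc, sample=range(8)):
    """Run cheap checks in order and return what each one produced.

    `rc` is assumed to be a connected ipyparallel Client.
    """
    report = {}
    # Controller only: this does not involve the engines.
    report["queue_status"] = rc.queue_status()
    # Direct view: one getpid call on every engine.
    report["direct_pids"] = rc[:].apply_sync(os.getpid)
    # Load-balanced view: a single getpid call, then a trivial map.
    view = rc.load_balanced_view()
    report["balanced_pid"] = view.apply_sync(os.getpid)
    report["echo"] = list(view.map_sync(echo, list(sample)))
    return report
```

If you start the cluster with a context manager, call smoke_test(rc) inside the with block, before the code that hangs, so that the controller is still running.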
It seems very slow. My machine has 24 cores.
