
Parallelization of DiscreteDP solvers

Open jstac opened this issue 9 years ago • 9 comments

Is it possible to modify the solvers within DiscreteDP so that they can exploit multiple cores? This is a question from a user (Jack Shin) and I'm sure it would be very valuable to users if that could be done transparently (and if they had access to a machine with sufficiently many cores).

Same question goes for the Julia code, come to think of it.

@spencerlyon2 @cc7768 @oyamad Any thoughts?

jstac avatar Jul 25 '16 14:07 jstac

Hi @jstac this would be very cool.

Off the top of my head, the two most obvious places for parallelism to me are:

  1. Doing the matrix operation in the bellman operator here
  2. Doing s_wise_max here or here

I think the most straightforward way to get this in python would be to figure out how to write those operations in a way where numba could automatically parallelize them for us.

sglyon avatar Jul 25 '16 20:07 sglyon

I guess we should first fix the target use cases. (I have no experience trying to solve a large-scale problem.)

  1. If the use case is that the user wants to call DiscreteDP.solve many times in a for loop, they might want to parallelize the loop itself (I don't know how). In this case, I am worried about @spencerlyon2 's concern from https://github.com/QuantEcon/QuantEcon.py/pull/253#issuecomment-211384859 (although I am not sure whether it applies here).
  2. DiscreteDP is constrained by the machine's memory, since by design the R and Q arrays must be prepared in advance. So on a machine with, say, 16GB, a single execution of DiscreteDP.solve shouldn't take much time. Going back to the discussion following https://github.com/QuantEcon/QuantEcon.py/issues/185#issuecomment-197323058, would Dask help with parallelization here?
  3. We should do profiling first anyway. I have some benchmarks here.

oyamad avatar Jul 26 '16 01:07 oyamad

I agree with the need to profile.

Question: Can we automate parallelization of the max step in the application of T, or in the computation of greedy policies, and get significant gains through the jit compiler? See In[23] of this notebook:

http://nbviewer.jupyter.org/gist/natashawatkins/2ba8acca8dde831f4cafc09b9990b91c

(Thanks to @natashawatkins)

The gains there are large. But this would need to be tested on a variety of input data.

jstac avatar Aug 27 '17 21:08 jstac

We may try Numba's new parallelization technology.

oyamad avatar Aug 28 '17 00:08 oyamad

Yes, good point.

In my understanding this is still experimental, whereas `target='parallel'` on @vectorize is already standard.

jstac avatar Aug 28 '17 00:08 jstac

For this function we can compare @guvectorize and prange. For the other functions I don't know how to use @guvectorize.

oyamad avatar Aug 28 '17 03:08 oyamad

Just to check: has the parallelization already been finished, or is it still open to contribution?

zhoujianberkeley avatar Oct 16 '20 07:10 zhoujianberkeley

As far as I know this is still an open issue and we'd love to see it pushed forward. Thanks for your interest @zhoujianberkeley .

@oyamad / @mmcky, do you know if any work has been done on this?

jstac avatar Oct 16 '20 22:10 jstac

No part has been parallelized. Thanks for your interest @zhoujianberkeley!

oyamad avatar Oct 17 '20 01:10 oyamad