
DistributedHaloCatalog

Open aphearin opened this issue 8 years ago • 4 comments

Implement a halo catalog that can be distributed across nodes of a large cluster using MPI

aphearin avatar Oct 28 '17 17:10 aphearin

This issue will solve https://github.com/bccp/nbodykit/issues/502

I think a plausible, easy way of doing this is to ensure each MPI rank contains a spatially localized domain, then reuse the single-node code on each rank (sketched at the end of this comment).

Here is an object that helps you to distribute objects to domains:

https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L274

And we were using it here:

https://github.com/bccp/nbodykit/blob/master/nbodykit/base/decomposed.py#L3

and here

https://github.com/bccp/nbodykit/blob/master/nbodykit/algorithms/pair_counters/domain.py#L113

You can probably write a better version of this on your own, or jump-start your development with domain.py and _domain.pyx.

Those models that need particles may need to use the 'smoothing' argument of https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L515
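
For concreteness, here is a minimal sketch of the pattern I have in mind, assuming pmesh's `GridND` and its `decompose(pos, smoothing=...)` API together with mpi4py; the box size, buffer width, and halo positions are hypothetical placeholders:

```python
import numpy as np
from mpi4py import MPI
from pmesh.domain import GridND

comm = MPI.COMM_WORLD
BoxSize = 250.0   # Mpc/h, hypothetical
rmax = 20.0       # largest pair-counting scale, used as the buffer width

# Build a uniform domain grid; for simplicity this assumes comm.size
# is a perfect cube so each dimension gets the same number of domains.
n = int(round(comm.size ** (1.0 / 3)))
edges = [np.linspace(0, BoxSize, n + 1)] * 3
domain = GridND(edges, comm=comm, periodic=True)

# (N, 3) halo positions, however they were initially scattered across ranks
halo_pos = np.random.uniform(0, BoxSize, size=(1000, 3))

# smoothing=rmax pads each domain: halos within rmax of a boundary are
# also shipped to the neighboring rank, giving the buffer region for free
layout = domain.decompose(halo_pos, smoothing=rmax)
local_pos = layout.exchange(halo_pos)

# From here, each rank can reuse the existing single-node halotools code
# on local_pos, which is spatially localized to its domain (plus buffer).
```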

rainwoodman avatar Jul 03 '18 16:07 rainwoodman

@rainwoodman - thanks a lot for the pointers. A spatial domain decomposition is indeed what I thought best for this problem. The only difference is that I have been using a buffer region of size rmax around each domain, where rmax is the largest pair-counting distance. It looks like you handled this without that feature, but perhaps I read it too quickly?
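
For reference, a minimal numpy sketch of the buffer-region selection I have been using; the helper name and the domain bounds are hypothetical:

```python
import numpy as np

def select_with_buffer(pos, mins, maxs, rmax, Lbox):
    """Hypothetical helper: mask of points inside a rectangular domain
    padded by rmax on every side, with periodic boundary conditions."""
    mask = np.ones(len(pos), dtype=bool)
    for dim in range(3):
        lo = mins[dim] - rmax
        width = (maxs[dim] - mins[dim]) + 2 * rmax
        # Wrap coordinates so the padded domain starts at zero, then
        # keep points that fall within its width. Assumes width < Lbox,
        # i.e. the padded domain does not wrap all the way around the box.
        mask &= (pos[:, dim] - lo) % Lbox <= width
    return mask
```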

aphearin avatar Jul 03 '18 16:07 aphearin

I actually don't really know what Nick did for this. It smelled like he did the same, but he could have been assuming the full catalog fits into a single rank here as well. It is just a few lines of changes, though.

My bigger concern was the mock-population part. What's your plan for this? Are there models that don't like this?

rainwoodman avatar Jul 03 '18 18:07 rainwoodman

The mock-population part is truly trivially parallelizable: no models in the entire library would be impacted by the decomposition. The reason this feature actually requires a very significant rewrite is that, in order to fully take advantage of the parallelism, the summary-statistic kernels need to be computed on the subvolumes, with only the results reported to rank 0, which collects things like subvolume pair counts, sums them, and converts them into the tpcf.
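
A minimal sketch of that reduction pattern with mpi4py; the brute-force pair count below stands in for the real single-node kernel, PBCs are ignored for brevity, and all variable names are hypothetical:

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
BoxSize = 250.0                       # hypothetical box size in Mpc/h
rbins = np.logspace(-1.0, 1.3, 20)    # separation bins

# local_pos holds this rank's interior points; padded_pos additionally
# includes the rmax buffer from neighboring domains (here a stand-in).
local_pos = np.random.uniform(0, BoxSize, size=(500, 3))
padded_pos = local_pos

# Brute-force stand-in for the single-node pair-counting kernel;
# self-pairs land below the first bin edge and are discarded.
diff = local_pos[:, None, :] - padded_pos[None, :, :]
r = np.sqrt((diff ** 2).sum(axis=-1))
local_dd, _ = np.histogram(r, bins=rbins)

# Only the per-bin counts cross the network, not the positions.
total_dd = np.zeros_like(local_dd)
comm.Reduce(local_dd, total_dd, op=MPI.SUM, root=0)
total_n = comm.allreduce(len(local_pos), op=MPI.SUM)

if comm.rank == 0:
    # Natural estimator DD/RR - 1, with analytic RR (expected ordered
    # pair counts) for a periodic box of volume BoxSize**3.
    nbar = total_n / BoxSize ** 3
    rr = total_n * nbar * (4.0 / 3.0) * np.pi * np.diff(rbins ** 3)
    tpcf = total_dd / rr - 1.0
```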

aphearin avatar Jul 03 '18 19:07 aphearin