Multiprocessing with `Pool` won't work on Mac with M1 chip

Open Jayshil opened this issue 1 year ago • 3 comments

Hi @nespinoza,

If I pass the nthreads option to juliet.fit to enable multiprocessing with dynesty, I get the following error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

At first, I thought the multiprocessing implemented in dynesty was causing this issue. However, after digging into the details, I found that it has to do with how multiprocessing is implemented in juliet: the culprit is line 1723. A proper way to do this (according to this StackOverflow answer) is the following:

import contextlib
from multiprocessing import get_context
...
# Build the pool from an explicit "fork" context instead of calling multiprocessing.Pool directly
with contextlib.closing(get_context("fork").Pool(processes=self.nthreads - 1)) as executor:
    sampler = DynestySampler(self.loglike,
                             self.prior_transform_r,
                             self.data.nparams,
                             pool=executor,
                             queue_size=self.nthreads,
                             **d_args)
    sampler.run_nested(**ds_args)
    results = sampler.results

(Instead of calling Pool directly from multiprocessing, one should call it via multiprocessing.get_context("fork").Pool.) Making this change solves the above issue.
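The same pattern can be sketched outside juliet. This is a minimal, standalone illustration, not juliet's code: the helper name parallel_sqrt and its parameters are hypothetical, and math.sqrt stands in for the likelihood function. It assumes a platform where the "fork" start method exists (get_context("fork") raises ValueError on Windows).

```python
import contextlib
import math
from multiprocessing import get_context


def parallel_sqrt(values, nworkers=2, method="fork"):
    """Map math.sqrt over values with a pool built from an explicit context.

    Hypothetical helper for illustration only; it is not part of juliet.
    """
    # get_context returns a context tied to one start method and leaves
    # the process-wide default untouched.
    ctx = get_context(method)
    # contextlib.closing calls pool.close() when the block exits.
    with contextlib.closing(ctx.Pool(processes=nworkers)) as pool:
        results = pool.map(math.sqrt, values)
    # After close(), join() waits for the worker processes to exit.
    pool.join()
    return results


if __name__ == "__main__":
    print(parallel_sqrt([1.0, 4.0, 9.0]))
```

Because the context is requested per pool, only this pool is affected; other code in the same process keeps whatever start method it already uses.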

However, I am not opening a pull request, since this issue appears to be specific to Mac users with an M1 chip, and I have no idea how this change would affect other systems that don't have an M1. Also, since the issue lies in how Pool is used, it should also affect the emcee and zeus samplers, which use Pool for multiprocessing as well (though I haven't tested samplers other than dynesty).
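On the question of other systems: since Python 3.8 the default start method on macOS is "spawn", and "fork" is not available on Windows at all, so hard-coding "fork" would raise there. One possible way to hedge, sketched below with a hypothetical helper name (pick_context) that does not exist in juliet, is to use the preferred method only where the platform offers it and otherwise fall back to the platform default.

```python
import multiprocessing


def pick_context(preferred="fork"):
    """Return a context for `preferred` if available, else the default one.

    Hypothetical helper for illustration only; it is not part of juliet.
    """
    if preferred in multiprocessing.get_all_start_methods():
        return multiprocessing.get_context(preferred)
    # With no argument, get_context returns the platform's default context.
    return multiprocessing.get_context()


if __name__ == "__main__":
    ctx = pick_context("fork")
    print("Using start method:", ctx.get_start_method())
```

Whether falling back silently is the right behaviour for juliet is a design choice this sketch does not settle; it only shows that the check itself is cheap.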

Cheers, Jayshil

Jayshil, Oct 10 '22 12:10

Hi @Jayshil! Thanks for this. Can you confirm that the way you propose to do multiprocessing is the way to make this work? Also, changing this would require testing on non-M1-chip computers as well, to ensure that it does not fix things for M1-chip users while breaking everyone else's code.

I'll leave it as an enhancement for now, but if this is investigated further, I would be happy to change the way in which pool is handled.

N.

nespinoza, Feb 15 '23 14:02

Hi @nespinoza,

I am not really sure if this is the only way to make this work or not. There may be other ways to resolve this issue that I am unaware of. But I can confirm that this is at least one way to make it work for M1 chip computers (working smoothly on my machine). I also don't know if this would work for non-M1-chip computers or not. Unfortunately, I do not have enough time to thoroughly investigate this.

I think your suggestion to leave this as an enhancement for now is appropriate. This will let users know that there is an issue with Pool on M1-chip Macs and that there is a possible way to resolve it.

Cheers, Jayshil

Jayshil, Feb 15 '23 14:02

Hi @nespinoza,

A colleague of mine pointed out that, instead of editing the juliet source code, one can simply add the following two lines to one's own script to make multiprocessing work on M1-chip Macs:

import multiprocessing
multiprocessing.set_start_method('fork')

This works at least with dynesty. So, I suggest we leave the juliet source code as it is, but put this "hack" somewhere in the documentation. We can close this issue after that.
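For the documentation, the two lines could be shown in the context of a full script. The sketch below is only an illustration under assumptions: the helper name configure_start_method is hypothetical, math.sqrt stands in for real work, and "fork" must exist on the platform (set_start_method("fork") raises ValueError on Windows). It also handles the case where the start method was already fixed, in which set_start_method raises RuntimeError.

```python
import math
import multiprocessing


def configure_start_method(method="fork"):
    """Select the start method once, before any pool is created.

    Hypothetical helper for illustration only; it is not part of juliet.
    """
    try:
        multiprocessing.set_start_method(method)
    except RuntimeError:
        # The start method was already fixed earlier in this process; keep it.
        pass
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    configure_start_method("fork")
    # From here on, a plain multiprocessing.Pool uses the selected method.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(math.sqrt, [1.0, 4.0, 9.0]))
```

Unlike the get_context approach, this changes the default for the whole process, so it belongs at the top of the user's script rather than inside a library.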

Cheers, Jayshil

Jayshil, Mar 14 '23 21:03