Restarting Emcee
I am running prospector on a computer cluster with a 4-hour time limit, and my fits are not converging in that time. I know prospector has a feature for restarting emcee, but I am not sure how to use it. Based on restart.py in scripts, I have come up with:
hfile = Galaxy_Path + 'G{0}_{1}_{2}.h5'.format(galaxy_num, gal_desig, Run_Num -1)
result = reader.results_from(hfile, dangerous = False)[0]
lnprobfn_fixed = partial(prospect.fitting.lnprobfn, sps=sps)
output = restart_emcee_sampler(initial=result['chain'][:,-1,:],
                               lnprobfn=lnprobfn_fixed,
                               initial_positions = result['chain'][:,-1,:],
                               **run_params)
I believe this takes the last iteration of the chain stored in the results and uses it to begin emcee production. However, calling lnprobfn_fixed throws the following error:
emcee: Exception while calling your likelihood function:
params: [ 1.03860847e+01 9.47240138e-01 6.70151253e-01 4.02459717e-01
4.04899017e-01 6.39318413e-01 1.59374878e+00 -1.80725066e+00
-4.04193471e-01 1.37733417e+00 4.70929708e-01 2.63815226e-04
1.21522140e+01 1.00000000e+10 1.11770402e-03 9.48953731e+01
8.41210224e-01]
args: []
kwargs: {}
exception:
Traceback (most recent call last):
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
return self.f(x, *self.args, **self.kwargs)
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/fitting.py", line 88, in lnprobfn
lnp_prior = model.prior_product(theta, nested=nested)
AttributeError: 'NoneType' object has no attribute 'prior_product'
Traceback (most recent call last):
File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 557, in <module>
PSB_AGN_CAPS_Funct(galaxy_num = int(Galaxy_list), Run_Num=int(Run_Num), Template_Type = Template_Type, Num_Iters=Num_Iters) #, input_hfile=input_hfile
File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 520, in PSB_AGN_CAPS_Funct
output = restart_emcee_sampler(initial=result['chain'][:,-1,:],
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 142, in restart_emcee_sampler
esampler = emcee_production(esampler, initial, niter, pool=pool,
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 185, in emcee_production
for i, result in enumerate(esampler.sample(initial, **mc_args)):
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 344, in sample
state.log_prob, state.blobs = self.compute_log_prob(state.coords)
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 489, in compute_log_prob
results = list(map_func(self.log_prob_fn, p))
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
return self.f(x, *self.args, **self.kwargs)
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/fitting.py", line 88, in lnprobfn
lnp_prior = model.prior_product(theta, nested=nested)
AttributeError: 'NoneType' object has no attribute 'prior_product'
I have tried different lnprobfn functions and get the same error. I have also tried passing the lnprobability array stored in my results dictionary, with no luck. Using result['lnprobability'] gives me:
emcee: Exception while calling your likelihood function:
params: [ 1.03860847e+01 9.47240138e-01 6.70151253e-01 4.02459717e-01
4.04899017e-01 6.39318413e-01 1.59374878e+00 -1.80725066e+00
-4.04193471e-01 1.37733417e+00 4.70929708e-01 2.63815226e-04
1.21522140e+01 1.00000000e+10 1.11770402e-03 9.48953731e+01
8.41210224e-01]
args: []
kwargs: {}
exception:
Traceback (most recent call last):
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
return self.f(x, *self.args, **self.kwargs)
TypeError: 'numpy.ndarray' object is not callable
Traceback (most recent call last):
File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 557, in <module>
PSB_AGN_CAPS_Funct(galaxy_num = int(Galaxy_list), Run_Num=int(Run_Num), Template_Type = Template_Type, Num_Iters=Num_Iters) #, input_hfile=input_hfile
File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 520, in PSB_AGN_CAPS_Funct
output = restart_emcee_sampler(initial=result['chain'][:,-1,:],
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 142, in restart_emcee_sampler
esampler = emcee_production(esampler, initial, niter, pool=pool,
File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 185, in emcee_production
for i, result in enumerate(esampler.sample(initial, **mc_args)):
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 344, in sample
state.log_prob, state.blobs = self.compute_log_prob(state.coords)
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 489, in compute_log_prob
results = list(map_func(self.log_prob_fn, p))
File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
return self.f(x, *self.args, **self.kwargs)
TypeError: 'numpy.ndarray' object is not callable
Is this the correct way to restart emcee? Thank you in advance!
The lnprobfn needs to know the model and the data as well, so something like
result, obs, model = reader.results_from(hfile, dangerous = True)
lnprobfn_fixed = partial(prospect.fitting.lnprobfn, sps=sps, model=model, obs=obs)
should get you closer. That said, this is not something we've supported or tried in some time. If problems arise, it might be worth asking your admin for an increased time limit.
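To illustrate why the first attempt failed: judging by the traceback, lnprobfn receives model as a keyword argument that is None unless it is supplied, so a partial that binds only sps leaves nothing to call prior_product on. The following is a minimal sketch of that behaviour. toy_lnprobfn and ToyModel are hypothetical stand-ins written for this example, not prospector code.

```python
from functools import partial


class ToyModel:
    """Hypothetical stand-in for a prospector model; only prior_product is mimicked."""

    def prior_product(self, theta):
        # Flat prior on [0, 1] for every parameter: log-prior is 0 inside, -inf outside.
        return 0.0 if all(0.0 <= t <= 1.0 for t in theta) else float("-inf")


def toy_lnprobfn(theta, model=None, obs=None, sps=None):
    """Hypothetical stand-in that takes model, obs and sps as keywords."""
    # With model=None this line raises the AttributeError seen in the traceback.
    return model.prior_product(theta)


theta = [0.2, 0.5]

# Binding only sps leaves model=None, so the call fails.
incomplete = partial(toy_lnprobfn, sps="sps")
try:
    incomplete(theta)
    failed = False
except AttributeError:
    failed = True

# Binding model and obs as well gives the sampler a function of theta alone.
complete = partial(toy_lnprobfn, sps="sps", model=ToyModel(), obs={})
lnp = complete(theta)
```

The same reasoning explains the second error: result['lnprobability'] is an array of stored values rather than a function, so emcee cannot call it.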
Thank you for the input. I have fixed the lnprobfn issue, but I am still getting an error when running emcee:
17:38:33: Start emcee for G11_EAH03_Test_step
niter: 1200
nwalkers: 620
number of walkers=620
starting production
/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in double_scalars
lnpdiff = f + nlp - state.log_prob[j]
Based on what another user has said in the emcee repository, it seems like one of my parameters is being initialized outside of its prior range. I have asked about this in the emcee issue tracker, but I have not gotten a response yet. Do you have any idea why this would be happening?
As for restarting emcee, I am going to look into obtaining more time on the cluster. Thank you for your input!
Hi @EmmaLitzer, sorry, I'm not sure what the issue is. If you are worried about parameters starting outside the prior range, you can try calling model.prior_product(parameter_value) for the starting location of each walker.
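A minimal sketch of that check, assuming prior_product returns a single log-prior value for one parameter vector and that a position outside the prior range gives a non-finite value. walkers_outside_prior and ToyModel are hypothetical names made up for this example; in a real script the model would come from the results file.

```python
import math


def walkers_outside_prior(model, positions):
    """Return the indices of walkers whose log-prior is not finite."""
    bad = []
    for i, theta in enumerate(positions):
        lnp = float(model.prior_product(theta))
        if not math.isfinite(lnp):
            bad.append(i)
    return bad


class ToyModel:
    """Hypothetical stand-in with a flat prior on [0, 1] for every parameter."""

    def prior_product(self, theta):
        return 0.0 if all(0.0 <= t <= 1.0 for t in theta) else float("-inf")


# Three walkers, two parameters each; the second walker starts outside the prior.
positions = [[0.2, 0.5], [1.5, 0.5], [0.9, 0.1]]
bad = walkers_outside_prior(ToyModel(), positions)
```

For the restart discussed here, positions would be result['chain'][:,-1,:]. A non-empty list would mean some walkers start where the prior is not finite.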
Hi Ben,
Thank you for helping me with this issue. I did find something that allows me to restart the run from the previous run's best-fit parameters. I am testing whether this is a viable option for "restarting" the sampling. If it is, I think this will work well enough for my needs.
What I did: in run_emcee in fitting.py, I changed q from model.theta to the best-fit theta array from the previous run.
So far it has given me better chi2 fits than a single sampler run, but the logmass values are converging far from the SDSS values and from the logmass values previously calculated for my sample (example: known logmass ~10.4, prospector logmass ~8.75). I am trying to figure out where that issue is coming from.
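For reference, one way to pull a best-fit theta out of a previous run is to take the sample with the highest stored log-probability. This is only a sketch: best_fit_theta is a hypothetical helper, and it assumes chain is indexed as (walker, step, parameter), matching the result['chain'][:,-1,:] indexing used earlier, with lnprobability indexed as (walker, step).

```python
def best_fit_theta(chain, lnprobability):
    """Return the parameter vector of the highest-probability sample.

    chain[i][j] is the parameter vector of walker i at step j and
    lnprobability[i][j] is its log-probability (assumed layout).
    """
    best = None
    for i, walker in enumerate(lnprobability):
        for j, lnp in enumerate(walker):
            if best is None or lnp > best[0]:
                best = (lnp, i, j)
    _, i, j = best
    return list(chain[i][j])


# Toy data: 2 walkers, 3 steps, 2 parameters.
chain = [
    [[10.0, 0.1], [10.2, 0.2], [10.1, 0.2]],
    [[9.8, 0.4], [10.4, 0.3], [10.3, 0.3]],
]
lnprobability = [
    [-5.0, -3.0, -4.0],
    [-6.0, -1.0, -2.0],
]
theta_best = best_fit_theta(chain, lnprobability)
```

The plain loops also work on NumPy arrays; np.argmax combined with np.unravel_index would do the same lookup more compactly.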
I wonder if there's a units issue in the restarts?