
Restarting Emcee

Open EmmaLitzer opened this issue 4 years ago • 5 comments

I am attempting to run prospector on a computer cluster with a time limit of 4 hours, and my run is not converging in that time. I know prospector has a feature for restarting emcee; however, I am not sure how to use it. I have referenced restart.py in scripts and have come up with:

hfile = Galaxy_Path + 'G{0}_{1}_{2}.h5'.format(galaxy_num, gal_desig, Run_Num -1)
result = reader.results_from(hfile, dangerous = False)[0]
lnprobfn_fixed = partial(prospect.fitting.lnprobfn, sps=sps)
output = restart_emcee_sampler(initial=result['chain'][:,-1,:], 
                lnprobfn=lnprobfn_fixed,
                initial_positions = result['chain'][:,-1,:],
                **run_params) 

I believe this takes the last iteration of the chain stored in the results and uses it to begin emcee production. I am running into an issue with lnprobfn_fixed, which throws the error:

emcee: Exception while calling your likelihood function:
  params: [ 1.03860847e+01  9.47240138e-01  6.70151253e-01  4.02459717e-01
  4.04899017e-01  6.39318413e-01  1.59374878e+00 -1.80725066e+00
 -4.04193471e-01  1.37733417e+00  4.70929708e-01  2.63815226e-04
  1.21522140e+01  1.00000000e+10  1.11770402e-03  9.48953731e+01
  8.41210224e-01]
  args: []
  kwargs: {}
  exception:
Traceback (most recent call last):
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
    return self.f(x, *self.args, **self.kwargs)
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/fitting.py", line 88, in lnprobfn
    lnp_prior = model.prior_product(theta, nested=nested)
AttributeError: 'NoneType' object has no attribute 'prior_product'
Traceback (most recent call last):
  File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 557, in <module>
    PSB_AGN_CAPS_Funct(galaxy_num = int(Galaxy_list), Run_Num=int(Run_Num), Template_Type = Template_Type, Num_Iters=Num_Iters) #, input_hfile=input_hfile
  File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 520, in PSB_AGN_CAPS_Funct
    output = restart_emcee_sampler(initial=result['chain'][:,-1,:], 
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 142, in restart_emcee_sampler
    esampler = emcee_production(esampler, initial, niter, pool=pool,
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 185, in emcee_production
    for i, result in enumerate(esampler.sample(initial, **mc_args)):
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 344, in sample
    state.log_prob, state.blobs = self.compute_log_prob(state.coords)
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 489, in compute_log_prob
    results = list(map_func(self.log_prob_fn, p))
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
    return self.f(x, *self.args, **self.kwargs)
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/fitting.py", line 88, in lnprobfn
    lnp_prior = model.prior_product(theta, nested=nested)
AttributeError: 'NoneType' object has no attribute 'prior_product'

I have tried different lnprobfn functions with the same error. I have also tried the lnprobability stored in my results dictionary with no luck. Using result['lnprobability'] gives me:

emcee: Exception while calling your likelihood function:
  params: [ 1.03860847e+01  9.47240138e-01  6.70151253e-01  4.02459717e-01
  4.04899017e-01  6.39318413e-01  1.59374878e+00 -1.80725066e+00
 -4.04193471e-01  1.37733417e+00  4.70929708e-01  2.63815226e-04
  1.21522140e+01  1.00000000e+10  1.11770402e-03  9.48953731e+01
  8.41210224e-01]
  args: []
  kwargs: {}
  exception:
Traceback (most recent call last):
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
    return self.f(x, *self.args, **self.kwargs)
TypeError: 'numpy.ndarray' object is not callable
Traceback (most recent call last):
  File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 557, in <module>
    PSB_AGN_CAPS_Funct(galaxy_num = int(Galaxy_list), Run_Num=int(Run_Num), Template_Type = Template_Type, Num_Iters=Num_Iters) #, input_hfile=input_hfile
  File "/mnt/c/Users/emma_d/ASTR_Research/Restart_Emcee_Step.py", line 520, in PSB_AGN_CAPS_Funct
    output = restart_emcee_sampler(initial=result['chain'][:,-1,:], 
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 142, in restart_emcee_sampler
    esampler = emcee_production(esampler, initial, niter, pool=pool,
  File "/mnt/c/Users/emma_d/ASTR_Research/lib/python3.8/site-packages/repo/prospector/prospect/fitting/ensemble.py", line 185, in emcee_production
    for i, result in enumerate(esampler.sample(initial, **mc_args)):
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 344, in sample
    state.log_prob, state.blobs = self.compute_log_prob(state.coords)
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 489, in compute_log_prob
    results = list(map_func(self.log_prob_fn, p))
  File "/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/ensemble.py", line 624, in __call__
    return self.f(x, *self.args, **self.kwargs)
TypeError: 'numpy.ndarray' object is not callable

Is this the correct way to restart emcee? Thank you in advance!

EmmaLitzer, Jan 08 '22 05:01

The lnprobfn needs to know the model and the data as well. So something like

result, obs, model = reader.results_from(hfile, dangerous = True)
lnprobfn_fixed = partial(prospect.fitting.lnprobfn, sps=sps, model=model, obs=obs)

should get you closer. That said, this is not something we've supported or tried in some time. If problems arise, it might be worth asking your admin for an increased time limit.
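To illustrate why the extra keyword arguments matter, here is a minimal sketch using stand-ins rather than prospector itself: `ToyModel` and `toy_lnprobfn` are hypothetical and only mimic the pattern of a log-probability function whose `model` and `obs` default to `None`. emcee calls the function with the parameter vector only, so everything else has to be bound beforehand with `functools.partial`. Binding only `sps` reproduces the `AttributeError` from the first traceback, and the last lines show why a stored array such as result['lnprobability'] cannot be passed instead: it is data, not a callable.

```python
from functools import partial

import numpy as np


class ToyModel:
    """Hypothetical stand-in for a model (flat prior on [0, 1])."""

    def prior_product(self, theta, nested=False):
        theta = np.asarray(theta)
        inside = np.all((theta >= 0.0) & (theta <= 1.0))
        return 0.0 if inside else -np.inf


def toy_lnprobfn(theta, model=None, obs=None, sps=None, nested=False):
    """Hypothetical log-posterior whose helper objects default to None."""
    lnp_prior = model.prior_product(theta, nested=nested)
    if not np.isfinite(lnp_prior):
        return -np.inf
    resid = (np.asarray(theta) - obs["target"]) / obs["sigma"]
    return lnp_prior - 0.5 * np.sum(resid**2)


theta = np.array([0.2, 0.4])
obs = {"target": np.array([0.25, 0.5]), "sigma": 0.1}

# Binding only sps leaves model=None, so the call fails with AttributeError.
incomplete = partial(toy_lnprobfn, sps=None)
try:
    incomplete(theta)
    failed = False
except AttributeError:
    failed = True

# Binding model and obs as well gives a function of theta alone.
complete = partial(toy_lnprobfn, sps=None, model=ToyModel(), obs=obs)
lnp = complete(theta)

# A stored array of log-probabilities is data, not a function.
stored_lnp = np.zeros((4, 10))
is_callable = callable(stored_lnp)
```

In the real script the bound objects would be the `model` and `obs` returned by `reader.results_from`, as in the two lines above.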

bd-j, Jan 11 '22 23:01

Thank you for the input. I have fixed the lnprobfn issue, but I am still getting an error when running emcee:

17:38:33: Start emcee for G11_EAH03_Test_step
        niter: 1200
        nwalkers: 620
number of walkers=620
starting production
/home/emma/miniconda3/envs/fsps-test/lib/python3.9/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in double_scalars
  lnpdiff = f + nlp - state.log_prob[j]

Based on what another user has said in the emcee repository, it seems like one of my parameters is being initialized outside of its prior range. I have asked about this in the emcee issue tracker; however, I have not gotten a response yet. Do you have any idea why this would be happening?

As for restarting emcee, I am going to look into obtaining more time on the cluster. Thank you for your input!

EmmaLitzer, Jan 14 '22 23:01

Hi @EmmaLitzer, sorry, I'm not sure what the issue is. If you are worried about parameters starting outside the prior range, you can try calling model.prior_product(parameter_value) for the starting location of each walker.
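A rough sketch of that check follows. It does not import prospector: `ToyModel` is a hypothetical stand-in that exposes a `prior_product` method, and `walkers_outside_prior` is an illustrative helper name. It assumes that an out-of-range walker shows up as a non-finite prior value and that the chain has shape (nwalkers, niter, ndim), matching the indexing used earlier in the thread.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in: flat prior on [lo, hi] for every parameter."""

    def __init__(self, lo, hi):
        self.lo, self.hi = np.asarray(lo), np.asarray(hi)

    def prior_product(self, theta, nested=False):
        theta = np.asarray(theta)
        inside = np.all((theta >= self.lo) & (theta <= self.hi))
        return 0.0 if inside else -np.inf


def walkers_outside_prior(model, positions):
    """Return indices of walkers whose log-prior is not finite."""
    lnp = np.array([model.prior_product(p) for p in positions])
    return np.flatnonzero(~np.isfinite(lnp))


# Toy chain with shape (nwalkers, niter, ndim).
chain = np.full((3, 5, 2), 0.5)
chain[1, -1, 0] = 2.0  # push one walker's final position out of bounds
initial = chain[:, -1, :]  # last iteration of every walker

bad = walkers_outside_prior(ToyModel([0.0, 0.0], [1.0, 1.0]), initial)
print(bad)
```

With the real objects, `model` and `result['chain'][:, -1, :]` would take the place of the toy model and toy chain; any index reported is a walker worth inspecting before restarting.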

bd-j, Jan 23 '22 20:01

Hi Ben,

Thank you for helping me with this issue. I did find something that allows me to restart the run from the previous run's best-fit parameters. I am testing whether this is a viable option for "restarting" the sampling. If it is, I think this will work well enough for my needs.

What I did: in run_emcee in fitting.py, I changed q from model.theta to the best-fit theta array from the previous run.

So far it has given me better chi2 fits than a single sampler run, but the logmass values are converging far from the SDSS values and from the logmass values previously calculated for my sample (example: known logmass ~10.4, prospector logmass ~8.75). I am trying to figure out where that issue is coming from.

EmmaLitzer, Mar 07 '22 23:03

I wonder if there's a units issue in the restarts?

bd-j, Mar 08 '22 15:03