
Assess adequacy of burn-in period for ARGweaver inference

Open hyanwong opened this issue 8 years ago • 2 comments

I've just found out that we don't need to weight the ARGweaver outputs by likelihood. But we probably do need to run ARGweaver for a burn-in period before outputting the MCMC samples. The way to do this is probably to run arg-sample twice: once for the burn-in, which outputs only a single .smc file at the end of (say) 2000 iterations, then again to do the sampling, using the --arg argument to pick up the .smc file from the end of the burn-in run.
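The two-stage run could be sketched roughly as below. This only builds the two command lines and does not run them. Only `--arg` comes from the description above. The other flags (`--sites`, `--output`, `--iters`, `--sample-step`), the `<prefix>.<iteration>.smc.gz` naming of the final burn-in file, and the helper name `burn_in_then_sample_cmds` are assumptions that should be checked against `arg-sample --help` for the installed ARGweaver version.

```python
def burn_in_then_sample_cmds(sites_file, out_prefix, burn_iters=2000,
                             sample_iters=1000, sample_step=10):
    """Build (burn_cmd, sample_cmd) argument lists for a two-stage arg-sample run.

    Hypothetical helper: flag names other than --arg are assumptions.
    """
    burn_prefix = f"{out_prefix}_burn"
    burn_cmd = [
        "arg-sample",
        "--sites", str(sites_file),
        "--output", burn_prefix,
        "--iters", str(burn_iters),
        # Assumption: a sample step equal to the run length means no
        # intermediate samples are written during the burn-in.
        "--sample-step", str(burn_iters),
    ]
    # Assumption: the final ARG is written as <prefix>.<iteration>.smc.gz
    burn_smc = f"{burn_prefix}.{burn_iters}.smc.gz"
    sample_cmd = [
        "arg-sample",
        "--sites", str(sites_file),
        "--output", str(out_prefix),
        "--arg", burn_smc,  # start sampling from the end of the burn-in
        "--iters", str(sample_iters),
        "--sample-step", str(sample_step),
    ]
    return burn_cmd, sample_cmd


burn_cmd, sample_cmd = burn_in_then_sample_cmds("data.sites", "run1")
print(" ".join(burn_cmd))
print(" ".join(sample_cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` in turn, so that the sampling run only starts if the burn-in run succeeded.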

We also need a simple way to view the .stats files from each run, to check that the average likelihoods are asymptoting to some sort of steady state. It might also be good to plot the ARG metric against iteration number, to compare how that asymptotes over the burn-in period.
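As a rough first check before any plotting, something like the following could read a .stats file and compare the mean likelihood over the last stretch of iterations with the stretch just before it. It assumes the .stats file is tab-separated with a header row containing a `likelihood` column, which should be verified against a real file. The function names and the 25% window are illustrative choices, not anything defined by ARGweaver.

```python
import csv
import io
from statistics import mean


def likelihood_trace(stats_text, column="likelihood"):
    """Return the chosen column as a list of floats.

    Assumes tab-separated text with a header row (an assumption about
    the .stats format, not confirmed here).
    """
    reader = csv.DictReader(io.StringIO(stats_text), delimiter="\t")
    return [float(row[column]) for row in reader]


def plateau_gap(trace, window_fraction=0.25):
    """Mean of the final window minus the mean of the window before it.

    A gap close to zero suggests the trace has levelled off; a large
    positive gap suggests the likelihood is still climbing.
    """
    n = max(1, int(len(trace) * window_fraction))
    if len(trace) < 2 * n:
        raise ValueError("trace too short to compare two windows")
    return mean(trace[-n:]) - mean(trace[-2 * n:-n])


# Made-up numbers purely for illustration, not real ARGweaver output.
example = "iter\tlikelihood\n" + "\n".join(
    f"{i}\t{v}" for i, v in enumerate([-100, -50, -20, -10, -10, -10, -10, -10])
)
trace = likelihood_trace(example)
print(plateau_gap(trace))
```

For a real run, the text would come from something like `open("XXX_burn.stats").read()`. The same trace could be passed to a plotting library to inspect the curve by eye, which is likely to be more informative than a single number.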

hyanwong, Jan 16 '17 18:01

Sounds good to me.

jeromekelleher, Jan 16 '17 19:01

Burn-in is now implemented, but not the ability to view the stats from the burn-in period, so I'm leaving this open. Still to discuss is whether we want to save output files from this period, at the risk of filling the disk with extra cruft. We already save the XXX_burn.stats file, which gives the likelihoods over time, so outputting the .smc burn-in files would only give us the extra ability to look at the metric over the burn-in period. I'm not sure this is worth it.

hyanwong, Jan 17 '17 08:01