pyhsmm
Using particle Gibbs
I just found pyhsmm last week. First a minor issue:
It looks like `future` is not installed by default in Anaconda's Python distribution, so `pip install pyhsmm` fails unless I first install `future` manually.
Second, I was looking at the paper "The Infinite Hidden Markov Model" by Beal et al. The first example they give is the string `30*'ABCDEFEDCB'`, and they remark that the most parsimonious HMM modeling it has 10 states. I'm assuming that `WeakLimitHDPHMM` will essentially run their algorithm (of course I have to give a maximum number of states). I defined the following function:
```python
import numpy as np
import pyhsmm
from pyhsmm.util.text import progprint_xrange

def trainHDPHMM(data, Nmax, iterations=100, alpha_0=1.0, alpha_size=None):
    if alpha_size is None:
        alpha_size = 1 + max(data)
    print("Alphabet size=%d" % alpha_size)
    obs_hyperparams = {
        'alpha_0': alpha_0,
        'K': alpha_size,
    }
    obs_distns = [pyhsmm.distributions.Categorical(**obs_hyperparams)
                  for state in range(Nmax)]
    posteriormodel = pyhsmm.models.WeakLimitHDPHMM(
        alpha_a_0=1., alpha_b_0=0.25,
        gamma_a_0=1., gamma_b_0=0.25,
        init_state_concentration=1.,
        obs_distns=obs_distns)
    posteriormodel.add_data(data)
    for idx in progprint_xrange(iterations):
        posteriormodel.resample_model()
    return posteriormodel
```
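To check whether the sampler actually finds the ten-state solution from the paper, one can count the distinct labels in the sampled state sequence. I believe pyhsmm exposes the sequence as `model.stateseqs[0]` after `add_data`, but the counting itself is plain NumPy and works on any integer array:

```python
import numpy as np

def num_used_states(stateseq):
    """Count how many distinct hidden states a sampled sequence uses."""
    return len(np.unique(stateseq))

# Toy hand-made sequence, standing in for model.stateseqs[0]:
example_seq = np.array([0, 1, 2, 1, 0, 3, 3, 2])
print(num_used_states(example_seq))  # -> 4
```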
When I call it (note: under Python 3, `np.array(map(...))` yields a 0-d object array, so I use a list comprehension):

```python
model = trainHDPHMM(np.array([ord(c) - ord('A') for c in 30*'ABCDEFEDCB']), 40)
```
it sometimes (depending on the random seed) gets a traceback like the following, even though I've done `np.seterr(divide='ignore')`:
File "
I see that this issue was brought up in https://github.com/numpy/numpy/issues/5851 over two years ago, but it still doesn't seem to have been fixed (I'm running numpy version 1.13.1). Is there a workaround (or fix) for this?
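One thing worth trying (an assumption on my part, not a confirmed fix for the linked numpy issue) is wrapping the offending call in `np.errstate`, which scopes the error settings to a block and also silences `invalid` in addition to `divide`, since log-probability code often triggers both:

```python
import numpy as np

# Hypothetical workaround: np.errstate scopes floating-point error handling
# to this block, covering both divide-by-zero and invalid-value cases that
# a bare np.seterr(divide='ignore') may not.
with np.errstate(divide='ignore', invalid='ignore'):
    # log of a zero probability yields -inf without raising
    log_prob = np.log(np.array([0.0, 0.5]))

print(log_prob)  # -inf for the zero entry, log(0.5) for the other
```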
But now to what I'm really interested in:
In the paper "Particle Gibbs for Infinite Hidden Markov Models" by Tripuraneni et al. (NIPS 2015), the authors describe a method that uses particle Gibbs to resample the state trajectory, which, they report, greatly speeds up apparent convergence. On one model I'm running I find the fit is still improving, very gradually, even after 20,000 sampling iterations, so I'm interested in trying it. What changes would I need to make (and, more importantly, where) in pyhsmm and/or pybasicbayes to add particle Gibbs as an option?
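For concreteness, here is a sketch of the core conditional-SMC update for a *finite* HMM with known transition matrix `A` and emission matrix `B` (bootstrap proposal, no ancestor sampling). This is generic illustration code with names of my own choosing, not pyhsmm's API or the paper's exact algorithm for the infinite case:

```python
import numpy as np

def particle_gibbs_step(y, x_ref, pi0, A, B, N=10, rng=None):
    """One conditional-SMC (particle Gibbs) update of an HMM state trajectory.

    y     : (T,) observed symbols
    x_ref : (T,) reference trajectory retained from the previous sweep
    pi0   : (K,) initial state distribution
    A     : (K, K) transition matrix, A[i, j] = p(x_t = j | x_{t-1} = i)
    B     : (K, M) emission matrix,  B[i, v] = p(y_t = v | x_t = i)
    """
    rng = np.random.default_rng() if rng is None else rng
    T, K = len(y), len(pi0)
    particles = np.empty((T, N), dtype=int)
    ancestors = np.empty((T, N), dtype=int)

    # t = 0: free particles from the prior; last slot pinned to the reference
    particles[0, :-1] = rng.choice(K, size=N - 1, p=pi0)
    particles[0, -1] = x_ref[0]
    w = B[particles[0], y[0]]  # bootstrap weights = emission likelihoods
    for t in range(1, T):
        w = w / w.sum()
        # multinomial resampling for the free particles; the reference
        # particle keeps itself as its own ancestor (basic PG)
        ancestors[t, :-1] = rng.choice(N, size=N - 1, p=w)
        ancestors[t, -1] = N - 1
        prev = particles[t - 1, ancestors[t]]
        for n in range(N - 1):
            particles[t, n] = rng.choice(K, p=A[prev[n]])
        particles[t, -1] = x_ref[t]
        w = B[particles[t], y[t]]

    # draw one trajectory by final weight and trace its ancestry back
    w = w / w.sum()
    n = rng.choice(N, p=w)
    x_new = np.empty(T, dtype=int)
    for t in range(T - 1, -1, -1):
        x_new[t] = particles[t, n]
        if t > 0:
            n = ancestors[t, n]
    return x_new
```

If I understand the code layout correctly, this would replace (or sit alongside) the forward-filter/backward-sample step in the HMM states machinery, which I believe lives in `pyhsmm/internals/hmm_states.py`, with the Dirichlet/observation resampling in pybasicbayes left unchanged; I'd appreciate confirmation of where it would best hook in.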