pomegranate
HMM state probabilities
Hi!
Is there a way to find the state probabilities for each step in the sequence?
Thanks and good day.
Hi, if you are talking about an HMM, you can use predict_proba(), and each component of the returned vector corresponds to a state id. This uses the forward-backward algorithm. If you want to use only the forward algorithm, you can use hmm_model.forward().
Thanks. I tried predict_proba(), but when I took a look at the selected states they don't necessarily correspond to the highest probabilities (though most do). What am I missing?
Thanks.
I'm not sure if you're saying that the selected states from model.predict don't match the highest probabilities from model.predict_proba, or if the highest probability states in model.predict_proba don't match the highest probability states in model.forward, so I'll answer both.
First, the forward algorithm begins at the start of the sequence, aligning observations to states in the model. Each probability in the returned matrix is the probability of starting at the beginning of the sequence and aligning observations to any state in the model, over any path through the model, to eventually align this observation to this state. The backward algorithm works much the same way, except it begins by aligning the final observation to the end state and goes backwards from there. The forward_backward algorithm, wrapped by predict_proba, combines these probabilities and then normalizes them per observation. It's basically saying, "given all paths aligning observations to states up until this point, and all paths aligning observations to states after this point, what state is most likely for this observation?" It has information that the forward algorithm does not have access to.
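To make that concrete, here is a minimal NumPy sketch of the forward, backward, and combined computations for a small discrete HMM. It is not pomegranate's implementation: the function name forward_backward and the toy parameters are made up for illustration, there are no explicit start and end states, and it works in plain probabilities rather than log space.

```python
import numpy as np

def forward_backward(start, trans, emit, obs):
    """Return forward, backward, and per-observation posterior matrices.

    start: (n_states,) initial state probabilities
    trans: (n_states, n_states) trans[i, j] = P(next state j | state i)
    emit:  (n_states, n_symbols) emit[i, k] = P(symbol k | state i)
    obs:   sequence of observed symbol indices
    """
    n_obs, n_states = len(obs), len(start)
    fwd = np.zeros((n_obs, n_states))
    bwd = np.ones((n_obs, n_states))  # no explicit end state (assumption)

    # Forward: sum over every path that ends in each state at time t.
    fwd[0] = start * emit[:, obs[0]]
    for t in range(1, n_obs):
        fwd[t] = (fwd[t - 1] @ trans) * emit[:, obs[t]]

    # Backward: sum over every path that leaves each state after time t.
    for t in range(n_obs - 2, -1, -1):
        bwd[t] = trans @ (emit[:, obs[t + 1]] * bwd[t + 1])

    # Combine and normalize per observation.
    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return fwd, bwd, post

# Toy two-state, three-symbol model (hypothetical numbers).
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1],
                 [0.1, 0.3, 0.6]])
obs = [0, 1, 2]

fwd, bwd, post = forward_backward(start, trans, emit, obs)
print(post)  # one row per observation, one column per state
```

Each row of post sums to 1. Only the last row can be obtained from the forward matrix alone; every earlier row also depends on the observations that come after it.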
Second, the algorithm in model.predict is the Viterbi algorithm, which returns the single most likely path through the model, whereas model.predict_proba returns probabilities from the forward-backward algorithm.
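A small worked example may help show why the two can disagree. The sketch below is plain NumPy rather than pomegranate code; the function names viterbi and posterior and the two-state model are hypothetical, chosen so the disagreement is easy to check by hand. The emissions are uninformative, so only the start and transition probabilities matter.

```python
import numpy as np

def viterbi(start, trans, emit, obs):
    """Most likely single state path (plain probabilities, short sequences only)."""
    n_obs, n_states = len(obs), len(start)
    delta = np.zeros((n_obs, n_states))
    back = np.zeros((n_obs, n_states), dtype=int)
    delta[0] = start * emit[:, obs[0]]
    for t in range(1, n_obs):
        scores = delta[t - 1][:, None] * trans  # scores[i, j]: best path into i, then i -> j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * emit[:, obs[t]]
    # Trace back from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(n_obs - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def posterior(start, trans, emit, obs):
    """Per-observation state probabilities via forward-backward."""
    n_obs, n_states = len(obs), len(start)
    fwd = np.zeros((n_obs, n_states))
    bwd = np.ones((n_obs, n_states))
    fwd[0] = start * emit[:, obs[0]]
    for t in range(1, n_obs):
        fwd[t] = (fwd[t - 1] @ trans) * emit[:, obs[t]]
    for t in range(n_obs - 2, -1, -1):
        bwd[t] = trans @ (emit[:, obs[t + 1]] * bwd[t + 1])
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical model: state 0 never leaves, state 1 moves to either state equally.
start = np.array([0.4, 0.6])
trans = np.array([[1.0, 0.0],
                  [0.5, 0.5]])
emit = np.array([[0.5, 0.5],
                 [0.5, 0.5]])  # uninformative emissions
obs = [0, 1]

viterbi_path = viterbi(start, trans, emit, obs)
post = posterior(start, trans, emit, obs)
posterior_path = post.argmax(axis=1).tolist()

print(viterbi_path)    # [0, 0]
print(posterior_path)  # [1, 0]
```

Ignoring the constant emission factor, the path 0 -> 0 has probability 0.4, while 1 -> 0 and 1 -> 1 have 0.3 each, so Viterbi picks 0 -> 0. Summed over all paths, though, state 1 is the more likely first state (0.6 versus 0.4), so the per-observation maximum at the first step is state 1.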
Thanks! It is the first: the selected states from model.predict don't match the highest probabilities from model.predict_proba. I don't suppose there's a function similar to model.predict_proba for model.predict that outputs state probabilities for the Viterbi algorithm?
Thank you for opening an issue. pomegranate has recently been rewritten from the ground up to use PyTorch instead of Cython (v1.0.0), and so all issues are being closed as they are likely out of date. Please re-open or start a new issue if a related issue is still present in the new codebase.