factorial_hmm

EM on multiple sequences of observed states

Open lucach opened this issue 5 years ago • 2 comments

It would be nice to have the EM method of FullDiscreteFactorialHMM also support multiple sequences of observed states. For instance, the fit method of hmmlearn allows parameter estimation from multiple sequences for basic (i.e., non-factorial) HMMs (see https://github.com/hmmlearn/hmmlearn/blob/master/lib/hmmlearn/base.py#L440-L488).

This probably requires splitting MStep into two parts: the first (the equivalent of hmmlearn's _accumulate_sufficient_statistics) would collect statistics about the parameters and would need to be called on every sequence of observed states; the second (the equivalent of hmmlearn's _do_mstep) would normalize the parameters and would need to be called only once per iteration.
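To make the proposed split concrete, here is a minimal sketch for a plain (non-factorial) discrete HMM. It assumes the per-sequence posteriors `gamma` and `xi` have already been produced by an E-step. All function names below are hypothetical; they are not part of factorial_hmm or hmmlearn, and the real MStep of FullDiscreteFactorialHMM would need the same idea applied to its own parameter arrays.

```python
import numpy as np


def init_stats(n_states, n_symbols):
    """Return zeroed accumulators for one EM iteration (hypothetical helper)."""
    return {
        "initial": np.zeros(n_states),
        "transition": np.zeros((n_states, n_states)),
        "emission": np.zeros((n_states, n_symbols)),
    }


def accumulate_statistics(stats, observations, gamma, xi):
    """Add one sequence's expected counts to the accumulators.

    observations: integer symbols, length T
    gamma[t, i]:  P(state_t = i | sequence), shape (T, n_states)
    xi[t, i, j]:  P(state_t = i, state_{t+1} = j | sequence),
                  shape (T - 1, n_states, n_states)
    """
    stats["initial"] += gamma[0]
    stats["transition"] += xi.sum(axis=0)
    for t, symbol in enumerate(observations):
        stats["emission"][:, symbol] += gamma[t]
    return stats


def normalize_parameters(stats):
    """Turn accumulated counts into probabilities; call once per iteration.

    Assumes every state has nonzero expected counts; otherwise the
    corresponding row would divide by zero.
    """
    initial = stats["initial"] / stats["initial"].sum()
    transition = stats["transition"] / stats["transition"].sum(axis=1, keepdims=True)
    emission = stats["emission"] / stats["emission"].sum(axis=1, keepdims=True)
    return initial, transition, emission


def m_step_over_sequences(sequences, posteriors, n_states, n_symbols):
    """Accumulate over every sequence, then normalize a single time."""
    stats = init_stats(n_states, n_symbols)
    for observations, (gamma, xi) in zip(sequences, posteriors):
        accumulate_statistics(stats, observations, gamma, xi)
    return normalize_parameters(stats)
```

The point of the design is that only the accumulation depends on an individual sequence, so sequences of different lengths can be handled without concatenating them or averaging separately fitted parameters.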

lucach, Jun 20 '19 16:06

Thanks for your suggestion. The FullDiscreteFactorialHMM is actually not useful for most practical purposes and exists mainly as a simple example. In almost all use cases, one would implement an HMM where the parametrization is not completely general. In such cases, it would be straightforward to use the EStep and MStep for multiple sequences as you described.

regevs, Jun 20 '19 18:06

Sure, I was just pointing out that, for instance, the MStep in its current form does not directly support multiple sequences. I've used your library in an academic toy example, making the necessary modifications to support them (see https://github.com/lucach/fhmm_bach/blob/master/factorial_hmm_lib.py#L568-L694).

(Unrelated: I think you could also add a copyright notice at the top of each file; it would highlight the credit you deserve :) )

lucach, Jun 24 '19 12:06