Issue/838 hmm
Submission Checklist
- [x] Builds locally
- [x] New functions marked with `<<{ since VERSION }>>`
- [x] Declare copyright holder and open-source license: see below
Summary
Addresses issue #838 and updates user-doc on HMMs.
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Charles Margossian, Simons Foundation
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC BY-ND 4.0 (https://creativecommons.org/licenses/by-nd/4.0/)
I took a look through this and while I'm in favor of minimal examples, this one's a bit too minimal. I would really like to see the model laid out in more detail than the conditional distributions. Here are some concrete suggestions:
- Use `z` for latent discrete parameters. We already use `x` for covariates, so it's confusing to use `x` here. I've used `z` elsewhere in the user manual for this, in the latent discrete parameter chapter.
- Define the complete data likelihood $p(y, z \mid \phi) = p(z \mid \phi) \cdot p(y \mid z, \phi)$ and then the marginalization $p(y \mid \phi)$ that we actually fit.
- Mention that $\phi$ is actually several parameters.
- I find the constrained version of the transition matrix where you insert zeros makes this first example too challenging. If you want to discuss that case, I'd suggest another section after this one where you talk about imposing structural zeros. Otherwise it makes the simple case seem too complicated. And then you can just define:
    array[3] simplex[3] gamma_arr;
    matrix[3, 3] gamma;
    // each simplex is a column vector, so transpose it into a row of gamma
    for (n in 1:3) gamma[n] = gamma_arr[n]';
- For the doc, those `mu` and `sigma` are not just the measurement model; that's all the error terms.
- Wherever you have repetition, use loops. It's less error prone and makes it clearer that it's a homogeneous operation:
    for (n in 1:N) {
      for (k in 1:3) {
        log_omega[k, n] = normal_lpdf(y[n] | mu[k], sigma);
      }
    }
- Given that you're tying the parameter `sigma` across outputs, you need to mention that. I'd recommend just keeping this simple with a vector of `sigma` values.
- You can't just say "computes the relevant log marginal distribution"; you have to say what that is. I don't mean including the marginalization algorithm, I just mean as I wrote it above.
- "For more details, since" -> "For more details, see". Also, I wouldn't say "corresponding case study", I'd just say it's a case study on HMMs. And then you should put it in the bibtex file and cite it properly with reference to Ben (the author). And you should cite that you "borrowed" the example from Ben Bales's case study.
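For reference, the quantities in the second item can be written out as follows. This is only a sketch under assumed notation that is not taken from the PR: $N$ observations, $K$ hidden states, an initial state distribution $\rho$, a transition matrix $\Gamma$, and normal emissions with means $\mu_k$ and a shared scale $\sigma$, so that $\phi = (\rho, \Gamma, \mu, \sigma)$.

```latex
\begin{aligned}
p(z \mid \phi) &= \rho_{z_1} \prod_{n=2}^{N} \Gamma_{z_{n-1},\, z_n}, \\
p(y \mid z, \phi) &= \prod_{n=1}^{N} \textrm{normal}(y_n \mid \mu_{z_n}, \sigma), \\
p(y \mid \phi) &= \sum_{z \in \{1, \ldots, K\}^N} p(z \mid \phi) \cdot p(y \mid z, \phi).
\end{aligned}
```

The sum ranges over $K^N$ state sequences, which is why it is evaluated with the forward algorithm rather than by enumeration.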
I'm rewriting the code to make the transition matrix less constrained (per comment 4) and I wanted to check: we don't have a stochastic matrix type, right?
We have row- and column-stochastic matrix types, but not a doubly stochastic one yet: https://mc-stan.org/docs/reference-manual/types.html#stochastic-matrices
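As a rough sketch of what the row-stochastic type from that page would look like for the transition matrix (assuming a Stan release recent enough to include it; the name `gamma` just follows the snippet in the review above):

```stan
parameters {
  // Each row of gamma is constrained to be a simplex, so gamma[i, j]
  // can be read as the probability of moving from state i to state j.
  row_stochastic_matrix[3, 3] gamma;
}
```

With this declaration there would be no need for the separate array of simplexes and the copy loop.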
@bob-carpenter I implemented your feedback.
Some questions:
- What exactly is the difference between a measurement model and an error model? I'm OK with using either term, and I know they have different conceptual implications, but I'm wondering whether they have formal definitions.
- I'm keeping `sigma` a scalar. I'm not sure it's conceptually simpler to pass it as a vector.
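For context, here is a minimal sketch of how the pieces discussed in this thread could fit together with a scalar `sigma`. It is not the code from the PR: the data block, the initial distribution `rho`, and the unconstrained `vector[3] mu` are assumptions made for illustration, and priors are omitted.

```stan
data {
  int<lower=1> N;        // number of observations (assumed name)
  vector[N] y;           // observed series (assumed name)
}
parameters {
  array[3] simplex[3] gamma_arr;  // rows of the transition matrix
  simplex[3] rho;                 // initial state distribution (assumed)
  vector[3] mu;                   // one location per hidden state
  real<lower=0> sigma;            // scale shared across all states
}
transformed parameters {
  matrix[3, 3] gamma;
  matrix[3, N] log_omega;
  for (k in 1:3) {
    gamma[k] = gamma_arr[k]';     // transpose each simplex into a row
  }
  for (n in 1:N) {
    for (k in 1:3) {
      log_omega[k, n] = normal_lpdf(y[n] | mu[k], sigma);
    }
  }
}
model {
  // Adds log p(y | phi), with the hidden states marginalized out.
  target += hmm_marginal(log_omega, gamma, rho);
}
```

Because `sigma` is a single scalar here, all three states share the same observation scale; switching to a vector would only change the declaration and the index inside `normal_lpdf`.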
For (5), I think the idea's that there are three sources of error: measurement error, modeling error, and sampling error. For example, sampling error arises when you subsample a population and use that for estimation. You get modeling error if you use a linear regression for a relationship that's not linear or use normal errors when the errors are skewed, and so on. If you're weighing things with a scale and you know the scale's biased to the high side, you can correct that measurement error. You can explicitly add a measurement error model if you know your measurement model (e.g., gravitational lensing is part of the measurement error model; your work with Bruno et al. on deconvolving galactic dust is part of the measurement model for the CMB, etc.).
@bob-carpenter I completely missed your review. Sorry about that. Hopefully we'll merge this in soon.