Confirmatory Factor Analysis and Structural Equation Models
Notebook proposal
Title: Confirmatory Factor Analysis and Structural Equation Models
Why should this notebook be added to pymc-examples?
This fills a gap in our coverage of CFA and SEM models, highlighting in particular their role in the analysis of psychometric survey data. It's super interesting and ties into Judea Pearl-style causal inference on DAGs.
DATA
Basic CFA example in PyMC
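For orientation, the model in the code that follows can be written out roughly as below. The notation is my own shorthand rather than anything fixed by the proposal: i indexes respondents, j indexes the five indicators, and k(j) is the latent factor that indicator j loads on (Student for PI, AD, IGC; Faculty for FI, FC). The first loading within each factor is pinned to 1 for identification, and psi_j is a standard deviation, since the second positional argument of pm.Normal is sigma.

```latex
\begin{aligned}
L &\sim \mathrm{LKJCholeskyCov}(\eta = 2,\ \mathrm{sd} \sim \mathrm{Exponential}(1)), \qquad \Sigma = L L^\top, \\
\xi_i &\sim \mathrm{MvNormal}(\mathbf{0}, \Sigma), \\
\tau_j &\sim \mathrm{Normal}(3, 10), \qquad \lambda_j \sim \mathrm{Normal}(1, 10), \qquad \psi_j \sim \mathrm{InverseGamma}(5, 10), \\
\lambda_{\mathrm{PI}} &= \lambda_{\mathrm{FI}} = 1, \\
y_{ij} &\sim \mathrm{Normal}\left(\tau_j + \lambda_j\, \xi_{i,k(j)},\ \psi_j\right).
\end{aligned}
```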
import pymc as pm
import pytensor.tensor as pt

# df_p is the survey DataFrame from the DATA section (not shown here); its
# columns are assumed to be ordered as in coords['indicators'].
coords = {
    'obs': list(range(len(df_p))),
    'indicators': ['PI', 'AD', 'IGC', 'FI', 'FC'],
    'indicators_1': ['PI', 'AD', 'IGC'],
    'indicators_2': ['FI', 'FC'],
    'latent': ['Student', 'Faculty'],
}
obs_idx = list(range(len(df_p)))

with pm.Model(coords=coords) as model:
    # Residual standard deviation of each indicator
    Psi = pm.InverseGamma('Psi', 5, 10, dims='indicators')

    # Factor loadings; the first loading of each factor is fixed to 1
    # so that the scale of the latent variable is identified
    lambdas_ = pm.Normal('lambdas_1', 1, 10, dims='indicators_1')
    lambdas_1 = pm.Deterministic('lambdas1', pt.set_subtensor(lambdas_[0], 1), dims='indicators_1')
    lambdas_ = pm.Normal('lambdas_2', 1, 10, dims='indicators_2')
    lambdas_2 = pm.Deterministic('lambdas2', pt.set_subtensor(lambdas_[0], 1), dims='indicators_2')

    # Indicator intercepts
    tau = pm.Normal('tau', 3, 10, dims='indicators')

    # Correlated latent factors with zero mean
    kappa = 0
    sd_dist = pm.Exponential.dist(1.0, shape=2)
    chol, _, _ = pm.LKJCholeskyCov('chol_cov', n=2, eta=2,
                                   sd_dist=sd_dist, compute_corr=True)
    ksi = pm.MvNormal('ksi', kappa, chol=chol, dims=('obs', 'latent'))

    # Measurement model: each indicator loads on exactly one factor
    m1 = tau[0] + ksi[obs_idx, 0] * lambdas_1[0]
    m2 = tau[1] + ksi[obs_idx, 0] * lambdas_1[1]
    m3 = tau[2] + ksi[obs_idx, 0] * lambdas_1[2]
    m4 = tau[3] + ksi[obs_idx, 1] * lambdas_2[0]
    m5 = tau[4] + ksi[obs_idx, 1] * lambdas_2[1]

    mu = pm.Deterministic('mu', pm.math.stack([m1, m2, m3, m4, m5]).T)
    _ = pm.Normal('likelihood', mu, Psi, observed=df_p.values)

    # nuts_sampler='numpyro' requires numpyro to be installed
    idata = pm.sample(nuts_sampler='numpyro', target_accept=.95,
                      idata_kwargs={"log_likelihood": True})
    idata.extend(pm.sample_posterior_predictive(idata))
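To make the structure of the measurement model concrete, here is a minimal NumPy sketch of the data-generating process the PyMC model above assumes. Every numeric value in it (sample size, loadings, intercepts, residual scales, factor covariance) is invented purely for illustration and is not taken from the survey data; only the indicator names and the pattern of which indicator loads on which factor come from the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500  # hypothetical sample size

# Loading matrix (rows: PI, AD, IGC, FI, FC; columns: Student, Faculty).
# The first loading of each factor is fixed to 1, as in the PyMC model;
# the remaining values are made up for illustration.
Lambda = np.array([
    [1.0, 0.0],
    [0.8, 0.0],
    [1.2, 0.0],
    [0.0, 1.0],
    [0.0, 0.9],
])
tau = np.array([3.0, 3.2, 2.8, 3.5, 3.1])   # hypothetical intercepts
psi = np.array([0.5, 0.6, 0.4, 0.5, 0.7])   # hypothetical residual sds
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.8]])              # hypothetical factor covariance

# Latent factor scores, one row per respondent
ksi = rng.multivariate_normal(np.zeros(2), Sigma, size=n_obs)

# Vectorized equivalent of m1..m5: mu[i, j] = tau[j] + sum_k ksi[i, k] * Lambda[j, k]
mu = tau + ksi @ Lambda.T

# Observed indicators: mean plus independent Gaussian noise per indicator
y = mu + rng.normal(0.0, psi, size=mu.shape)

# Covariance of the indicators implied by the model parameters
implied_cov = Lambda @ Sigma @ Lambda.T + np.diag(psi**2)
```

Because each row of the loading matrix has a single non-zero entry, the matrix product reproduces the five per-indicator expressions m1 to m5 exactly. The implied covariance is the quantity a CFA is ultimately trying to match against the sample covariance of the survey items.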
Suggested categories:
- Level: Intermediate.
Related notebooks
Perhaps this one: https://www.pymc.io/projects/examples/en/latest/case_studies/factor_analysis.html But it seems to treat factor analysis more as a machine learning feature-reduction technique than as a means of analysis in its own right, as in the psychometrics use case.
References
Will likely adapt a (WIP) blog post I'm working on here: https://nathanielf.github.io/posts/post-with-code/CFA_AND_SEM/CFA_AND_SEM.html
The original work references the book Bayesian Psychometric Modeling by Levy and Mislevy.