StructuralEquationModels.jl
Variables/Parameters/Observations terminology & API Cleanup
As a part of #193 I have already made some changes, so I wanted to get feedback from the maintainers on them. Plus, there are a few other changes in the same direction that I could integrate into #193, so I wanted to mention them here too.
- Parameters. Sometimes they are called parameters, sometimes identifiers (in the `ParTable`).
  I propose to change it into `param` (intuitively understandable, but still short):
  - `param` in the `ParTable`
  - `params()` to get the vector of parameters
  - `nparams()` to get the number of parameters (called `n_par()` now)
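To make the proposed parameter accessors concrete, here is a minimal sketch of how `params()` and `nparams()` could behave. `ToyParTable`, its fields, and the `:const` marker are hypothetical stand-ins made up for illustration; they are not the package's actual `ParTable` layout.

```julia
# Hypothetical stand-in for a parameter table (NOT the package's ParTable).
struct ToyParTable
    from::Vector{Symbol}
    to::Vector{Symbol}
    param::Vector{Symbol}   # parameter label per row; :const marks a fixed value (assumption)
end

# Unique free-parameter labels, in order of first appearance.
params(pt::ToyParTable) = unique(filter(!=(:const), pt.param))

# Number of free parameters (the role n_par() plays now).
nparams(pt::ToyParTable) = length(params(pt))

pt = ToyParTable(
    [:f1, :f1, :f1, :x1],
    [:x1, :x2, :x3, :x1],
    [:const, :lambda2, :lambda3, :theta1],
)

println(params(pt))
println(nparams(pt))
```

The point of the sketch is only the naming: the column, the vector accessor, and the count all share the `param` stem.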
- Variables. Sometimes called vars, sometimes colnames, sometimes nodes.
  Observed variables are sometimes called observed, sometimes manifested.
  I propose to consolidate into vars (short, but intuitive), which could be observed (more intuitive than manifested) or latent:
  - `vars()` to get the vector of variables from `ParTable`, `RAMMatrices` (matching the order of `A` columns)
  - `nvars()` to get the number of variables
  - `observed_vars()` to get the observed variables matching the order of rows/cols in `obs_cov` and rows of `RAMMatrices.F`.
    Alternatively, it could be `obs_vars()`, which would match `obs_cov()` and `obs_mean()` (if `observed_vars` is chosen, then `obs_cov` also needs to be renamed to `observed_cov` for consistency).
  - `nobserved_vars()` to get the number of observed vars (replaces `n_man`, which in this short form is a little bit confusing).
  - `latent_var_indices()`/`observed_var_indices()` to get the indices of `vars()` that match the latent/observed variables (the i-th index of `observed_var_indices()` is for the i-th variable of `observed_vars()`)
  - `latent_vars()` is a shortcut to `vars()[latent_var_indices()]`
  - Also, in case of missing data, I propose to use measured/missing terms (now it uses observed/missing, but observed clashes with observed/latent), and `nmeasured_vars()`/`nmissing_vars()` to get their counts
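The relationship between `vars()`, `observed_vars()` and the index accessors is easier to see on a toy example. The sketch below assumes a hypothetical `ToyVars` container (not a type from the package) that stores all variables in one order and the observed variables in a possibly different order.

```julia
# Hypothetical container (NOT a package type), used only to illustrate the naming.
struct ToyVars
    all::Vector{Symbol}   # every variable, e.g. in the column order of A
    obs::Vector{Symbol}   # observed variables, e.g. in the row order of F
end

vars(m::ToyVars) = m.all
nvars(m::ToyVars) = length(m.all)
observed_vars(m::ToyVars) = m.obs
nobserved_vars(m::ToyVars) = length(m.obs)

# The i-th index is the position of the i-th observed variable within vars().
observed_var_indices(m::ToyVars) = [findfirst(==(v), m.all) for v in m.obs]

# All remaining positions, in vars() order.
latent_var_indices(m::ToyVars) = setdiff(1:nvars(m), observed_var_indices(m))

# Shortcut, as proposed above.
latent_vars(m::ToyVars) = vars(m)[latent_var_indices(m)]

m = ToyVars([:f1, :x1, :x2, :f2, :x3], [:x2, :x1, :x3])

println(observed_var_indices(m))
println(latent_vars(m))
```

Note that `vars(m)[observed_var_indices(m)]` reproduces `observed_vars(m)` even when the two orderings differ, which is the invariant the index accessors are meant to guarantee.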
- Observations. Also referred to as rows. To disambiguate from `observed_vars`, I propose to refer to them as samples (row is confusing because SEM operates with so many matrices).
  - `samples` to access the individual samples (sometimes referred to as `rows` or `rowwise`).
  - `nsamples()` is the number of samples (`n_obs()` now)
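As an illustration of the samples terminology, together with the measured/missing counts proposed for missing data, here is a small sketch. `ToyData` and the per-sample signatures of `nmeasured_vars`/`nmissing_vars` are assumptions made for this example and may not match how the package would expose them.

```julia
# Hypothetical data container (NOT a package type): one row per sample,
# one column per observed variable.
struct ToyData
    data::Matrix{Union{Missing, Float64}}
end

# Number of samples (the role n_obs() plays now).
nsamples(d::ToyData) = size(d.data, 1)

# Iterator over the individual samples (rows of the data matrix).
samples(d::ToyData) = eachrow(d.data)

# Per-sample counts of measured and missing variables (assumed signatures).
nmeasured_vars(sample) = count(!ismissing, sample)
nmissing_vars(sample) = count(ismissing, sample)

d = ToyData([1.0 2.0 missing; 0.5 missing missing; 3.0 1.0 2.0])

println(nsamples(d))
println([nmeasured_vars(s) for s in samples(d)])
println([nmissing_vars(s) for s in samples(d)])
```

With this naming, "observed" is reserved for the observed/latent distinction between variables, while "measured" describes whether a value is present in a given sample.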
- Relations (between the variables, i.e. `<-` or `<->`). Now the `ParTable` has them in the `param_type` column, which is confusing, because sometimes it is constant.