StructuralEquationModels.jl Variables/Parameters/Observations terminology & API Cleanup

Open alyst opened this issue 2 months ago • 5 comments

As part of #193 I have already made some changes, so I wanted to get feedback from the maintainers. There are also a few other changes in the same direction that I could integrate into #193, so I wanted to mention them here too.

Parameters. Sometimes they are called parameters, sometimes identifiers (in the ParTable). I propose to standardize on param (intuitively understandable, but still short):
- param in the ParTable
- params() to get the vector of parameters
- nparams() to get the number of parameters (called n_par() now)
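
To make the proposed parameter accessors concrete, here is a minimal sketch. `ToyParTable` is a hypothetical stand-in invented for illustration, not the package's actual `ParTable`, and the `Base.@deprecate` line is only one possible way to keep the old `n_par()` name working during a transition.

```julia
# Hypothetical toy container (NOT the package's ParTable): one parameter label per row.
struct ToyParTable
    param::Vector{Symbol}
end

# Unique parameter labels, in order of first appearance.
params(pt::ToyParTable) = unique(pt.param)

# Number of distinct parameters.
nparams(pt::ToyParTable) = length(params(pt))

# Assumption: the old name could be kept as a deprecated alias for a while.
Base.@deprecate n_par(pt::ToyParTable) nparams(pt)

pt = ToyParTable([:lambda1, :lambda2, :theta1, :lambda1])
params(pt)   # [:lambda1, :lambda2, :theta1]
nparams(pt)  # 3
```
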
Variables. Sometimes called vars, sometimes colnames, sometimes nodes. Observed variables are sometimes called observed, sometimes manifested. I propose to consolidate these into vars (short, but intuitive), which can be observed (more intuitive than manifested) or latent:
- vars() to get the vector of variables from ParTable, RAMMatrices (matching the order of A columns)
- nvars() to get the number of variables
- observed_vars() to get the observed variables, matching the order of rows/cols in obs_cov and the rows of RAMMatrices.F. Alternatively, it could be obs_vars(), which would match obs_cov() and obs_mean() (if observed_vars is chosen, then obs_cov also needs to be renamed to observed_cov for consistency).
- nobserved_vars() to get the number of observed vars (replaces n_man, which in this short form is a little bit confusing).
- latent_var_indices()/observed_var_indices() to get the indices of vars() that match the latent/observed variables (the i-th index of observed_var_indices() is for the i-th variable of observed_vars())
- latent_vars() is a shortcut to vars()[latent_var_indices()]
- Also, in the case of missing data, I propose to use the terms measured/missing (the code currently uses observed/missing, but observed clashes with observed/latent), and nmeasured_vars()/nmissing_vars() to get their counts
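
The index relationships described in the list above can be sketched as follows. `ToySpec` and its fields are hypothetical names chosen for illustration; the real `ParTable` and `RAMMatrices` types store this information differently.

```julia
# Hypothetical toy specification; the field layout is an assumption.
struct ToySpec
    vars::Vector{Symbol}           # all variables, e.g. in the order of the columns of A
    observed_vars::Vector{Symbol}  # observed variables, e.g. in the order of the rows of F
end

vars(s::ToySpec) = s.vars
nvars(s::ToySpec) = length(vars(s))
observed_vars(s::ToySpec) = s.observed_vars
nobserved_vars(s::ToySpec) = length(observed_vars(s))

# The i-th index is the position of the i-th observed variable within vars().
observed_var_indices(s::ToySpec) =
    [findfirst(==(v), vars(s)) for v in observed_vars(s)]

# Positions in vars() that are not observed, in vars() order.
latent_var_indices(s::ToySpec) = findall(!in(observed_vars(s)), vars(s))

# Shortcut, as proposed: vars()[latent_var_indices()].
latent_vars(s::ToySpec) = vars(s)[latent_var_indices(s)]

spec = ToySpec([:x1, :f1, :x2, :x3, :f2], [:x3, :x1, :x2])
observed_var_indices(spec)  # [4, 1, 3]
latent_vars(spec)           # [:f1, :f2]
```

The invariant worth preserving under any naming choice is `vars(spec)[observed_var_indices(spec)] == observed_vars(spec)`.
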
Observations. Also referred to as rows. To disambiguate from observed_vars, I propose to refer to them as samples (row is confusing because SEM operates with so many matrices).
- samples to access the individual samples (sometimes referred to as rows or rowwise).
- nsamples() is the number of samples (n_obs() now)
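
As a rough illustration of the samples vocabulary combined with the measured/missing terms, the sketch below uses a plain matrix with one row per sample. The function definitions here are assumptions for illustration only; in particular, counting measured variables per sample is just one possible reading of the proposal.

```julia
# Hypothetical data: one row per sample, one column per observed variable.
data = [1.0 2.0; 3.0 missing; 5.0 6.0]

# Number of samples (replaces the "rows"/"n_obs" vocabulary).
nsamples(data::AbstractMatrix) = size(data, 1)

# Iterator over the individual samples.
samples(data::AbstractMatrix) = eachrow(data)

# Per-sample counts of measured and missing variables.
nmeasured = [count(!ismissing, s) for s in samples(data)]
nmissing = [count(ismissing, s) for s in samples(data)]
```
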
Relations (between the variables, i.e. <- or <->). Currently the ParTable has them in the param_type column, which is confusing, because sometimes it is constant.

Apr 21 '24 03:04 alyst
