"Could not complete thresholds; either specify all thresholds by hand, or remove constraints."

Open marvas74 opened this issue 1 year ago • 17 comments

Hi,

I am trying to have a first go on LCA following https://www.tandfonline.com/doi/full/10.1080/10705511.2023.2250920 .

As variables I have ordered factors with 5 levels:

df <- mxFactor(na.omit(data), levels = c(1, 2, 3, 4, 5))
dim(df)
[1] 196 5

If I do:

res <- mx_lca(data = df, classes = 1:3)

this often works, but not always.

If I do:

set.seed(123)
res <- mx_lca(data = df, classes = 1:3)

I usually get:

Error in update_thresholds(zscore) : Could not complete thresholds; either specify all thresholds by hand, or remove constraints.

With another, similar data set, mx_lca() always fails with the same error, regardless of whether I set the seed or not.

Where could this erratic behaviour stem from, and what could be done about it?

Any help would be much appreciated :-)

marvas74 avatar Dec 29 '23 13:12 marvas74

Could you share a reproducible example?

The error occurs when it's not possible to automatically specify starting values for the thresholds. The straightforward solution is thus to specify starting values for the thresholds by hand. But the bigger question is: why is it not possible to automatically specify starting values here? The model might be too complex for the data. I'd recommend looking at the raw data distribution.
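One way to look at the raw data distribution is to count how often each response category occurs per item. The following is a minimal base R sketch, not taken from tidySEM; the helper `category_counts()` and the example data are hypothetical and only illustrate the idea.

```r
# Hypothetical helper (not part of tidySEM): count responses per category
# for every item, including categories that were never observed.
category_counts <- function(data, levels = 1:5) {
  counts <- sapply(data, function(x) table(factor(x, levels = levels)))
  t(counts)  # one row per item, one column per response category
}

# Illustrative data only: item A never uses categories 1 and 2.
items <- data.frame(
  A = rep(c(3, 4, 5), times = c(2, 5, 13)),
  B = rep(1:5, times = c(1, 1, 3, 5, 10))
)

counts <- category_counts(items)
print(counts)

# Items with at least one empty category.
sparse_items <- rownames(counts)[apply(counts == 0, 1, any)]
print(sparse_items)
```

Items that show up in `sparse_items` have response categories without any observations, which is one plausible reason why thresholds are hard to start automatically.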

cjvanlissa avatar Jan 11 '24 07:01 cjvanlissa

Sometimes this runs, and sometimes I get the error that I reported above. Using set.seed() seems to make it less likely to run, but of course I have not tested this comprehensively.

library(dplyr)   # bind_cols()
library(OpenMx)  # mxFactor()
library(tidySEM) # descriptives(), mx_lca()

# Five simulated 5-point items with heavily skewed response probabilities
df <- bind_cols(
  A = sample(x = 1:5, replace = TRUE, size = 200, prob = c(0.00, 0.00, 0.05, 0.17, 0.78)),
  B = sample(x = 1:5, replace = TRUE, size = 200, prob = c(0.00, 0.02, 0.11, 0.30, 0.57)),
  C = sample(x = 1:5, replace = TRUE, size = 200, prob = c(0.00, 0.04, 0.17, 0.30, 0.49)),
  D = sample(x = 1:5, replace = TRUE, size = 200, prob = c(0.01, 0.02, 0.10, 0.28, 0.59)),
  E = sample(x = 1:5, replace = TRUE, size = 200, prob = c(0.01, 0.01, 0.07, 0.28, 0.65))
)
df <- mxFactor(na.omit(df), levels = c(1, 2, 3, 4, 5))
descriptives(df)

set.seed(42)

res <- mx_lca(data = df, classes = 1:3)

marvas74 avatar Jan 19 '24 11:01 marvas74

It's a problem with automatically determining starting values for the thresholds. You can always manually specify starting values; pending a better programmatic way to determine starting values, there is no fix for this.

cjvanlissa avatar Jan 19 '24 11:01 cjvanlissa

OK, so it needs to be wrapped in try(). Could you give any advice on how to specify starting values?
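A rough sketch of what such a wrapper could look like, using tryCatch() rather than try() so the error message can be reported. The helper `fit_with_retries()` is hypothetical, and `flaky_fit()` is only a stand-in so the sketch runs without tidySEM; in practice it would be something like `function() mx_lca(data = df, classes = 1:3)`.

```r
# Hypothetical helper (not part of tidySEM): call `fit_fun` under successive
# seeds and return the first result that does not raise an error.
fit_with_retries <- function(fit_fun, seeds) {
  for (seed in seeds) {
    set.seed(seed)
    result <- tryCatch(fit_fun(), error = function(e) {
      message("seed ", seed, " failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(result)) {
      return(list(seed = seed, fit = result))
    }
  }
  stop("No seed produced a model without an error.")
}

# Stand-in for the real model call: fails at random about half of the time.
flaky_fit <- function() {
  if (runif(1) < 0.5) stop("Could not complete thresholds")
  "fitted model"
}

out <- fit_with_retries(flaky_fit, seeds = 1:20)
print(out$seed)
```

Note that a run which completes under a different seed only sidesteps the starting-value problem; it says nothing about whether the model suits the data.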

marvas74 avatar Jan 19 '24 12:01 marvas74

I really can't... but if you set run = FALSE, you can inspect the default ones and make changes!

cjvanlissa avatar Jan 19 '24 12:01 cjvanlissa

I am having the same issue. I have converted the columns using mxFactor to be able to run an LCA. I have used this syntax to run the LCA:

set.seed(123) LCA1 <- tidySEM::mx_lca(data = Wave_1LCA, run = TRUE) LCA1

Could you please explain what specifying starting values means?

EmmaSchleiger avatar Mar 12 '24 10:03 EmmaSchleiger

Would it be possible for you to share the dataset so we can reproduce this issue?

Gootjes avatar Mar 12 '24 11:03 Gootjes

Dear Emma, Since you're using the default setting of classes = 1, you're not even estimating a latent class model at this point. I suspect there is something unusual about the data. Like Gootjes mentioned, it would be easier to diagnose if you can share them. Maybe you could share the output of descriptives(Wave_1LCA)?

cjvanlissa avatar Mar 12 '24 11:03 cjvanlissa

Thank you C.J.,

This is my first time running an LCA (can you tell 🙂). I have chosen the tidySEM package because of its ability to deal with ordinal data and missing values.

This is my descriptives output before I have reclassified them using mxFactor:

   name type n missing unique mean median mode mode_value sd v min max range skew skew_2se kurt kurt_2se
1 q29 integer 3883 0.04828431 4 2.547000 3 3 <NA> 0.6907374 NA 1 4 3 0.01646886 0.2095604 2.760862 17.57001
2 q30 integer 3942 0.03382353 4 2.055809 2 2 <NA> 0.7033688 NA 1 4 3 0.49947305 6.4036843 3.480771 22.31892
3 Sum31_32 character 4080 0.00000000 3 NA NA 2625 1 NA 0.458883 NA NA NA NA NA NA NA
4 q28 integer 3896 0.04509804 5 2.294405 2 2 <NA> 0.7822350 NA 1 5 4 0.35653389 4.5443502 3.447046 21.97350
5 q15_1 integer 3949 0.03210784 7 5.834388 6 6 <NA> 1.1518113 NA 1 7 6 -1.14120040 -14.6441691 4.578280 29.38224
6 q15_2 integer 3944 0.03333333 7 5.823276 6 6 <NA> 1.1220049 NA 1 7 6 -1.12153968 -14.3827707 4.731293 30.34503
7 q15_3 integer 3951 0.03161765 7 5.797773 6 6 <NA> 1.1286174 NA 1 7 6 -1.14759558 -14.7299594 4.841530 31.07957
8 q15_5 integer 3790 0.07107843 7 5.061741 5 5 <NA> 1.3472700 NA 1 7 6 -0.65828176 -8.2755638 3.357330 21.10883
9 q15_6 integer 3802 0.06813725 7 5.140452 5 5 <NA> 1.3084027 NA 1 7 6 -0.71873055 -9.0497750 3.478976 21.90822
10 q15_7 integer 3823 0.06299020 7 5.226785 5 5 <NA> 1.2759476 NA 1 7 6 -0.72899193 -9.2042743 3.575400 22.57745
11 q15_8 integer 3767 0.07671569 7 5.048845 5 5 <NA> 1.3233285 NA 1 7 6 -0.63084023 -7.9065023 3.331089 20.88028
12 q15_10 integer 3838 0.05931373 7 5.061230 5 5 <NA> 1.3273855 NA 1 7 6 -0.69473922 -8.7889768 3.414091 21.60104
13 q15_11 integer 3723 0.08750000 7 5.057749 5 5 <NA> 1.2946573 NA 1 7 6 -0.58577088 -7.2986670 3.275601 20.41236
14 q15_12 integer 3766 0.07696078 7 5.027084 5 5 <NA> 1.2829702 NA 1 7 6 -0.57279581 -7.1780628 3.348301 20.98539
15 q15_13 integer 3711 0.09044118 7 4.834546 5 5 <NA> 1.3632065 NA 1 7 6 -0.45152269 -5.6168765 2.955243 18.38634
16 q15_14 integer 3727 0.08651961 7 4.892407 5 5 <NA> 1.3676987 NA 1 7 6 -0.56472707 -7.0402384 3.099807 19.32723
17 q15_16 integer 3774 0.07500000 7 4.750132 5 5 <NA> 1.3552007 NA 1 7 6 -0.44324726 -5.5605007 2.947353 18.49204
18 q16_1 integer 3926 0.03774510 7 5.380795 6 6 <NA> 1.2600386 NA 1 7 6 -0.94263206 -12.0608405 4.027014 25.76908
19 q16_2 integer 3843 0.05808824 7 5.004163 5 5 <NA> 1.3476736 NA 1 7 6 -0.67375547 -8.5290624 3.315086 20.98826
20 q16_3 integer 3897 0.04485294 7 5.061073 5 5 <NA> 1.3474313 NA 1 7 6 -0.75834587 -9.6670503 3.445659 21.96748
21 q16_4 integer 3782 0.07303922 7 4.676626 5 5 <NA> 1.4143551 NA 1 7 6 -0.50743654 -6.3724880 2.880035 18.08879
22 q16_5 integer 3872 0.05098039 7 4.938275 5 5 <NA> 1.3980693 NA 1 7 6 -0.66040987 -8.3915803 3.220216 20.46431
23 q27_1 integer 3907 0.04240196 7 4.130023 4 4 <NA> 1.6045210 NA 1 7 6 -0.11957560 -1.5262487 2.240766 14.30409
24 q27_2 integer 3937 0.03504902 7 4.504445 5 5 <NA> 1.5293680 NA 1 7 6 -0.35938688 -4.6047354 2.569088 16.46271
25 q27_3 integer 3934 0.03578431 7 4.238943 4 4 <NA> 1.6274872 NA 1 7 6 -0.19992568 -2.5606229 2.283932 14.62986
26 q27_4 integer 3917 0.03995098 7 4.795762 5 5 <NA> 1.4672636 NA 1 7 6 -0.63396330 -8.1021728 3.087046 19.73154
27 q27_5 integer 3937 0.03504902 7 4.749809 5 5 <NA> 1.4852392 NA 1 7 6 -0.57968311 -7.4273367 2.915166 18.68038
28 q27_6 integer 3906 0.04264706 7 4.310292 4 4 <NA> 1.5935149 NA 1 7 6 -0.31173735 -3.9784694 2.389090 15.24898
29 q27_7 integer 3945 0.03308824 7 5.064385 5 5 <NA> 1.2571949 NA 1 7 6 -0.59152230 -7.5867198 3.490816 22.39184
30 q27_8 integer 3929 0.03700980 7 4.872232 5 5 <NA> 1.3615558 NA 1 7 6 -0.58170182 -7.4456314 3.216122 20.58799
31 q27_9 integer 3936 0.03529412 7 4.951474 5 5 <NA> 1.2622695 NA 1 7 6 -0.54413759 -6.9710158 3.294886 21.11095
32 q27_10 integer 3958 0.02990196 7 5.227893 5 5 <NA> 1.1898867 NA 1 7 6 -0.66509080 -8.5443241 3.697386 23.75588
33 q27_11 integer 3837 0.05955882 7 5.022413 5 5 <NA> 1.2793043 NA 1 7 6 -0.40273626 -5.0942552 3.047762 19.28075
34 q27_12 integer 3939 0.03455882 7 5.262503 5 5 <NA> 1.1818601 NA 1 7 6 -0.56353197 -7.2222285 3.339226 21.40319
35 q27_13 integer 3818 0.06421569 7 5.212415 5 5 <NA> 1.2792792 NA 1 7 6 -0.50865490 -6.4180944 3.090411 19.50215
36 q27_14 integer 3809 0.06642157 7 4.353111 4 4 <NA> 1.4788482 NA 1 7 6 -0.13455063 -1.6957294 2.519115 15.87825
37 q11_1_A integer 3848 0.05686275 7 5.197505 5 5 <NA> 1.4010928 NA 1 7 6 -0.79798068 -10.1081921 3.362408 21.30170
38 q11_3_A integer 3839 0.05906863 7 5.031519 5 5 <NA> 1.3728160 NA 1 7 6 -0.62597926 -7.9201421 3.089563 19.55028
39 q11_5_A integer 3858 0.05441176 7 5.167185 5 5 <NA> 1.4203570 NA 1 7 6 -0.75848480 -9.6203557 3.253154 20.63627
40 q11_7_A integer 3818 0.06421569 7 5.012310 5 5 <NA> 1.4073818 NA 1 7 6 -0.59988838 -7.5692582 2.990304 18.87043
41 q12_1_I integer 3708 0.09117647 7 4.890237 5 5 <NA> 1.3428192 NA 1 7 6 -0.59111372 -7.3503981 3.127393 19.44953
42 q12_4_I integer 3759 0.07867647 7 4.856079 5 5 <NA> 1.4003287 NA 1 7 6 -0.62222956 -7.7903036 3.020369 18.91251
43 q12_8_I integer 3700 0.09313725 7 4.351351 4 4 <NA> 1.5170691 NA 1 7 6 -0.24423410 -3.0337341 2.415390 15.00533
44 q13_4_Rx integer 3777 0.07426471 7 4.991792 5 5 <NA> 1.3410431 NA 1 7 6 -0.67231968 -8.4375444 3.213905 20.17242
45 q13_7_Rx integer 3781 0.07328431 7 4.765670 5 5 <NA> 1.5230490 NA 1 7 6 -0.49212299 -6.1793609 2.615481 16.42503
46 q13_8_Rx integer 3780 0.07352941 7 4.913492 5 5 <NA> 1.3865028 NA 1 7 6 -0.62189017 -7.8077557 3.070018 19.27694
47 q13_12_Rx integer 3728 0.08627451 7 4.946888 5 5 <NA> 1.3906318 NA 1 7 6 -0.62116144 -7.7448223 3.085686 19.24177
48 q14_1_Rs integer 3712 0.09019608 7 4.766703 5 5 <NA> 1.3402506 NA 1 7 6 -0.47212304 -5.8739325 2.967296 18.46381
49 q14_2_Rs integer 3768 0.07647059 7 5.135350 5 5 <NA> 1.2881782 NA 1 7 6 -0.75907725 -9.5149956 3.661328 22.95336
50 q14_6_Rs integer 3730 0.08578431 7 4.836997 5 5 <NA> 1.3085209 NA 1 7 6 -0.49506287 -6.1742427 3.073808 19.17283

EmmaSchleiger avatar Mar 12 '24 11:03 EmmaSchleiger

3 Sum31_32 character 4080 0.00000000 3 NA NA 2625 1 NA 0.458883 NA NA NA NA NA NA NA

This line shows that Sum31_32 is of type character. If that variable had been included in the call to mx_lca, you should have gotten the error "Function mx_lca() only accepts data of an ordinal (binary or ordered categorical) level of measurement." So my guess is that this variable wasn't included in the call to mx_lca, but that is only a guess.
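A quick way to check which columns would trip that requirement is to inspect the class of every column before fitting. This is a base R sketch on made-up data; it assumes, based on the error message above, that ordered factors are what is needed, and it uses base `factor(..., ordered = TRUE)` as a stand-in for mxFactor.

```r
# Illustrative data only: one integer item, one character item, and one item
# already converted to an ordered factor.
dat <- data.frame(
  q1 = c(1L, 2L, 3L, 2L),
  q2 = c("a", "b", "a", "c"),
  q3 = factor(c(1, 2, 2, 3), levels = 1:3, ordered = TRUE),
  stringsAsFactors = FALSE
)

# class() of each column; an ordered factor reports "ordered/factor".
col_classes <- sapply(dat, function(x) paste(class(x), collapse = "/"))
print(col_classes)

# Columns that are not ordered factors and would need converting or dropping.
not_ordinal <- names(dat)[!sapply(dat, is.ordered)]
print(not_ordinal)
```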

Which version of tidySEM are you running?

Gootjes avatar Mar 12 '24 12:03 Gootjes

The descriptives I shared above were from before I used the mxFactor function to reclassify the variables as factors, though I have noticed that using the mxFactor function doesn't actually change the type from 'integer' to 'factor' like the as.factor function does.

However, looking at the variable Sum31_32, there is something strange going on with it. I have omitted it from the model and retested using:

set.seed(123)
LCA1 <- tidySEM::mx_lca(data = Wave_1LCA, classes = 1:5, run = TRUE)
LCA1
Fit1 <- tidySEM::table_fit(LCA1)
Fit1[, c("Name", "LL", "n", "Parameters", "BIC", "Entropy", "prob_min", "n_min", "np_ratio", "np_local")]

My computer is currently trying to run it, but it may just be too complicated, with too many variables for one model, and I may need to collapse some of the categories.

I have a secondary question: I couldn't find in any of the papers, vignettes, etc. how to specify a FIML method for the LCA model. Do you have any additional advice on that?

EmmaSchleiger avatar Mar 12 '24 12:03 EmmaSchleiger

First of all, you mention that these are not the data as you fed them to mx_lca() - but I do need to see exactly what went into that function call!

Looking at these descriptives, I also notice that there's a character variable. I'm not sure which variables you converted to ordinal, but if it's all of the integer scales as well, then the parameter space becomes incredibly large, definitely intractable in any practical sense, because you are estimating a latent crosstable with all unique combinations of all categories in all variables. Imagine how huge that crosstable is, and how many empty cells there will be, making estimation near-impossible!
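To get a feel for the size of that crosstable, one can multiply the number of categories across indicators. This is a back-of-the-envelope sketch, not something tidySEM computes; it assumes the 46 seven-category rows and the largest n (4080) in the descriptives above, and ignores the remaining items for simplicity.

```r
# Number of response categories per indicator.
five_items_5pt      <- rep(5, 5)   # five 5-point items, as in the first example
fortysix_items_7pt  <- rep(7, 46)  # the 46 seven-category items in the descriptives

# Cells in the full crosstable = product of the category counts.
cells_small <- prod(five_items_5pt)
cells_large <- prod(fortysix_items_7pt)

print(cells_small)
print(format(cells_large, scientific = TRUE, digits = 3))

# Even if every respondent landed in a different cell, a sample of n
# respondents can fill at most n cells.
n <- 4080
print(n / cells_small)
print(n / cells_large)
```

With five 5-point items the table has 3125 cells; with dozens of 7-point items almost every cell is necessarily empty, whatever the sample size.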

Aside from that, I've also observed that models with mixed data types (e.g., mixing categorical and numeric variables) tend to have trouble converging. Not sure why that is.

In your case, I would probably treat all variables as numeric and use mx_profiles, perhaps omitting the single categorical item or maybe try setting its variance to 0.

cjvanlissa avatar Mar 12 '24 12:03 cjvanlissa

I have a secondary question: I couldn't find in any of the papers, vignettes, etc. how to specify a FIML method for the LCA model. Do you have any additional advice on that?

Regarding your secondary question, cjvanlissa should know this better than me, but the tidySEM package is in this case creating a model using the OpenMx package. This package does FIML by default for models with fit function mxFitFunctionML. mx_lca is such a model, so it does FIML. Source: https://openmx.ssri.psu.edu/comment/8112#comment-8112

Gootjes avatar Mar 12 '24 12:03 Gootjes

Yep, Gootjes is correct about FIML. It's the default estimation method.

cjvanlissa avatar Mar 12 '24 12:03 cjvanlissa

Thank you both for your help. If you ever decide to do a tutorial with examples of how to run LCA for non-experts that would be great. There isn't much entry-level information about it, especially including examples using tidySEM.

All the variables in the descriptives I sent are to be included in the analysis. All except one are Likert scale responses, so they range from 1-7. I don't think treating them as numerical would be accurate. So perhaps I need to go back and reduce the number of variables in the analysis, as it may be too complicated to process.

EmmaSchleiger avatar Mar 12 '24 22:03 EmmaSchleiger

I have now cleaned up my data and formatted them correctly with the mxFactor function. I still get "Error in update_thresholds(zscore) : Could not complete thresholds; either specify all thresholds by hand, or remove constraints." if I call set.seed(123) first. If I don't call set.seed(), it does run.

I attached my updated descriptives

[image: updated descriptives]

EmmaSchleiger avatar Mar 13 '24 11:03 EmmaSchleiger

Dear Emma,

There is a tutorial paper here https://doi.org/10.1080/10705511.2023.2250920, and four vignettes in the package, on the package website, and in the paper supplementary materials!

As mentioned before, I still think that the problem is that there are too many empty cells in the latent crosstable. I would treat the indicators as continuous, which greatly simplifies estimation.
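If the items have already been converted to ordered factors, they would first need to be turned back into numbers. A minimal base R sketch on made-up data follows; the resulting numeric data frame is the kind of input one would then pass to mx_profiles(), which is not called here.

```r
# Illustrative data only: two 7-point items stored as ordered factors.
ord <- data.frame(
  q1 = factor(c(5, 7, 6, NA), levels = 1:7, ordered = TRUE),
  q2 = factor(c(1, 4, 4, 2), levels = 1:7, ordered = TRUE)
)

# Go through as.character() so the original labels (1-7) are recovered;
# as.numeric() on a factor directly returns the internal level codes.
num <- as.data.frame(lapply(ord, function(x) as.numeric(as.character(x))))
str(num)
```

Missing responses stay NA after the conversion.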

cjvanlissa avatar Mar 13 '24 16:03 cjvanlissa