Error when calling summary() for HTMT in bootstrap_model

Open • hey0wing opened this issue 2 years ago • 1 comment

I would like to check the HTMT using bootstrap confidence intervals.

When nboot is set to a larger number, e.g. 1000, the following error occurs:

Error in if (original_matrix[i, j]/stats::sd(boot_array[i, j, ]) > 999999999) { : 
  missing value where TRUE/FALSE needed

If nboot is reduced to 100, no error occurs.

  • Is there a recommended value for nboot?
  • If the sample size is small (only 22 for now), is it suitable to run PLS-SEM with a bootstrap approach?

EDIT on 2023/12/08:

I dug deeper into the code and found the corresponding line:

report_paths_and_intervals.R, line 262

parse_boot_array <- function(original_matrix, boot_array, alpha = 0.05) {
  ...
          if (original_matrix[i,j]/ stats::sd(boot_array[i,j,]) > 999999999) {}
  ...
}

This function is called five times in report_summary.r, lines 59-68.

The problem is that boot_HTMT contains Inf in one of the bootstrapped models, which affects this call:

htmt_summary <- parse_boot_array(HTMT(object), object$boot_HTMT, alpha = alpha)
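To illustrate why a single Inf is enough to trigger the error, here is a minimal sketch using synthetic numbers rather than real seminr output. The names `original_value` and `boot_values` are assumptions standing in for one cell of `original_matrix` and one slice `boot_array[i, j, ]`.

```r
# Synthetic stand-ins (assumptions, not real seminr output):
# one original HTMT value and five bootstrap replicates, one of them Inf.
original_value <- 0.705
boot_values <- c(0.71, 0.69, Inf, 0.80, 0.75)

# The mean of the replicates is Inf, and Inf - Inf is NaN, so sd() is NaN.
boot_sd <- stats::sd(boot_values)

# Dividing by NaN and comparing gives NA, not TRUE or FALSE.
ratio_check <- original_value / boot_sd > 999999999

# if (NA) raises an error; capture its message instead of stopping.
err_msg <- tryCatch(
  {
    if (ratio_check) "huge" else "ok"
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Hypothetical guard: count non-finite replicates and drop them before sd().
n_bad <- sum(!is.finite(boot_values))
finite_sd <- stats::sd(boot_values[is.finite(boot_values)])

print(err_msg)
print(n_bad)
```

Whether dropping non-finite replicates is statistically appropriate is a separate question; the guard only shows where a check could sit.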

HTMT() is defined in evaluate_validity.R, line 46.
The Inf is probably caused by a division by zero in the MTHM (or MTMM?) term, which happens when sum(cor_matrix[!lower.tri(cor_matrix)]) is zero. My first suspicion was that negative correlations were cancelling out, but the function computes HTMT+ (Ringle et al., 2023) rather than the original HTMT (Henseler et al., 2015) by setting cor_matrix <- abs(...), so that cannot be the cause.
In this case, I think the root cause is that a construct has only a few items (maybe 2?) and their correlations are zero in one of the bootstrapped models, a resampling artifact of the small sample size. Even so, there should at least be a warning about the zero correlation rather than an unexplained error.
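The kind of warning asked for here could look roughly like the sketch below. It is a simplified stand-in, not the HTMT computation in evaluate_validity.R: `safe_htmt_ratio()` and both of its arguments are hypothetical names, and the real formula is reduced to a single division to show only the guard.

```r
# Hypothetical helper (not part of seminr): divide a heterotrait term by a
# monotrait term, warning instead of silently returning Inf or NaN.
safe_htmt_ratio <- function(heterotrait_mean, monotrait_mean) {
  if (!is.finite(monotrait_mean) || monotrait_mean == 0) {
    warning("Monotrait correlation term is zero or non-finite; returning NA.")
    return(NA_real_)
  }
  heterotrait_mean / monotrait_mean
}

# An unguarded division by zero silently yields Inf.
plain_ratio <- 0.3 / 0

# With a non-zero denominator the helper behaves like a plain division.
normal_ratio <- safe_htmt_ratio(0.3, 0.6)

# With a zero denominator it warns and returns NA; capture the warning text.
caught <- tryCatch(
  safe_htmt_ratio(0.3, 0),
  warning = function(w) conditionMessage(w)
)
guarded <- suppressWarnings(safe_htmt_ratio(0.3, 0))

print(caught)
```

Returning NA is one possible design choice; downstream code such as parse_boot_array() would then still need to handle NA values, for example with na.rm = TRUE.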

hey0wing, Oct 16 '23 16:10

Another important issue is that the t-value for the bootstrapped HTMT seems to be incorrect. In parse_boot_array(), it is calculated as the original value divided by the SD of the bootstrapped values.

Take the following result as an example:

        Original Est.  Bootstrap Mean  Bootstrap SD  T Stat.  2.5% CI  97.5% CI
A -> B          0.705           0.787         0.206    3.428    0.436     1.237
B -> C          0.709           0.737         0.100    7.065    0.534     0.920

Henseler et al. (2015, p. 112) stated that

in order to test the null hypothesis (H0: HTMT ≥ 1) against the alternative hypothesis (H1: HTMT < 1). A confidence interval containing the value one (i.e., H0 holds) indicates a lack of discriminant validity.

Both pairs have a similar original HTMT estimate and both t-values are significant, yet the value one clearly lies within the 95% CI for "A -> B". The significance test therefore seems to be used differently from the original paper.

Should the correct formula be (1 - Original Est.) / Bootstrap SD, similar to a one-sample t-test against 1? With that formula, the t-values would reflect whether H0 is actually rejected:

        Original Est.  Bootstrap Mean  Bootstrap SD  T Stat.  2.5% CI  97.5% CI
A -> B          0.705           0.787         0.206    1.432    0.436     1.237
B -> C          0.709           0.737         0.100    2.910    0.534     0.920
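The two tables above can be recomputed with a few lines of R. This is only a sketch using the rounded values printed in the tables; the data frame and its column names are assumptions for illustration and do not mirror seminr's internal objects.

```r
# Rounded values copied from the tables above (column names are assumptions).
htmt <- data.frame(
  pair     = c("A -> B", "B -> C"),
  original = c(0.705, 0.709),
  boot_sd  = c(0.206, 0.100),
  ci_lower = c(0.436, 0.534),
  ci_upper = c(1.237, 0.920)
)

# Ratio described for parse_boot_array(): distance of the estimate from 0.
htmt$t_vs_zero <- htmt$original / htmt$boot_sd

# Proposed alternative: distance of the estimate from 1, in bootstrap SDs.
htmt$t_vs_one <- (1 - htmt$original) / htmt$boot_sd

# Criterion from Henseler et al. (2015): does the interval contain 1?
htmt$ci_contains_one <- htmt$ci_lower <= 1 & htmt$ci_upper >= 1

print(htmt)
```

Because the inputs are rounded to three decimals, `t_vs_zero` comes out slightly different from the reported 3.428 and 7.065, while `t_vs_one` matches 1.432 and 2.910. The `ci_contains_one` column is TRUE for "A -> B" and FALSE for "B -> C", which agrees with the ordering of the proposed statistic but not with the current one.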

hey0wing, Dec 11 '23 05:12