get_boots() and handling of missing values

Open guilhemchalancon opened this issue 10 years ago • 0 comments

version: 78810865a29fae600e5518615ba26f2df5c93747

I ran into a bug when fitting models whose input data contain missing values. The bug only appears when boot.val = TRUE, so I looked into the get_boots() function.

To reproduce the bug, simply run the toy data example with boot.val = TRUE:

# let's add missing values to russa
russNA = russa
russNA[1,1] = NA
russNA[4,4] = NA
russNA[6,6] = NA

# PLS-PM using data set 'russNA' (russa with missing values added)
rus_pls6 = plspm(russNA, rus_path, rus_blocks, scaling = rus_scaling, 
    modes = rus_modes, scheme = "centroid", plscomp = c(1,1,1), boot.val = TRUE)

I found out that the problem appears when the cross-loadings are computed. These are obtained with the function cor: xloads = cor(X, Y.lvs)

The cause is the default behaviour of cor: it uses use = "everything", which does not handle missing values, so any column of X that contains an NA produces a row full of NAs in xloads.

Solution: xloads = cor(X, Y.lvs, use="complete.obs")
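The difference between the two calls can be seen in a minimal sketch on made-up data. The matrices below are hypothetical stand-ins that only borrow the names X and Y.lvs from get_boots(); they are not the russa data or the package's internal objects.

```r
# Toy data (hypothetical): 3 manifest variables and 2 latent variable scores
set.seed(1)
X <- matrix(rnorm(30), nrow = 10, ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
Y.lvs <- matrix(rnorm(20), nrow = 10, ncol = 2,
                dimnames = list(NULL, c("lv1", "lv2")))
X[1, 1] <- NA  # a single missing value in column x1

# Default (use = "everything"): every correlation involving x1 is NA
xloads_default <- cor(X, Y.lvs)

# Proposed fix: observations with any NA are dropped before correlating
xloads_fixed <- cor(X, Y.lvs, use = "complete.obs")

print(xloads_default)
print(xloads_fixed)
```

Note that use = "complete.obs" drops the whole observation (listwise deletion), so the correlations for x2 and x3 are also computed without row 1. use = "pairwise.complete.obs" would be an alternative if that matters.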

Where: in the get_boots.r source file, both in the initialization of all values (line 54) and in the while loop (line 105).

guilhemchalancon, May 19 '14 13:05