dbarts
'pdbart' for categorical variables works only if 'xind' is integer
As per ?pdbart, the 'xind' argument can be provided as "Integer, character vector, or the right-hand side of a formula indicating which variables are to be plotted". This holds for continuous variables; for categorical variables, however, only integer positions work, while character or formula inputs return an error. Here's a reproducible example:
library(dbarts)
# generate some data as in ?bart examples:
f <- function(x) {
  10 * sin(pi * x[,1] * x[,2]) + 20 * (x[,3] - 0.5)^2 +
    10 * x[,4] + 5 * x[,5]
}
set.seed(99)
sigma <- 1.0
n <- 100
x <- matrix(runif(n * 10), n, 10)
Ey <- f(x)
y <- rnorm(n, Ey, sigma)
# make one of the x variables categorical:
x <- data.frame(x)
x[,1] <- ifelse(x[,1] > mean(x[,1]), "high", "low")
head(x)
# fit a bart model:
set.seed(99)
bartFit <- bart(x, y, keeptrees = TRUE)
# compute partial dependence:
pdbart(bartFit, xind = "X2") # for a continuous variable it works
pdbart(bartFit, xind = "X1") # for the categorical variable, "Error in pdbart(bartFit, xind = "X1") : unrecognized columns 'X1'"
pdbart(bartFit, xind = X1) # also "Error in pdbart(bartFit, xind = X1) : unrecognized columns 'X1'"
# but it works (even for the categorical variable) if we provide the position instead of the name:
pdbart(bartFit, xind = 1)
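Until names are accepted for categorical variables, one possible workaround is to translate the name into a position before calling pdbart. The sketch below is only an illustration: `name_to_xind` is a hypothetical helper, not part of dbarts, and `x_demo` is a small stand-in for the `x` data frame above. It assumes that integer positions in 'xind' refer to columns of the original predictor data frame, as seems to be the case in this example where X1 is column 1; that may not hold if a factor with more than two levels is expanded into several columns internally.

```r
# Hypothetical helper (not part of dbarts): map column names to their
# integer positions in the predictor data frame.
name_to_xind <- function(data, vars) {
  pos <- match(vars, colnames(data))
  if (anyNA(pos)) {
    stop("unknown columns: ", paste(vars[is.na(pos)], collapse = ", "))
  }
  pos
}

# Small stand-in for the 'x' data frame used in the example
x_demo <- data.frame(X1 = c("high", "low", "high"),
                     X2 = c(0.1, 0.5, 0.9),
                     X3 = c(0.3, 0.2, 0.7))

pos <- name_to_xind(x_demo, "X1")

# With the fitted model from the example, the call would then be:
# pdbart(bartFit, xind = pos)
```

The helper stops with an error on unknown names rather than passing NA on as a position.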
If this can't be (easily) fixed, a note about it under "Arguments - xind" in the help page would be useful. Regards, and thanks for the great package!