dbarts

'pdbart' for categorical variables works only if 'xind' is integer

Open AMBarbosa opened this issue 2 years ago • 0 comments

As per ?pdbart, the 'xind' argument can be provided as "Integer, character vector, or the right-hand side of a formula indicating which variables are to be plotted". This is true for continuous variables; however, for categorical variables only integers work, while character or formula inputs return an error. Here's a reproducible example:

library(dbarts)

# generate some data as in ?bart examples:
f <- function(x) {
  10 * sin(pi * x[,1] * x[,2]) + 20 * (x[,3] - 0.5)^2 +
    10 * x[,4] + 5 * x[,5]
}
set.seed(99)
sigma <- 1.0
n     <- 100
x  <- matrix(runif(n * 10), n, 10)
Ey <- f(x)
y  <- rnorm(n, Ey, sigma)

# make one of the x variables categorical:
x <- data.frame(x)
x[,1] <- ifelse(x[,1] > mean(x[,1]), "high", "low")
head(x)

# fit a bart model:
set.seed(99)
bartFit <- bart(x, y, keeptrees = TRUE)

# compute partial dependence:
pdbart(bartFit, xind = "X2")  # for a continuous variable it works
pdbart(bartFit, xind = "X1")  # for the categorical variable, "Error in pdbart(bartFit, xind = "X1") : unrecognized columns 'X1'"
pdbart(bartFit, xind = X1)  # also "Error in pdbart(bartFit, xind = X1) : unrecognized columns 'X1'"
# but it works (even for the categorical variable) if we provide the position instead of the name:
pdbart(bartFit, xind = 1)
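
In the meantime, a possible workaround is to look up the column position by name and pass that integer to 'xind'. This is only a minimal sketch: `xind_by_name` is a hypothetical helper, not part of dbarts, and it assumes that integer positions in 'xind' refer to the columns of the data frame passed to bart(). The example above only confirms this for column 1, and positions after a categorical column could differ if that column is expanded into several columns internally.

```r
# Hypothetical helper (not part of dbarts): translate column names into
# integer positions that can be passed to the 'xind' argument.
xind_by_name <- function(data, vars) {
  pos <- match(vars, names(data))
  if (anyNA(pos)) {
    stop("unrecognized columns: ", paste(vars[is.na(pos)], collapse = ", "))
  }
  pos
}

# Usage with the objects from the example above (requires dbarts):
# pdbart(bartFit, xind = xind_by_name(x, "X1"))
```

With the data above, `xind_by_name(x, "X1")` evaluates to the position of 'X1' in `names(x)`, so the call is equivalent to `pdbart(bartFit, xind = 1)`.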

If this can't be (easily) fixed, a mention under "Arguments - xind" in the help about this argument would be useful. Regards and thanks for the great package!

AMBarbosa, Feb 19 '23 19:02