hstats fails with cryptic error: String '_:_' found in grid values at unlucky position.

Open inkrement opened this issue 9 months ago • 9 comments

First of all, many thanks for this great package!

I'm currently trying to apply hstats to a causal forest (grf) object. However, as soon as I attempt to compute 2-way interactions (i.e., pairwise_m > 0), the function fails with an error that seems a bit cryptic.

From what I can tell, it is due to a hardcoded check—but it's unclear to me why this happens and how to work around it. I’d really appreciate it if you could point me in the right direction.


# wrapper for grf to work with partial_dep
pred_fun <- function(object, newdata, ...) {
  predict(object, newdata, ...)$predictions
}

H <- hstats(cf_grf, X = X, pred_fun = pred_fun, verbose = TRUE, approx=TRUE) 
Error in .compress_grid(grid = grid): String '_:_' found in grid values at unlucky position.
Traceback:

1. hstats(cf_full, X = X, pred_fun = pred_fun, verbose = TRUE, approx = TRUE, 
 .     n_max = 100)
2. hstats.default(cf_full, X = X, pred_fun = pred_fun, verbose = TRUE, 
 .     approx = TRUE, n_max = 100)
3. mway(object, v = v2, X = X, pred_fun = pred_fun, w = w, way = 2L, 
 .     verb = verbose, ...)
4. wcenter(pd_raw(object, v = z, X = X, grid = X[, z], pred_fun = pred_fun, 
 .     w = w, ...), w = w)
5. pd_raw(object, v = z, X = X, grid = X[, z], pred_fun = pred_fun, 
 .     w = w, ...)
6. .compress_grid(grid = grid)
7. stop("String '_:_' found in grid values at unlucky position.")

inkrement avatar Apr 17 '25 10:04 inkrement

Ha - do you have feature values that contain the substring "_:_"?
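
One quick way to check this is to scan every column for the substring. The snippet below is only a sketch: `has_substring` and `X_demo` are hypothetical names used for illustration and are not part of hstats. Substitute your own feature data for `X_demo`.

```r
# Hypothetical helper (not part of hstats): flag columns whose values
# contain a given literal substring.
has_substring <- function(X, pattern = "_:_") {
  X <- as.data.frame(X)  # also accepts a matrix, as grf typically uses
  vapply(
    X,
    function(col) any(grepl(pattern, as.character(col), fixed = TRUE)),
    logical(1)
  )
}

# Toy data for illustration only
X_demo <- data.frame(id = c("a_1", "b_:_2"), value = c(1.5, 2.5))
flags <- has_substring(X_demo)
names(flags)[flags]  # names of the columns that contain the substring
```

`fixed = TRUE` makes `grepl()` treat the pattern as a literal string rather than a regular expression.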

mayer79 avatar Apr 17 '25 13:04 mayer79

Thank you for the swift reply! No, only underscores and numbers.

inkrement avatar Apr 17 '25 13:04 inkrement

I will dig into it, thx for reporting.

mayer79 avatar Apr 17 '25 14:04 mayer79

Currently, I can reproduce only what is implemented:

If feature values contain the string "_:_", an internal function would fail. Thus, it yields an explicit error.

grid <- data.frame(X = c("", "", "_:_"), Y = c("_:_", "_:_", ""))
hstats:::.compress_grid(grid)

Could you please add a minimal reproducing example based on your setting?

mayer79 avatar Apr 18 '25 08:04 mayer79

I don't know what happened, but I just excluded a highly correlated variable. The name did not contain any colons, only alphabetic characters. Anyhow, I'll close it for now.

inkrement avatar Apr 30 '25 09:04 inkrement

Just out of curiosity: did any variable contain values with a ":" in them?

mayer79 avatar May 01 '25 05:05 mayer79

No, just [A-Za-z_].

inkrement avatar May 02 '25 12:05 inkrement

I am using logic copied almost 1:1 from R's merge() function to find row match indices over multiple columns. However, merge() uses "\r" as the column separator, while I am using that cryptic "_:_", which led to the error message. I will switch to "\r", which seems to be a better solution.
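
For readers following along, here is a rough sketch of the general idea of matching rows via pasted string keys. It is not the actual hstats or merge() source: `row_key`, `grid`, and `clash` are hypothetical names made up for illustration.

```r
# Hypothetical helper: build one string key per row by pasting all columns
# together with a separator. unname() keeps column names from being
# interpreted as arguments of paste().
row_key <- function(df, sep) {
  do.call(paste, c(unname(as.list(df)), sep = sep))
}

grid <- data.frame(
  a = c("x", "y", "x", "y"),
  b = c("1", "2", "1", "3"),
  stringsAsFactors = FALSE
)

# Deduplicate the grid and remember where each original row ended up
key <- row_key(grid, sep = "\r")
ugrid <- grid[!duplicated(key), , drop = FALSE]
idx <- match(key, row_key(ugrid, sep = "\r"))

# Why the separator matters: two different rows can collapse to the same key
# when the separator string also occurs inside the values.
clash <- data.frame(
  a = c("u_:_v", "u"),
  b = c("w", "v_:_w"),
  stringsAsFactors = FALSE
)
ambiguous <- row_key(clash, sep = "_:_")  # both rows yield the same key
safe <- row_key(clash, sep = "\r")        # keys stay distinct
```

A separator such as "\r" is much less likely to appear in real feature values, which is presumably why merge() relies on it.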

mayer79 avatar May 05 '25 12:05 mayer79

sounds good! I'll give it a try afterwards.

inkrement avatar May 06 '25 09:05 inkrement

@inkrement Would be fantastic if you could test your original (failing) example with the main branch

devtools::install_github("ModelOriented/hstats")

mayer79 avatar May 06 '25 20:05 mayer79

sure! But it will be next week.

inkrement avatar May 07 '25 18:05 inkrement