mboost icon indicating copy to clipboard operation
mboost copied to clipboard

excessive runtime of coef.mboost

Open bastistician opened this issue 4 years ago • 2 comments

Profiling coef.mboost for a colleague, I have noticed that it seems to waste time to "check if base-learner has coefficients" here:

https://github.com/boost-R/mboost/blob/021e406f40018b23727f6b75c1f2297e82cc1c7c/R/mboost.R#L430

I don't know what the original intention was (@hofnerb), but given that cf is a numeric matrix, I think that condition will always be FALSE.

In the particular application, the base learner was a Markov random field over 327 regions. Here are the runtimes I get for evaluating the above condition as a function of mstop:

for (mstop in 10^(2:5)) {
  cat(mstop, ": ")
  cf <- matrix(0, 327, mstop)  # the actual values are irrelevant for this runtime assessment
  cat(system.time(any(sapply(cf, is.null)))["elapsed"], "s\n")
}
#> 100 : 0.009 s
#> 1000 : 0.109 s
#> 10000 : 1.094 s
#> 1e+05 : 20.04 s

Maybe that condition should be fixed to really "check if base-learner has coefficients". Otherwise the ret <- NULL case could simply be dropped. It would fail anyway when trying to set names on NULL later on.

bastistician avatar Aug 12 '21 15:08 bastistician

The intention is that not all baselearners have coefficients. If I am remembering correctly the only BL without coefficients is btree, though:

library("mboost")
mod <- mboost(dist ~ btree(speed), data = cars)
coef(mod)
# $`btree(speed)`
# NULL
# 
# attr(,"offset")
# [1] 42.98

I just checked your fix propsal. It breaks with the following error

Fehler in base::rowSums(x, na.rm = na.rm, dims = dims, ...) : 
  'x' muss numerisch sein

We hence cannot simply remove the check. I am wondering a bit, though, why we have the sapply bit. Are there lists of coefficients or does (is.null(cf)) not work for specific coefficient objects we might have?

hofnerb avatar Aug 13 '21 10:08 hofnerb

Ok, I see. Maybe the check could be done earlier and more efficiently? Something like https://github.com/bastistician/mboost/commit/50a03928f6d884bbd3a73c2115b1f26a4d3f00ce ? Of course, being an mboost noob, I have no idea whether checking the first element for being NULL is sufficient to conclude that the base learner has no coefs.

bastistician avatar Aug 13 '21 12:08 bastistician