
future support

rubenarslan opened this issue on Aug 07 '18 • 5 comments

It'd be great if loo allowed parallelisation via the future package in addition to the existing cores argument. That would make it easy to e.g. use a job scheduler or a remote server with more RAM to compute loo, from the comfort of your desktop. This works quite well in brms, which takes a boolean future argument.
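For concreteness, here is a minimal sketch of the kind of pattern I have in mind, assuming loo's pointwise computations can be chunked across observations; the chunking and the future.apply calls below are only illustrative, not loo's actual API:

library(future)
library(future.apply)  # future_lapply()
library(loo)

plan(multisession)  # could equally be a remote or scheduler-backed plan

S <- 4000; n <- 100
log_lik <- matrix(rnorm(S * n), nrow = S)  # toy draws-by-observations matrix

chunks <- split(seq_len(n), cut(seq_len(n), 4))
elpd_parts <- future_lapply(chunks, function(idx) {
  # r_eff = 1 is fine here because the toy draws are independent
  loo(log_lik[, idx, drop = FALSE], r_eff = rep(1, length(idx)))$pointwise[, "elpd_loo"]
})
sum(unlist(elpd_parts))  # total elpd_loo assembled from the chunks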

rubenarslan commented Aug 07 '18 12:08

Could you 1) give a pointer to an example in brms, 2) include example code showing how you would use it in loo, and 3) tell us about a case where loo takes so much time that you would want this? I'm asking 3) because in some cases the approach mentioned in issue #87 would be sufficient.

avehtari commented Aug 13 '18 07:08

  1. Sorry, I'm not sure I understand you correctly. In brms, instead of calling
model <- brm(value ~ abunch + of_predictors +
               (1 + abunch + of_predictors | short) +
               (1 + abunch + of_predictors | variable),
             family = sratio(threshold = "equidistant"),
             cores = 4,
             data = diary)

I can also run the following to send the brms script to our remote server (which has more RAM and more cores and doesn't block my laptop)

library(future)
login <- tweak(remote, workers = "arslan@xxx")
plan(list(login, multicore))
model %<-% brm(value ~ abunch + of_predictors +
                 (1 + abunch + of_predictors | short) +
                 (1 + abunch + of_predictors | variable),
               family = sratio(threshold = "equidistant"),
               future = TRUE,
               data = diary)
  2. Instead of loo_m1 <- loo(model, cores = 4), I'd run
plan(multicore)
loo_m1 <- loo(model, future = TRUE)

(or similar, depending on my needs).

  3. I thought you saw the need for this, because the cores argument exists already. For me, it's not mainly about loo taking a lot of time (compared to the time needed to fit the model; still upwards of 1 hour sometimes), but about it taking a lot of RAM, for example for a model with 1.4m rows, and group-varying slopes for two grouping variables. Using future, I can more conveniently send a job to a remote computer with lots of RAM (see the chunked sketch after this list). If computing only m folds is just as good, that's great too. If you think the cores argument rarely provides a benefit, I defer to your knowledge; I didn't even test that. I assumed it was useful because it exists.
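A chunked workaround along these lines seems possible already, assuming brms's log_lik() accepts newdata; the chunk count and worker plan below are illustrative, not anything loo or brms provides directly:

library(future.apply)
plan(multisession)  # or a remote plan, as in the brm() example above

n <- nrow(diary)
chunks <- split(seq_len(n), cut(seq_len(n), 20))

# each worker computes only its own slice of the draws-by-observations
# log-likelihood matrix, so no worker materializes the full matrix at once
ll_parts <- future_lapply(chunks, function(idx) {
  log_lik(model, newdata = diary[idx, ])
})
loo_m1 <- loo(do.call(cbind, ll_parts))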

rubenarslan commented Aug 13 '18 09:08

  1. Thanks for the examples. I've never used future, so these were helpful for understanding it.
  2. If the model is a brms model, then this might be a brms package issue instead of a loo package issue, as brms has its own loo wrapper. Pinging @paul-buerkner. We can also discuss with Paul and Jonathan at StanCon the possibility of using future in the loo package, too.

> I thought you saw the need for this, because the cores argument exists already.

I definitely understand the benefits, but it's also very common that when someone mentions that loo is slow, 1) it's not needed, 2) it can be computed only partially, or 3) they should use something other than loo. So I have the habit of asking, and sometimes I learn about a new case where full loo is actually what is needed.

> still upwards of 1 hour sometimes

Wow, that's a lot. Are you sure you need loo?

> for example for a model with 1.4m rows, and group-varying slopes for two grouping variables

How many parameters? If you have fewer than 14k parameters, it's likely that you don't need loo, and if you have more than 140k parameters, the stan-dev team is happy to hear more about the performance of the computation. If you want to discuss loo and its alternatives further, it would be better to continue at discourse.mc-stan.org, as that is not crucial for this issue, although a convincing example where a 1-hour loo is needed would motivate us to prioritize a possible implementation.

> Using future, I can more conveniently send a job to a remote computer with lots of RAM.

Sure, and I can imagine it would also be useful with kfold(). I added the enhancement label to this issue.
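To sketch what I mean for kfold() (illustrative only, not the loo package's implementation; update() and log_lik() are the brms methods, and kfold_split_random() is from loo):

library(future.apply)
plan(multisession)

K <- 10
fold_id <- loo::kfold_split_random(K = K, N = nrow(diary))

# refit on K-1 folds and score the held-out fold, one fold per worker
elpd_kfold <- future_lapply(seq_len(K), function(k) {
  fit_k <- update(model, newdata = diary[fold_id != k, ])
  ll_k  <- log_lik(fit_k, newdata = diary[fold_id == k, ])
  # pointwise log predictive density, averaged over posterior draws
  apply(ll_k, 2, matrixStats::logSumExp) - log(nrow(ll_k))
})
sum(unlist(elpd_kfold))  # K-fold elpd estimate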

avehtari commented Aug 13 '18 16:08

@paul-buerkner sent me here, so I think it would have to live in loo. And yes, it makes even more sense for kfold.

Thanks for the inquiry about loo being slow. I've never timed loo on its own, so I was going off vague estimates from a job that had wrapped up and saved a model but was still running loo. I'll run similar models in a few weeks and come up with a real number. I have fewer than 14k parameters. To be clear, I would be able to run loo on a subsample of folds and still get a good approximation, but this is not yet implemented, right?

rubenarslan commented Aug 15 '18 12:08

> To be clear, I would be able to run loo on a subsample of folds and still get a good approximation, but this is not yet implemented, right?

Probably yes, and you can estimate how good the approximation is. It's not yet implemented in loo, but there is issue #87 for it. See also the discussion at http://discourse.mc-stan.org/t/model-comparison-big-data/4947. Please continue any further discussion that is not related to future support on Discourse or in issue #87, so that your questions and comments do not get lost here.
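As a rough illustration of estimating the approximation quality from a subsample (compute_elpd_for() below is a hypothetical placeholder for whatever produces the pointwise elpd values for the subsampled observations):

n_total <- 1400000                   # total number of observations
n_sub   <- 1000                      # subsample actually evaluated
idx     <- sample.int(n_total, n_sub)
elpd_i  <- compute_elpd_for(idx)     # hypothetical: pointwise elpd for the subsample

elpd_hat <- n_total * mean(elpd_i)               # scaled-up total elpd estimate
se_hat   <- n_total * sd(elpd_i) / sqrt(n_sub)   # its Monte Carlo standard error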

avehtari commented Aug 16 '18 16:08