
predict() for "within" model appears to add random intercept twice?

Open dani-k-s opened this issue 4 years ago • 2 comments

Thank you very much for this great package.

A colleague of mine is running into problems with the predict() function, and I am not sure whether this hints at a bug or is down to user error. We have fitted a "within" model. When we use predict() to simply "predict" the observed values, the results are not very close to the observations, and compared with predictions computed manually they are always off by exactly the random effect: if the random effect is positive, the predict() values are higher than the manual predictions by the random intercept; if it is negative, they are lower by the same amount. I should add that we do include the random effect in our manual predictions.
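To make the symptom concrete, here is a minimal base-R sketch of the arithmetic involved. All numbers and names (beta0, beta1, u, and so on) are made up for illustration and are not our data or model. It only shows how the difference from a manual prediction would look if the random intercept were added twice, versus if it were left out entirely:

```r
# Hypothetical toy numbers, not the real data or a fitted model.
set.seed(1)
n_id <- 4
n_wave <- 3
id <- rep(seq_len(n_id), each = n_wave)
x <- rnorm(n_id * n_wave)

beta0 <- 1.0                       # assumed fixed intercept
beta1 <- 0.5                       # assumed fixed slope
u <- c(-0.8, -0.2, 0.3, 0.7)       # assumed random intercepts, one per id

fixed_part <- beta0 + beta1 * x
manual     <- fixed_part + u[id]       # random intercept added once
doubled    <- fixed_part + 2 * u[id]   # random intercept added twice
population <- fixed_part               # random intercept ignored

# Difference from the manual prediction in each scenario
d_doubled    <- doubled - manual       # equals +u[id]
d_population <- population - manual    # equals -u[id]

print(all.equal(d_doubled, u[id]))
print(all.equal(d_population, -u[id]))
```

In both scenarios the gap has the size of the random intercept, so the sign of the gap relative to the sign of the random effect is what distinguishes them.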

Unfortunately I cannot share the data/code easily but if you do not have this problem with your test datasets we can maybe make up a mock version that we can share. Any help would be much appreciated.

dani-k-s, Feb 02 '21 14:02

Not sure how I missed this report, but have just now looked into it. It looks like the default behavior when the re.form argument isn't specified is different than what lme4::predict.merMod() does in the same situation.

Using the model from the wbm() documentation example, predict(model) on its own gives the outputs you would expect. If I instead call predict(model, newdata = model.frame(model), raw = TRUE), the random effects are not included in the predictions; it makes the predictions basically as if every observation belonged to a hypothetical average id. When using newdata, getting the random effects added back in requires passing the random-effects formula through re.form, like so:

predict(model, newdata = model.frame(model), raw = TRUE, re.form = ~ (1 | id))

Then it would match the output from predict(model).
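For anyone who wants to check this on their own installation, a rough sketch of the comparison follows. It assumes the WageData example from the wbm() documentation and reuses the calls quoted above; the object names (p_default, p_no_re, p_with_re) are just labels chosen here, and results may differ across panelr and jtools versions:

```r
library(panelr)

# Model from the wbm() documentation example
data("WageData")
wages <- panel_data(WageData, id = id, wave = t)
model <- wbm(lwage ~ lag(union) + wks | blk + fem | blk * lag(union),
             data = wages)

# 1. Default call, no newdata
p_default <- predict(model)

# 2. newdata supplied, re.form left unspecified
p_no_re <- predict(model, newdata = model.frame(model), raw = TRUE)

# 3. newdata supplied, random intercept requested explicitly
p_with_re <- predict(model, newdata = model.frame(model), raw = TRUE,
                     re.form = ~ (1 | id))

# Gap between (1) and (2); by the description above it should reflect
# the per-id random intercepts
print(summary(as.numeric(p_default) - as.numeric(p_no_re)))

# Whether (1) and (3) agree
print(all.equal(as.numeric(p_default), as.numeric(p_with_re)))
```

If the last line does not report agreement, comparing the gap in (2) against the random intercepts from lme4::ranef(model) may help narrow down where the extra term enters.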

That being said, I'm not sure that I should be deviating from the default behavior of predict.merMod(), since it clearly takes users by surprise that I do. This potential issue is upstream of panelr, since I make the final predictions in jtools and its predict_merMod() function. I will need to decide on one of the following:

  • Change nothing (probably not ideal)
  • Change the default in jtools (will need to recall why I created the default behavior in jtools, where I am often creating predictions in which the random effects are intentionally ignored)
  • Change the default in panelr by changing the default argument to re.form or an equivalent solution (will require me to make sure I can automatically extract the random effects portion of the formula from the model)

jacob-long, Jan 12 '23 15:01

And as I check another thing, the documentation is unequivocally wrong about the behavior when re.form is NULL (it borrows the description from lme4::predict.merMod()), so I need to change that in panelr and perhaps jtools as well.

jacob-long, Jan 12 '23 15:01