equatiomatic icon indicating copy to clipboard operation
equatiomatic copied to clipboard

Problems with `poly`

Open TylerGrantSmith opened this issue 3 years ago • 1 comments

I have recently had the need to extract poly data from some models and I was hoping {equatiomatic} had a solution, but I think I need to point out some inaccuracies.

First, it appears the translation fails when poly has max-degree 1, illustrated below (malformed TeX with caret and no power).

Second, poly by default uses orthogonal polynomials, this means that the coefficients given in the associated poly object of the model are not the coefficients of the powers of the unadjusted covariate. This is illustrated below where mod1 and mod3 (using raw polynomials) reduce identical output, as they should, but mod2 equation is incorrect as its coefficients apply to the transformed disp. This can be seen with mod4 where I've explicitly performed the orthogonalization process.

library(equatiomatic)

# no poly
mod1 <- lm(mpg ~ disp, mtcars)

# orthogonal poly
mod2 <- lm(mpg ~ poly(disp, 1), mtcars)

# non-orthogonal poly
mod3 <- lm(mpg ~ poly(disp, 1, raw = T), mtcars)

# `mod2`: equation is incorrect because the coefficients apply to transformed inputs
# `mod2` and `mod3`: poly handling seems to break down when the input is degree 1.
cat(extract_eq(mod1, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 29.6 - 0.04(\operatorname{disp})
cat(extract_eq(mod2, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 20.09 - 28.44(\operatorname{disp^})
cat(extract_eq(mod3, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 29.6 - 0.04(\operatorname{disp,\ 1^})


# Explicitly perform the transformations that `poly` does when using orthogonal polynomials
mtcars2 <- mtcars
mtcars2$disp <- scale(mtcars2$disp)
mtcars2$disp <- mtcars2$disp / sqrt(sum(mtcars2$disp^2))

mod4 <- lm(mpg ~ disp, mtcars2)
mod5 <- lm(mpg ~ poly(disp, 1, raw = T), mtcars2)

cat(extract_eq(mod2, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 20.09 - 28.44(\operatorname{disp^})
cat(extract_eq(mod4, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 20.09 - 28.44(\operatorname{disp})
cat(extract_eq(mod5, use_coefs = T))
#> \operatorname{\widehat{mpg}} = 20.09 - 28.44(\operatorname{disp,\ 1^})

Relevant SO discussions:

  • https://stats.stackexchange.com/questions/31858/recovering-raw-coefficients-and-variances-from-orthogonal-polynomial-regression
  • https://stackoverflow.com/questions/39031172/how-poly-generates-orthogonal-polynomials-how-to-understand-the-coefs-ret/39051154#39051154

TylerGrantSmith avatar Feb 02 '22 20:02 TylerGrantSmith

Thanks - to be totally honest I never actually tested the branch that included poly() with use_coefs = TRUE so I'm not surprised to see some errors here. I appreciate the issue and I'll try to get it resolved relatively soon.

datalorax avatar Feb 02 '22 21:02 datalorax