ols() fails for interactions without base term

Open wngrtn opened this issue 9 years ago • 7 comments

I want to estimate an OLS model with an interaction term, but without including every variable in the interaction as a main effect. ols calls Design, which is unable to handle this case. For example:

y <- rnorm(100)
x1 <- rnorm(100)
x2 <- rnorm(100)
ols(y ~ x1 + x1:x2)

returns:

Error in if (!length(fname) || !any(fname == zname)) { : 
  missing value where TRUE/FALSE needed

While a statistician may simply demand the inclusion of x2, there are legitimate cases where it is not feasible or desirable. For example, x2 may only vary across entities for which fixed effects are included in the model.

wngrtn, May 07 '15 08:05

This is an incorrect model, so rms does not allow it. You must use * (that is, y ~ x1 * x2).

— Reply to this email directly or view it on GitHub https://github.com/harrelfe/rms/issues/8.


Frank E Harrell Jr, Professor and Chairman, School of Medicine, Department of Biostatistics, Vanderbilt University

harrelfe, May 07 '15 13:05

Using x1*x2 automatically includes x1, x2 and the interaction between the two. What is the correct way to include the interaction without including the individual terms?
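The expansion described here can be checked without fitting a model. The following is a minimal sketch using only base R's terms(); it is not part of rms, and the object names full and reduced are made up for illustration.

```r
# Expand each formula into its term labels; no data or model fit is needed.
full    <- attr(terms(y ~ x1 * x2), "term.labels")
reduced <- attr(terms(y ~ x1 + x1:x2), "term.labels")

print(full)     # terms produced by the * operator
print(reduced)  # terms produced by listing x1 and x1:x2 by hand
```

The first vector lists x1, x2 and x1:x2, while the second lacks the x2 main effect, which is the case the question is about.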

wngrtn, May 07 '15 14:05

It is not appropriate to do so.

harrelfe, May 07 '15 15:05

I gave a more complicated example above, but assume that I know that the data-generating process does not include the main effect. Would you say that I am still forbidden from omitting the main effects in the regression?

wngrtn, May 07 '15 16:05

Almost always, because you would be making a magical assumption about the origin (zero point/centering) of the variables.
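One way to see the dependence on the origin is to shift a variable by a constant and refit. The sketch below uses base R's lm() on simulated data (not rms::ols); the data-generating process, the shift of 10, and all object names are assumptions chosen for illustration. Because the model in question keeps the x1 main effect, a shift in x2 would be absorbed by it, so the sketch shifts x1 instead.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + x1 + 2 * x2 + 0.5 * x1 * x2 + rnorm(n)

# Move the zero point of x1; the information in the data is unchanged.
x1s <- x1 + 10

# Full model: both main effects plus the interaction.
full_a <- lm(y ~ x1 * x2)
full_b <- lm(y ~ x1s * x2)

# Reduced model: the x2 main effect is omitted.
red_a <- lm(y ~ x1 + x1:x2)
red_b <- lm(y ~ x1s + x1s:x2)

# Compare fitted values before and after the shift.
full_same <- isTRUE(all.equal(unname(fitted(full_a)), unname(fitted(full_b))))
red_same  <- isTRUE(all.equal(unname(fitted(red_a)),  unname(fitted(red_b))))

print(c(full_same = full_same, red_same = red_same))
```

In the full model the shifted and unshifted fits span the same column space, so the fitted values agree. In the reduced model they do not, which means the fit depends on where zero happens to sit for x1.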

harrelfe, May 07 '15 16:05

Point taken. Frankly, I thought this example would be a slam-dunk, but it isn't. Allow me to go back to my use case of including an interaction which contains a main effect that is captured by a fixed effect; for example, if one of the interacted variables is time-invariant in a country-year fixed effects model. If I apply the appropriate error term correction to the coefficient, am I still not allowed to include the interaction in my model?
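The fixed-effects case can be sketched with base R's lm() on a simulated panel. This is an illustration under stated assumptions, not a statement about how rms::ols treats the same formula: the panel dimensions, the variable id, and the data-generating process are all hypothetical. Here x2 is constant within each entity, so its main effect is collinear with the entity dummies.

```r
set.seed(1)
n_id <- 20                       # hypothetical number of entities
n_t  <- 10                       # hypothetical number of periods per entity
id   <- rep(seq_len(n_id), each = n_t)

x2    <- rnorm(n_id)[id]         # time-invariant: one value per entity
x1    <- rnorm(n_id * n_t)       # varies within entity
alpha <- rnorm(n_id)[id]         # entity fixed effects
y     <- alpha + x1 + 0.5 * x1 * x2 + rnorm(n_id * n_t)

# Entity dummies first, so the redundant x2 column is the one lm() drops.
fit_star  <- lm(y ~ factor(id) + x1 * x2)
fit_colon <- lm(y ~ factor(id) + x1 + x1:x2)

# x2 is aliased (NA) in the * model; the interaction is still estimated.
print(coef(fit_star)[c("x1", "x2", "x1:x2")])
print(coef(fit_colon)[c("x1", "x1:x2")])

same_fit <- isTRUE(all.equal(unname(fitted(fit_star)),
                             unname(fitted(fit_colon))))
print(same_fit)
```

In this setup the two formulas describe the same column space, so lm() reports NA for the x2 main effect in the * version and the remaining estimates match the hand-written version.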

wngrtn, May 07 '15 16:05

Why the term 'use case' instead of 'example'?

This is a better question for stats.stackexchange.com using the tag: regression-strategies

Please provide a specific example of what you are trying to accomplish.

harrelfe, May 07 '15 16:05