rstanarm

MRP vignette / case study: questions concerning interaction between "male" and "age"

Open fweber144 opened this issue 4 years ago • 4 comments

In the MRP vignette / case study, the model is formulated as follows:

cat_pref ~ factor(male) + factor(male) * factor(age) + 
    (1 | state) + (1 | age) + (1 | eth) + (1 | income)

My first question is if this shouldn't be

cat_pref ~ factor(male) + factor(male) : factor(age) + 
    (1 | state) + (1 | age) + (1 | eth) + (1 | income)

since factor(male) * factor(age) is equal to factor(male) + factor(age) + factor(male) : factor(age) and thus adds non-varying main effects for the age groups (which already have varying intercepts).
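
As a quick check of that expansion, base R's terms() can list the terms a formula expands to. This is only a sketch on the fixed-effects part of the formula: the (1 | ...) terms are left out on purpose, and no data or model fitting is involved.

```r
# Fixed-effects part of the vignette's formula
# (group-level terms omitted for this illustration).
f_star  <- ~ factor(male) + factor(male) * factor(age)
# The alternative using ":" instead of "*".
f_colon <- ~ factor(male) + factor(male):factor(age)

# terms() expands "*" and drops duplicated terms.
labels_star  <- attr(terms(f_star), "term.labels")
labels_colon <- attr(terms(f_colon), "term.labels")

print(labels_star)
# "factor(male)" "factor(age)" "factor(male):factor(age)"
print(labels_colon)
# "factor(male)" "factor(male):factor(age)"

# The only term that "*" adds on top of ":" is the age main effect.
print(setdiff(labels_star, labels_colon))
# "factor(age)"
```

So the two formulas differ only in the non-varying factor(age) main effect.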

My second question is if it wouldn't be better (in terms of consistency) to use the following model:

cat_pref ~ factor(male) + 
    (1 | state) + (1 + factor(male) | age) + (1 | eth) + (1 | income)

i.e. to use varying slopes for male. Or was there a reason for having varying intercepts for age, but non-varying male:age interactions?

fweber144, May 28 '20 11:05

@fweber144 Thanks for the comments/questions/suggestions!

@lauken13

My first question is if this shouldn't be

cat_pref ~ factor(male) + factor(male) : factor(age) + 
    (1 | state) + (1 | age) + (1 | eth) + (1 | income)

since factor(male) * factor(age) is equal to factor(male) + factor(age) + factor(male) : factor(age) and thus adds non-varying main effects for the age groups (which already have varying intercepts).

Yeah I think you're right that it would make more sense. @lauken13 what do you think?

My second question is if it wouldn't be better (in terms of consistency) to use the following model:

cat_pref ~ factor(male) + 
    (1 | state) + (1 + factor(male) | age) + (1 | eth) + (1 | income)

i.e. to use varying slopes for male. Or was there a reason for having varying intercepts for age, but non-varying male:age interactions?

In some sense I think you're right, but one issue is that there are only two levels of the male variable and it's hard to estimate varying slopes (in particular the hierarchical standard deviation) from only two levels. @lauken13 am I remembering correctly that this was the reason for doing this?

jgabry, May 28 '20 16:05

Nah, I think you're right @fweber144. There are a couple of issues with the model we used in this case study: one is this, and the other is the priors we used (the default priors are not great for the logistic regression we use). It's been on my to-do list, but it keeps getting bumped.

@jgabry I'll try to make revisions in the next couple days and push.

lauken13, May 29 '20 01:05

@jgabry @lauken13 Thanks for your quick replies. Concerning @jgabry's comment:

In some sense I think you're right, but one issue is that there are only two levels of the male variable and it's hard to estimate varying slopes (in particular the hierarchical standard deviation) from only two levels.

I think this does not apply here as the grouping variable is age (which has 7 levels). It would indeed apply if male was used as a grouping variable, but that's not the case here (exactly for this reason, I was assuming).
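
To make the distinction concrete, here is a small base-R sketch that pulls the grouping factor out of the proposed term and counts its levels. The data frame d is a made-up stand-in (7 age groups, 2 values of male), not the vignette's data, and the object names are only for illustration.

```r
# The proposed group-level term, kept as an unevaluated expression.
re_term <- quote((1 + factor(male) | age))

# Strip the outer parentheses to get the call to `|`.
bar <- re_term[[2]]
varying  <- deparse(bar[[2]])  # what varies: "1 + factor(male)"
grouping <- deparse(bar[[3]])  # what it varies by: "age"

# Hypothetical stand-in data: every combination of 2 sexes and 7 age groups.
d <- expand.grid(male = 0:1, age = 1:7)

# Number of groups available for estimating the hierarchical SDs.
n_groups <- nlevels(factor(d[[grouping]]))
print(c(grouping = grouping, n_groups = n_groups))

# For comparison: the number of levels of male, which is not the grouping factor here.
n_male <- nlevels(factor(d$male))
print(n_male)
```

The hierarchical standard deviations are estimated across the levels of the variable to the right of the bar, so here across the 7 age groups rather than the 2 levels of male.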

fweber144, May 29 '20 07:05

I think this does not apply here as the grouping variable is age (which has 7 levels). It would indeed apply if male was used as a grouping variable, but that's not the case here (exactly for this reason, I was assuming).

Yeah, you're right. I thought you meant using male as a grouping variable, but I think I read your initial question too quickly. Looking back now, what you said makes sense to me! Thanks for following up.

jgabry, Jun 01 '20 16:06