
Add explanations for what exactly "Intercept" differentials mean

Open fedarko opened this issue 5 years ago • 4 comments

This has come up before, but I'm making it an issue here so it's officially written down somewhere.

From discussion with @antgonza and many other people :) Relates to biocore/qurro#229.

fedarko avatar Oct 03 '19 20:10 fedarko

Yea, I need to write up a blog post on this - that'll be up within the next 3 weeks

mortonjt avatar Oct 04 '19 14:10 mortonjt

Was chatting with @antgonza today about formula stuff, and I found this video from one of the Patsy devs -- it does a super good job explaining both categorical encodings and intercept stuff.

A few relevant timestamps:

  • Intercept stuff: around 4:45
    • Explanation about why reference categories are needed in general: around 8:35
  • Treatment coding stuff: around 3:12

For "normal" uses of Patsy (treatment coding with a single categorical variable), the intercept is the mean of whatever the "reference" group is, and every other coefficient represents a difference from this mean. So e.g. in the OLS example data on the screen at around 6:40, the Intercept coefficient (group 1 reference) is 46.4583, and the group 2 coefficient is 11.5417. And when you set group 2 as the reference instead, the group 1 coefficient is -11.5417 (because things have been flipped now), and the Intercept coefficient, which is now the group 2 mean, is 58 (aka 46.4583 + 11.5417).
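Here's a rough sketch of that flip in plain Python, in case it helps. It doesn't use Patsy or Songbird at all: `fit_treatment_coding` is a hypothetical helper I made up that does the two-group least-squares fit by hand, and the numbers are toy values (not the data from the video). It only illustrates the ordinary OLS case, not how Songbird's Intercept differentials should be read.

```python
def fit_treatment_coding(values_by_group, reference):
    """Least-squares fit of y ~ 1 + indicator(non-reference group).

    Hypothetical helper for illustration only; handles exactly two groups.
    Returns (intercept, coefficient of the non-reference group).
    """
    if len(values_by_group) != 2 or reference not in values_by_group:
        raise ValueError("expected exactly two groups, one of them the reference")

    # Build the indicator column: 0 for the reference group, 1 for the other.
    x, y = [], []
    for group, values in values_by_group.items():
        for value in values:
            x.append(0.0 if group == reference else 1.0)
            y.append(float(value))

    # Closed-form simple linear regression (normal equations for one predictor).
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Toy data (made up): group1 has mean 45, group2 has mean 60.
toy = {"group1": [40, 50, 45], "group2": [55, 60, 65]}

print(fit_treatment_coding(toy, reference="group1"))  # (45.0, 15.0)
print(fit_treatment_coding(toy, reference="group2"))  # (60.0, -15.0)
```

With group1 as the reference, the intercept is the group1 mean and the other coefficient is the difference between the means. Switching the reference flips the sign of that difference and moves the intercept to the group2 mean, which is the same pattern as the 46.4583 / 11.5417 / 58 numbers above.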

I'm not quite sure how this translates to an interpretation of the Intercept differentials you get, but at the very least it'd be good to add a link to this video to the README in the future.

fedarko avatar Feb 04 '20 08:02 fedarko

Thanks for raising this issue, fedarko! I had the same question.

senaj avatar Mar 10 '20 20:03 senaj

For reference, @mortonjt has written a blog post here explaining this in the context of Songbird. We may want to add a link to this from the README.

fedarko avatar May 19 '20 21:05 fedarko