
How to handle categorical covariates with more than two levels in pg.partial_corr?

Open • JohannesWiesner opened this issue 1 year ago • 1 comment

Hi, my data includes subjects from 3 different studies. The study is currently encoded as an object-type column (i.e. study = ['study_1', 'study_2', 'study_1', etc.]). I would like to check the correlation between two variables of interest while controlling for batch effects (by providing covar=['study']). Is it possible to do this with pg.partial_corr? I've found that I can't pass the variable study as an object type (which would be nice, by the way), which probably means I need to include it as a dummy-coded variable? Otherwise, if I simply map the study names to integers (0, 1, 2), pingouin would misleadingly treat study as a continuous covariate?

JohannesWiesner, Apr 17 '24 09:04

Hi @JohannesWiesner,

Yes, the correct solution here is indeed to use dummy coding, omitting one of the levels as the reference category (so three studies become two dummy columns). You may find this useful: https://stats.oarc.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/

Thanks,
Raphael

raphaelvallat, Apr 20 '24 20:04