remove highly correlated categorical variables

Open zhaoliang0302 opened this issue 2 years ago • 2 comments

Hi,

step_corr() can remove highly correlated continuous variables using Pearson or Spearman correlation. However, the recipes package does not provide an equivalent prefilter for categorical variables. I have 20 columns of categorical variables (one-hot encoded), and I want to remove redundant columns that are correlated with each other. Can you give me some advice? Thanks

Best regards

zhaoliang0302, Oct 11 '22 03:10

Hello @zhaoliang0302, I have been thinking about such steps for a while. Do you know of any existing methods that would work for such an operation?

EmilHvitfeldt, Mar 30 '23 21:03

Hi all, I just came across this issue. I reviewed the JOSS submission for {latentcor}, which is on CRAN and might provide a versatile solution for logical, numeric, and categorical variables.

corybrunson, Mar 31 '24 13:03
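
For the one-hot case described in the original question, one possible approach is to note that the Pearson correlation between two 0/1 columns is the phi coefficient, so a correlation filter can be applied to the indicator columns directly. The following is a minimal, language-agnostic sketch in plain Python, not recipes or {latentcor} code. The function names `phi` and `drop_correlated`, the column names, and the 0.9 threshold are all hypothetical, and the greedy keep-first rule is a simplification that is not claimed to match what step_corr() does internally.

```python
from math import sqrt


def phi(x, y):
    """Phi coefficient: Pearson correlation of two 0/1 columns."""
    n = len(x)
    n1x, n1y = sum(x), sum(y)
    n11 = sum(a * b for a, b in zip(x, y))  # rows where both are 1
    den = sqrt(n1x * (n - n1x) * n1y * (n - n1y))
    # A constant column has no defined correlation; treat it as 0 here.
    return 0.0 if den == 0 else (n * n11 - n1x * n1y) / den


def drop_correlated(columns, threshold=0.9):
    """Greedy filter (illustrative assumption): walk the columns in
    order and keep one only if its |phi| with every already-kept
    column is at or below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(phi(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up one-hot data for illustration only.
data = {
    "color_red":  [1, 0, 0, 1, 0, 1],
    "color_blue": [0, 1, 1, 0, 1, 0],  # exact complement of color_red
    "size_large": [1, 1, 0, 0, 1, 0],
}
kept = drop_correlated(data)
print(kept)
```

Here `color_blue` is dropped because it is the exact complement of `color_red` (phi of -1), while `size_large` is kept. Which column of a correlated pair survives depends on column order, so a real implementation would likely want a more principled tie-breaking rule.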