Signed Mutual Information

Open tetomonti opened this issue 1 year ago • 0 comments

[Note: I'm adding the content of my email here for record keeping]

revealer returns a signed MI because it multiplies the actual MI by the sign of the features' correlation. In the code, you will see that cond_mutual_inf has the step (line 202):

`CIC <- sign(rho) * sqrt(1 - exp(-2 * CMI))`

And the mutual_inf_v2 function has the step (line 248):

`IC <- sign(rho) * sqrt(1 - exp(-2 * MI))`

Both steps rescale the (conditional) MI to the range [0, 1) via sqrt(1 - exp(-2 * MI)) and then multiply the result by the sign of the correlation (rho) between the two variables.
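
For the record, the transformation in those two lines can be written out as follows. The Gaussian case is included only as a sanity check of why the rescaling is natural; it is a standard identity, not something taken from the revealer code.

$$
IC = \operatorname{sign}(\rho)\,\sqrt{1 - e^{-2\,MI}}
$$

If the two variables were bivariate Gaussian with correlation $\rho$, then $MI = -\tfrac{1}{2}\ln(1-\rho^2)$, so $\sqrt{1 - e^{-2\,MI}} = |\rho|$ and the signed value recovers $\rho$ itself. For example, $\rho = -0.6$ gives $MI \approx 0.223$ and $IC = -0.6$.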

I think we can do the same in our knnmi-based score. In order not to lose efficiency, we could call the cor function once on the entire set of features, i.e., when computing the MI between X and all the remaining features, say, REST, do something like:

MI <- knnmi(X,REST,Z)
RHO <- cor(X,REST)
SMI <- MI * sign(RHO)
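
To make the vectorized sign step above concrete, here is a minimal base-R sketch. It makes several assumptions: `signed_mi` is a hypothetical helper (not part of CaDrA or knnmi), the MI values are placeholders standing in for whatever the knnmi call returns, and REST is laid out with one feature per column. It also treats features with undefined correlation (e.g., constant features) as having sign 0, which is a design choice open to discussion.

```r
# Hypothetical helper: attach the sign of the Pearson correlation
# to a vector of precomputed MI scores (one score per feature).
signed_mi <- function(mi, x, rest) {
  # rest: numeric matrix, one feature per column, rows aligned with x
  stopifnot(length(x) == nrow(rest), length(mi) == ncol(rest))
  # A single cor() call covers all features; constant columns yield NA
  # (with a warning, suppressed here).
  rho <- suppressWarnings(as.vector(cor(x, rest)))
  s <- sign(rho)
  s[is.na(s)] <- 0  # assumption: undefined correlation -> score of 0
  mi * s
}

# Toy data: one positively correlated, one negatively correlated,
# and one constant feature.
x <- c(1, 2, 3, 4, 5)
rest <- cbind(up   = c(2, 4, 6, 8, 10),
              down = c(5, 4, 3, 2, 1),
              flat = c(1, 1, 1, 1, 1))
mi <- c(0.9, 0.8, 0.1)  # placeholder MI values, not real knnmi output

smi <- signed_mi(mi, x, rest)
print(smi)
```

Note that `sign()` returns 0 for a correlation of exactly 0, so such a feature would get a signed score of 0 regardless of its MI.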

tetomonti · Aug 29 '23 15:08