
Kernel Conditional Independence Tests

Open rflperry opened this issue 3 years ago • 5 comments

Testing for conditional independence, X ⫫ Y | Z, is a common problem within causal discovery and feature selection. The following two kernel-based methods are able to perform this test without too many assumptions.

Kernel Conditional Independence (KCI) Test [paper][matlab code]

  • Well known and commonly used in practice from my understanding.
  • Computes kernel matrices in each of the variables X, Y, Z to compute a test statistic.
  • Approximates the null distribution using a Gamma distribution. No permutation test available.
  • [Edit] Python code in this package
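
To make the "kernel matrices in each variable" bullet concrete, here is a minimal NumPy sketch of a KCI-style statistic. It assumes Gaussian kernels with a median-heuristic bandwidth and a ridge parameter `eps`; the function names and defaults are hypothetical and not taken from the linked MATLAB or Python code. The Gamma approximation of the null is omitted, so this returns only the statistic, not a p-value.

```python
import numpy as np


def gaussian_gram(a):
    """Gaussian Gram matrix with bandwidth set by the median heuristic (an assumption)."""
    sq = (a**2).sum(1)[:, None] + (a**2).sum(1)[None, :] - 2.0 * a @ a.T
    sq = np.maximum(sq, 0.0)
    sigma2 = np.median(sq[np.triu_indices(len(a), k=1)])  # median squared distance
    return np.exp(-sq / (2.0 * sigma2))


def center(k):
    """Double-center a Gram matrix: H K H with H = I - 11^T / n."""
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ k @ h


def kci_statistic(x, y, z, eps=1e-3):
    """Tr(K_{(X,Z)|Z} K_{Y|Z}) / n, where each kernel is 'residualized' on Z."""
    n = x.shape[0]
    k_xz = center(gaussian_gram(np.hstack([x, z])))  # kernel on (X, Z) jointly
    k_y = center(gaussian_gram(y))
    k_z = center(gaussian_gram(z))
    # R_Z = eps * (K_Z + eps I)^{-1} removes what kernel ridge regression on Z explains
    r_z = eps * np.linalg.inv(k_z + eps * np.eye(n))
    k_xz_given_z = r_z @ k_xz @ r_z
    k_y_given_z = r_z @ k_y @ r_z
    return float(np.trace(k_xz_given_z @ k_y_given_z) / n)


# Small demo: X and Y share the common cause Z; in the second case they also share noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
e1, e2 = rng.normal(size=(200, 1)), rng.normal(size=(200, 1))
x = z + e1
print("X indep Y | Z :", kci_statistic(x, z + e2, z))
print("X dep Y | Z   :", kci_statistic(x, z + e1 + 0.1 * e2, z))
```

A larger statistic points away from conditional independence; turning it into a p-value requires the null approximation described in the paper.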

A Permutation-Based Kernel Conditional Independence (KCIP) Test [paper][matlab code]

  • Potentially an improvement to KCI but not as widely used or known, partially due to speed constraints.
  • Computes kernel matrices in each of the variables X, Y, Z.
  • Also provides a two-layer bootstrap permutation test by:
    • Finding a permutation Y' of Y based on minimizing the permuted Z distances.
    • Performing a two-sample test (MMD) on the original (X, Y, Z) and permuted (X, Y', Z).
  • Improves upon KCI when its null is not well specified (complex, higher-dimensional Z), or if Z can be clustered well or is discrete.
  • Also provides an analytic approximation of the null distribution using a Gamma distribution.
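
The permute-then-compare idea can be sketched as follows. This is a simplified illustration, not the paper's procedure: a greedy nearest-neighbour pairing in Z stands in for the distance-minimizing permutation, the sample is split once into an untouched half and a permuted half, and the bootstrap layers are omitted. All function names are hypothetical.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def greedy_z_permutation(z):
    """Pair each point with its nearest unpaired neighbour in Z and swap them.

    A greedy stand-in (assumption) for the optimized permutation; with an odd
    number of points the last one is left in place.
    """
    n = z.shape[0]
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    perm = np.arange(n)
    free = np.ones(n, dtype=bool)
    for i in range(n):
        if not free[i]:
            continue
        free[i] = False
        if not free.any():
            break  # odd n: nothing left to pair with
        j = int(np.argmin(np.where(free, dist[i], np.inf)))
        perm[i], perm[j] = j, i
        free[j] = False
    return perm


def mmd2_biased(a, b, sigma):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return float(
        gaussian_gram(a, a, sigma).mean()
        + gaussian_gram(b, b, sigma).mean()
        - 2.0 * gaussian_gram(a, b, sigma).mean()
    )


def kcip_style_statistic(x, y, z, seed=0):
    """MMD^2 between an untouched half (X, Y, Z) and a Z-locally permuted half (X, Y', Z)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.shape[0])
    half = len(idx) // 2
    keep, shuf = idx[:half], idx[half:]
    perm = greedy_z_permutation(z[shuf])
    original = np.hstack([x[keep], y[keep], z[keep]])
    permuted = np.hstack([x[shuf], y[shuf][perm], z[shuf]])
    pooled = np.vstack([original, permuted])
    d = np.linalg.norm(pooled[:, None, :] - pooled[None, :, :], axis=2)
    sigma = np.median(d[np.triu_indices(len(pooled), k=1)])  # median heuristic
    return mmd2_biased(original, permuted, sigma)


rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
x = z + rng.normal(size=(200, 1))
y = z + rng.normal(size=(200, 1))
print("MMD^2 (original vs permuted):", kcip_style_statistic(x, y, z))
```

Swapping Y only between points with similar Z approximately preserves the X–Z and Y–Z relationships while breaking any remaining X–Y link, which is why a two-sample test between the two halves can serve as a conditional independence test.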

A nonparametric test based on regression error (FIT) [paper] [python code]

  • A bit more fringe than KCI/KCIP but provides good simulation comparisons between all three methods plus more.
  • Uses a nonparametric regression (in their case, a decision tree) to examine the change in predictive power based on including versus excluding some variables Z.
  • Uses the mean squared error as a test statistic and an analytic Gaussian/t-test approach to compute a p-value.
  • Seemingly efficient for large sample sizes as compared to other kernel-based approaches.
  • Interesting connections in that trees/forests are adaptive kernel methods and extensions to forests/honesty/leaf permutations.
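
A rough sketch of the regression-error idea, written in this issue's X ⫫ Y | Z notation: predict Y from Z alone and from (X, Z), then test whether the extra variables lower the held-out squared error. Several pieces are assumptions made to keep the example NumPy-only: a k-nearest-neighbour regressor stands in for the decision tree, a single train/test split is used, and a one-sided normal approximation stands in for the t-test. The names `knn_predict` and `regression_ci_test` are hypothetical and not from the linked Python code.

```python
import math

import numpy as np


def knn_predict(train_x, train_y, test_x, k=10):
    """k-nearest-neighbour regression (stand-in for the decision tree)."""
    d = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)


def regression_ci_test(x, y, z, k=10, seed=0):
    """Return (t, p) for H0: adding X to Z does not reduce the error of predicting Y.

    x and z are (n, d) arrays, y is a 1-D array of length n.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2:]
    xz = np.hstack([x, z])
    err_z = (y[test] - knn_predict(z[train], y[train], z[test], k)) ** 2
    err_xz = (y[test] - knn_predict(xz[train], y[train], xz[test], k)) ** 2
    diff = err_z - err_xz  # positive when X helps prediction
    t = diff.mean() / (diff.std(ddof=1) / math.sqrt(len(diff)))
    p = 0.5 * math.erfc(t / math.sqrt(2.0))  # one-sided normal approximation
    return float(t), float(p)


rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=(n, 1))
e1, e2 = rng.normal(size=n), rng.normal(size=n)
x = z + e1[:, None]
print("X indep Y | Z :", regression_ci_test(x, z[:, 0] + e2, z))
print("X dep Y | Z   :", regression_ci_test(x, z[:, 0] + e1 + 0.1 * e2, z))
```

Because the cost is dominated by fitting and evaluating two regressors, this style of test avoids building n × n kernel matrices, which is consistent with the efficiency point above.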

rflperry avatar Oct 26 '21 09:10 rflperry

Interested

zdbzdb123123 avatar Feb 03 '22 17:02 zdbzdb123123

@zdbzdb123123 which one? Once you have decided, please make a new issue with the description and link to this issue

sampan501 avatar Feb 03 '22 17:02 sampan501

KCI, and will do

zdbzdb123123 avatar Feb 03 '22 18:02 zdbzdb123123

I also discovered a package with Python code and MATLAB wrappers.

  1. KCI code
  2. KCIP code

The package has some other things, including a small notebook with simulations to test the tools.

rflperry avatar Feb 04 '22 10:02 rflperry

Interested in FIT

MatthewZhao26 avatar Feb 09 '22 05:02 MatthewZhao26