Post Submission tasks

Open sampan501 opened this issue 1 year ago • 2 comments

  • [ ] Add MGC and Adaptive HSIC (https://projecteuclid.org/journals/annals-of-statistics/volume-50/issue-2/Adaptive-test-of-independence-based-on-HSIC-measures/10.1214/21-AOS2129.short), as well as http://dx.doi.org/10.1093/biomet/asz024
  • [ ] Computational complexity figure supplement
  • [ ] Marron and Wald 1992 simulations for MIGHT, https://github.com/neurodata/mendseqs/issues/9
  • [ ] MVN simulations for Co-MIGHT
  • [ ] Fix MIGHT to subsample data per tree
  • [ ] Fix MIGHT test to use the Coleman method (but randomizing permutations per tree)
  • [ ] Add script for pulling data from public repository for real data analysis
  • [ ] Prove that a MIGHT statistic (e.g., S@98) from Variable Set 1 can be shown to be significantly different from the same statistic from Variable Set 2, even though the dimension of Variable Set 1 is far different from the dimension of Variable Set 2 (needed by the time we receive reviews from Science).
  • [x] Run dimension power curves for Figure 1 and Supplement for smaller sample size

sampan501, Jan 07 '24 22:01

The general MVN approach can maybe be done as Jovo suggested (w/ some open questions):

X_i | Y ~ MVN, where for CoMIGHT, we generate two such instances that are either directly dependent or not.

Y = mixture of MVNs, so the MI term is then: $I(X1, X2; Y) = H(X1, X2) - H(X1, X2 | Y) = H(X1 | X2) + H(X2) - H(X1 | X2, Y) - H(X2 | Y)$

where the non-trivial parts to currently compute are:

  • H(X1 | X2): unsure how to compute this analytically, unless we numerically integrate?
  • H(X1 | X2, Y) is the same

Maybe we generate a huge MVN first where we know the $\Sigma_{X1, X2}$ for the subset of variables we denote X1, X2, which is still MVN, and therefore we know H(X1, X2). Then, we use Y as the mixture of Gaussians w/ varying mixture probability?
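For the single "huge MVN" case, the conditional entropy does have a closed form, since $H(X1 | X2) = H(X1, X2) - H(X2)$ and both terms are Gaussian entropies. Below is a minimal sketch of that calculation in NumPy. The covariance, the index split into X1 and X2, and the helper names `gaussian_entropy` and `gaussian_conditional_entropy` are all made up for illustration. Note that this only applies when (X1, X2) is jointly Gaussian, e.g. within a single class; once X is marginally a mixture over Y, the marginal entropy is no longer Gaussian.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of an MVN with covariance `cov`."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    # H = 0.5 * log det(2 * pi * e * Sigma)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)


def gaussian_conditional_entropy(cov, idx_a, idx_b):
    """H(A | B) = H(A, B) - H(B) for jointly Gaussian blocks A and B."""
    cov = np.asarray(cov, dtype=float)
    idx_ab = list(idx_a) + list(idx_b)
    h_ab = gaussian_entropy(cov[np.ix_(idx_ab, idx_ab)])
    h_b = gaussian_entropy(cov[np.ix_(idx_b, idx_b)])
    return h_ab - h_b


# Hypothetical 6-dimensional MVN; the first three variables play the
# role of X1 and the last three the role of X2.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6))
sigma = a @ a.T + 6 * np.eye(6)  # symmetric positive definite
x1_idx, x2_idx = [0, 1, 2], [3, 4, 5]

h_joint = gaussian_entropy(sigma)                       # H(X1, X2)
h_x2 = gaussian_entropy(sigma[np.ix_(x2_idx, x2_idx)])  # H(X2)
h_x1_given_x2 = gaussian_conditional_entropy(sigma, x1_idx, x2_idx)
print(h_joint, h_x2, h_x1_given_x2)
```

The same helper gives $H(X1 | X2, Y=k)$ when applied to the class-$k$ covariance, which is the piece needed for the conditional terms.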

adam2392, Jan 08 '24 15:01

Structuring the covariance in blocks as such and then using $Y \in \{1, 2\}$ to select the corresponding multivariate normal should allow us to:

  1. arbitrarily apply feature-wise transformations for a specific class -> then there is a functional relationship between $X$ and $Y$.
  2. compute analytical CMI and MI, because we would have an analytical solution for $H(X)$ and $H(X | Y) = P(Y=1) H(X | Y=1) + P(Y=2) H(X | Y=2) = P(Y=1) H(X^{(1)}) + P(Y=2) H(X^{(2)})$
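A rough sketch of this setup, under assumptions: the two class means, covariances, and priors below are invented for illustration, and `mixture_mi` is a hypothetical helper, not part of scikit-tree. $H(X | Y)$ is computed in closed form as the prior-weighted sum of per-class Gaussian entropies. For $H(X)$, the sketch falls back on a Monte Carlo estimate from samples of the mixture rather than assuming a closed form, which makes it a sanity check more than an analytical solution.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of an MVN with covariance `cov`."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (cov.shape[0] * np.log(2 * np.pi * np.e) + logdet)


def gaussian_logpdf(x, mean, cov):
    """Row-wise log density of MVN(mean, cov) evaluated at the rows of x."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (len(mean) * np.log(2 * np.pi) + logdet + maha)


def mixture_mi(means, covs, weights, n_samples=20000, seed=0):
    """Estimate I(X; Y) = H(X) - H(X | Y) where X | Y=k ~ MVN(means[k], covs[k])."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)

    # Closed form: H(X | Y) = sum_k P(Y=k) * H(X | Y=k).
    h_x_given_y = sum(w * gaussian_entropy(c) for w, c in zip(weights, covs))

    # Draw Y from the class prior, then X from the selected MVN.
    labels = rng.choice(len(weights), size=n_samples, p=weights)
    x = np.empty((n_samples, len(means[0])))
    for k, (m, c) in enumerate(zip(means, covs)):
        mask = labels == k
        x[mask] = rng.multivariate_normal(m, c, size=int(mask.sum()))

    # Monte Carlo estimate: H(X) ~= -mean(log p(x)), with p the mixture density.
    log_comp = np.stack(
        [np.log(w) + gaussian_logpdf(x, m, c)
         for w, m, c in zip(weights, means, covs)],
        axis=1,
    )
    h_x = -np.mean(np.logaddexp.reduce(log_comp, axis=1))
    return h_x - h_x_given_y, h_x, h_x_given_y


# Hypothetical two-class example (classes indexed 0 and 1 here).
means = [[0.0, 0.0], [3.0, 3.0]]
covs = [np.eye(2), np.array([[1.0, 0.5], [0.5, 1.0]])]
mi, h_x, h_x_given_y = mixture_mi(means, covs, weights=[0.5, 0.5])
print(mi, h_x, h_x_given_y)
```

Since $I(X; Y) \le H(Y)$, the estimate for a balanced two-class problem should stay below $\log 2$ nats up to Monte Carlo error, which gives a quick check on any simulation built this way.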

Now I'm not 100% sure how to fit this in w/ the Marron/Wald

IMG_2247

and

IMG_2248

adam2392, Jan 09 '24 21:01