scikit-tree
Post Submission tasks
- [ ] Add MGC and Adaptive HSIC (https://projecteuclid.org/journals/annals-of-statistics/volume-50/issue-2/Adaptive-test-of-independence-based-on-HSIC-measures/10.1214/21-AOS2129.short), as well as http://dx.doi.org/10.1093/biomet/asz024
- [ ] Computational complexity figure supplement
- [ ] Marron and Wald 1992 simulations for MIGHT, https://github.com/neurodata/mendseqs/issues/9
- [ ] MVN simulations for Co-MIGHT
- [ ] Fix MIGHT to subsample data per tree
- [ ] Fix MIGHT test to use the Coleman method (but randomizing permutations per tree)
- [ ] Add script for pulling data from public repository for real data analysis
- [ ] Prove that a MIGHT statistic (e.g., S@98) from Variable Set 1 can be significantly different from the same statistic from Variable Set 2, even though the number of dimensions in Variable Set 1 is far different from the number of dimensions in Variable Set 2 (needed by the time we receive reviews from Science).
- [x] Run dimension power curves for Figure 1 and Supplement for smaller sample size
The general MVN approach could perhaps be done as Jovo suggested (w/ some open questions):
X_i | Y ~ MVN, where for CoMIGHT, we generate two such instances that are either directly dependent or not.
Y = mixture of Gaussians, so the MI term is then: $I(X1, X2; Y) = H(X1, X2) - H(X1, X2 | Y) = H(X1 | X2) + H(X2) - H(X1 | X2, Y) - H(X2 | Y)$
where the non-trivial parts to currently compute are:
- H(X1 | X2): unsure how to compute this analytically, unless we numerically integrate?
- H(X1 | X2, Y): same issue
Maybe we generate a huge MVN first where we know the $\Sigma_{X1, X2}$ for the subset of variables we denote X1, X2, which is still MVN, and therefore we know H(X1, X2). Then, we use Y as the mixture of Gaussians w/ varying mixture probability?
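For the jointly Gaussian case described above, the entropies have closed forms, so $H(X1, X2)$ and $H(X1 | X2) = H(X1, X2) - H(X2)$ can be read off the covariance blocks. A minimal sketch, assuming a hypothetical 4-dimensional covariance and hypothetical helper names (`mvn_entropy`, `mvn_conditional_entropy`); this only covers the single-MVN case, not the mixture:

```python
import numpy as np


def mvn_entropy(cov):
    """Differential entropy (in nats) of a multivariate normal with covariance `cov`."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)


def mvn_conditional_entropy(cov, idx_a, idx_b):
    """H(A | B) = H(A, B) - H(B) for jointly Gaussian blocks A and B of `cov`."""
    idx_ab = list(idx_a) + list(idx_b)
    joint = cov[np.ix_(idx_ab, idx_ab)]
    marg_b = cov[np.ix_(idx_b, idx_b)]
    return mvn_entropy(joint) - mvn_entropy(marg_b)


# Hypothetical covariance: X1 is the first two variables, X2 the last two.
cov = np.array([
    [1.0, 0.2, 0.5, 0.0],
    [0.2, 1.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.2],
    [0.0, 0.5, 0.2, 1.0],
])
x1, x2 = [0, 1], [2, 3]

h_joint = mvn_entropy(cov)                             # H(X1, X2)
h_x1_given_x2 = mvn_conditional_entropy(cov, x1, x2)   # H(X1 | X2)
print(f"H(X1, X2) = {h_joint:.4f} nats, H(X1 | X2) = {h_x1_given_x2:.4f} nats")
```

The same value for $H(X1 | X2)$ comes from the entropy of the Schur complement $\Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$, which is a handy cross-check.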
Structuring the covariance in blocks as such and then using $Y \in \{1, 2\}$ to select the corresponding multivariate normal should allow us to:
- arbitrarily apply feature-wise transformations for a specific class -> then there is a functional relationship between $X$ and $Y$.
- compute analytical CMI and MI, because we would have an analytical solution for $H(X)$ and $H(X | Y) = P(Y=1) H(X | Y=1) + P(Y=2) H(X | Y=2) = P(Y=1) H(X^{(1)}) + P(Y=2) H(X^{(2)})$
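As a rough sanity check on the class-selected MVN setup, here is a sketch that computes $H(X | Y)$ in closed form, weighting each class-conditional Gaussian entropy by its class probability, and estimates $I(X; Y) = H(X) - H(X | Y)$. Note the assumptions: the means, covariances, and function names are hypothetical, and $H(X)$ of the mixture is estimated by Monte Carlo here rather than analytically.

```python
import numpy as np


def mvn_entropy(cov):
    """Differential entropy (in nats) of a multivariate normal with covariance `cov`."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)


def mvn_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of `x`."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def mixture_mi(means, covs, probs, n_samples=20000, seed=0):
    """Return (I(X; Y), H(X), H(X | Y)) in nats for X | Y=k ~ N(means[k], covs[k])."""
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)

    # H(X | Y) is analytic: probability-weighted sum of Gaussian entropies.
    h_x_given_y = sum(p * mvn_entropy(c) for p, c in zip(probs, covs))

    # Sample (X, Y) from the mixture.
    y = rng.choice(len(probs), size=n_samples, p=probs)
    x = np.empty((n_samples, means[0].shape[0]))
    for k, (m, c) in enumerate(zip(means, covs)):
        mask = y == k
        x[mask] = rng.multivariate_normal(m, c, size=int(mask.sum()))

    # Monte Carlo estimate of H(X) = -E[log p(X)] under the mixture density.
    log_comp = np.stack(
        [np.log(p) + mvn_logpdf(x, m, c) for p, m, c in zip(probs, means, covs)],
        axis=1,
    )
    log_px = np.logaddexp.reduce(log_comp, axis=1)
    h_x = -log_px.mean()
    return h_x - h_x_given_y, h_x, h_x_given_y


# Hypothetical two-class example with well-separated means.
means = [np.zeros(2), np.full(2, 3.0)]
covs = [np.eye(2), np.eye(2)]
mi, h_x, h_x_given_y = mixture_mi(means, covs, probs=[0.5, 0.5])
print(f"I(X; Y) ~ {mi:.3f} nats (upper bound H(Y) = {np.log(2):.3f})")
```

Since $Y$ is binary, the estimate should stay below $H(Y) \le \log 2$ up to Monte Carlo noise, and it should be close to zero when the two class-conditional distributions coincide.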
Now I'm not 100% sure how to fit this in w/ the Marron/Wald
and