MendelIHT.jl
Roadmap for future development
Major Features
- Extend IHT to Cox model.
- Re-enable GPU computing as once provided by @klkeys.
- Add option to possibly constrain IHT estimates to specified upper and lower bounds
- Provide `rowmask` and `colmask` keywords telling IHT which samples/columns it is allowed to use
- Add routines to grab biologically relevant information as input arguments for sparse group projections and knowledge aided projections, as previously investigated by @gdmosher.
- Consider regularizing the covariance matrix for multivariate IHT
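
To make the bound-constraint item concrete, here is a minimal sketch of one way a projection step could combine sparsity with box constraints. The function name `project_box_sparse` and its signature are hypothetical and not part of MendelIHT.jl, and the sketch assumes scalar bounds with `lb <= 0 <= ub` so that zeroing an entry is always feasible. Under that assumption, keeping the `k` entries whose clamped values reduce the squared error the most gives the Euclidean projection onto the constraint set.

```julia
# Hypothetical helper, not part of MendelIHT.jl: project x onto
# {z : at most k nonzeros, lb <= z_i <= ub}, assuming lb <= 0 <= ub.
function project_box_sparse(x::AbstractVector{T}, k::Integer, lb::T, ub::T) where {T<:Real}
    lb <= 0 <= ub || throw(ArgumentError("this sketch assumes lb <= 0 <= ub"))
    clamped = clamp.(x, lb, ub)
    # Reduction in squared error from keeping entry i (clamped) instead of zeroing it.
    gain = abs2.(x) .- abs2.(x .- clamped)
    # Indices of the k largest gains.
    keep = partialsortperm(gain, 1:min(k, length(x)); rev=true)
    z = zeros(T, length(x))
    z[keep] = clamped[keep]
    return z
end

x = [3.0, -0.5, 1.2, -4.0, 0.1]
z = project_box_sparse(x, 2, -1.0, 2.0)
```

Per-coefficient bounds, or bounds that exclude zero, would need a different rule, so this should be read as a starting point for discussion rather than a proposed API.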
Minor Features:
- Documentation and unit testing (these will always be a priority).
- Add routine to handle interaction terms (SNP-SNP or SNP-environment)
- Test and validate GLM code for Gamma, Inverse Gaussian, and Binomial regression
- Develop our own `fit` function for debiasing step. Currently we rely on the `fit` function in the `GLM.jl` package, which can sometimes suffer unexpected crashes. One solution is to use the `glm_regress` function in MendelBase.jl
- Add routines to internally compute, and then use, the top principal components to account for sub-population structure
- Add option in wrapper functions to possibly work on `Float32` matrices instead of always defaulting to `Float64`
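
As a rough illustration of the principal-component and `Float32` items, the sketch below computes the top PCs of a standardized genotype matrix with a selectable element type. The helper `top_pcs` and its keyword `T` are hypothetical names, not existing MendelIHT.jl functions, and the dense `svd` is only an assumption that is reasonable for small matrices; real SNP data would likely need a truncated or randomized method.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper, not part of MendelIHT.jl: return the top `npc`
# principal component scores of G, computed in element type T.
function top_pcs(G::AbstractMatrix{<:Real}, npc::Integer; T::Type{<:AbstractFloat}=Float64)
    X = Matrix{T}(G)
    X .-= mean(X, dims=1)      # center each column (SNP)
    s = std(X, dims=1)
    s[s .== 0] .= one(T)       # avoid dividing by zero for monomorphic columns
    X ./= s                    # scale each column to unit variance
    F = svd(X)
    # PC scores are the left singular vectors scaled by their singular values.
    return F.U[:, 1:npc] .* F.S[1:npc]'
end

# Toy dosage matrix: 5 samples by 3 SNPs.
G = Float64[0 1 2; 1 1 0; 2 0 1; 0 2 2; 1 0 0]
pcs32 = top_pcs(G, 2; T=Float32)
```

The resulting columns could then be appended to the non-genetic covariates before running IHT.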
I might be able to help with the Cox model. I'm currently working on a proximal gradient algorithm with L1 penalization for it.
After 42f6d2d, IHT now (basically) works on Windows machines. Cross-validation cannot automatically delete intermediate files for Windows users, but I think that's a weird IO error that Julia Base has to deal with.
After https://github.com/OpenMendel/SnpArrays.jl/pull/57 and https://github.com/OpenMendel/SnpArrays.jl/pull/61, it became possible to
- Run linear algebra directly with a `SnpArray` (no need to convert to `SnpBitMatrix`)
- Run linear algebra on the GPU for Julia 1.4+! `A^tx` is ~30x faster than with the `SnpBitMatrix` type.

Linear algebra on `SnpBitMatrix` is also 3-4x faster than before. I will clean up the linear algebra code and add support for these.
For binary PLINK files, gradient computations on CPU have been parallelized since MendelIHT v1.4.x for both univariate and multivariate IHT. This is achieved internally by the `SnpLinAlg` type of SnpArrays.
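
For readers unfamiliar with what parallelizing the gradient means here, the following is a generic sketch on a dense matrix. It is not how `SnpLinAlg` is implemented. The function `threaded_xtr!` is a hypothetical name, and the sketch only assumes that the dominant cost is the product `X' * r` of the transposed design matrix with a residual vector, which splits naturally across columns.

```julia
using LinearAlgebra

# Illustrative only: compute g = X' * r with one column per loop iteration,
# letting Julia distribute iterations across the available threads.
function threaded_xtr!(g::AbstractVector, X::AbstractMatrix, r::AbstractVector)
    Threads.@threads for j in axes(X, 2)
        g[j] = dot(view(X, :, j), r)   # each iteration writes a distinct entry
    end
    return g
end

X = [1.0 0.0 2.0; 0.0 1.0 1.0; 2.0 2.0 0.0; 1.0 1.0 1.0]
r = [0.5, -1.0, 0.25, 2.0]
g = threaded_xtr!(zeros(3), X, r)
```

Starting Julia with several threads (for example `julia --threads 4`) lets the loop run in parallel; with a single thread it gives the same result serially.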