
machine learning for genomic variants

Results 76 VariantSpark issues

When using the VariantSpark interface for Hail to run importance analysis, VariantSpark expects exactly one allele in each of the REF and ALT fields. If there was any issue (some datasets have...

bug

An option to request calculation of the FDR (False Discovery Rate) for importance scores using permuted labels. (It should be possible to build the permuted forests in parallel.)
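
One common way to estimate an FDR from permuted labels can be sketched as follows. This is only an illustration of the idea, not VariantSpark's API: the function name `empirical_fdr` and its inputs (a list of observed importances plus one list of importances per permuted-label run) are assumptions made for the example.

```python
from typing import Sequence


def empirical_fdr(
    observed: Sequence[float],
    permuted: Sequence[Sequence[float]],
    threshold: float,
) -> float:
    """Estimate the FDR at `threshold` from permuted-label importances.

    Hypothetical helper: `observed` holds importances from the real labels,
    `permuted` holds one list of importances per permuted-label run.
    """
    if not permuted:
        raise ValueError("at least one permuted run is required")

    # Number of variables called significant on the real labels
    n_selected = sum(1 for v in observed if v >= threshold)
    if n_selected == 0:
        return 0.0

    # Average number of variables passing the threshold under the null
    expected_false = sum(
        sum(1 for v in run if v >= threshold) for run in permuted
    ) / len(permuted)

    return min(1.0, expected_false / n_selected)
```

Each permuted run is independent of the others, which is why the forests could in principle be built in parallel before being passed to a function like this.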

Create an automated way to spin up a Spark cluster on AWS with VariantSpark installed so researchers can get started easily. This extends the work done by Lynn Langit as...

enhancement

Implement a readable output format, e.g. JSON, for the trained trees. This will help evaluate the resulting interactions and visualise them.

enhancement
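
A readable tree export of this kind might look like the sketch below. The `Node` class and the JSON field names (`variable`, `threshold`, `prediction`, `left`, `right`) are hypothetical and chosen only for illustration; they do not reflect VariantSpark's internal tree representation. The sketch assumes binary trees in which every internal node has both children.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary decision-tree node (not VariantSpark's own type)."""
    variable: Optional[str] = None     # split variable; None for a leaf
    threshold: Optional[float] = None  # split point; None for a leaf
    prediction: Optional[int] = None   # predicted label; set on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_to_dict(node: Node) -> dict:
    """Recursively convert a tree into nested, JSON-serialisable dicts."""
    if node.left is None and node.right is None:
        return {"prediction": node.prediction}
    return {
        "variable": node.variable,
        "threshold": node.threshold,
        "left": tree_to_dict(node.left),
        "right": tree_to_dict(node.right),
    }


# Example: a single split with two leaves
tree = Node(
    variable="chr1:1000",
    threshold=0.5,
    left=Node(prediction=0),
    right=Node(prediction=1),
)
tree_json = json.dumps(tree_to_dict(tree), indent=2)
```

A nested layout like this keeps the parent-child structure explicit, which makes it straightforward to trace which variables appear together on a path when looking for interactions.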

Hi, I just started trying out VariantSpark with Scala, using its featureImportance function on my own VCF file. This worked perfectly, as I followed the example code from the...

help wanted

Add support for continuous variables for trees.

Add functionality to generate variable importances for a selected number of random permutations of labels. Output the variable importance for each permutation.

Could VariantSpark be used for multiple labels, instead of just binary? Thanks

enhancement

Add functionality to simulate sequencing errors in phased and unphased genotypes (VCF files). Types of errors and other parameters to be defined.
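
Since the error types are still to be defined, the sketch below assumes the simplest possible model: each called allele of a biallelic genotype is flipped independently with a fixed probability. The function name `add_genotype_errors` and the `error_rate` parameter are hypothetical. Genotypes are given as VCF GT strings, where `/` separates unphased alleles and `|` separates phased ones.

```python
import random
from typing import List, Sequence


def add_genotype_errors(
    genotypes: Sequence[str],
    error_rate: float,
    rng: random.Random,
) -> List[str]:
    """Flip each called biallelic allele (0 <-> 1) with probability `error_rate`.

    Assumed error model for illustration only. Missing alleles (".") and
    the phasing separator are left unchanged.
    """
    noisy = []
    for gt in genotypes:
        sep = "|" if "|" in gt else "/"  # keep phased/unphased status
        alleles = gt.split(sep)
        flipped = [
            ("1" if a == "0" else "0")
            if a in ("0", "1") and rng.random() < error_rate
            else a
            for a in alleles
        ]
        noisy.append(sep.join(flipped))
    return noisy


# Example: a seeded generator makes the simulation reproducible
noisy_calls = add_genotype_errors(["0/1", "1|1", "./."], 0.01, random.Random(42))
```

Passing the random generator in explicitly keeps runs reproducible, which matters when comparing results across different simulated error rates.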