Jeff Hammerbacher
We have two mentions of CuPy in our docs but to the best of my knowledge we don't use CuPy yet. Should we remove those mentions until we make use...
We've got two nice use cases of population and statistical genetics; a third use case that could attract users and contributors could be polygenic risk score computations. I'm only glancingly...
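As a rough illustration of what a polygenic risk score computation involves, here is a minimal sketch of the basic additive score (a weighted sum of effect-allele dosages per sample). The function name `polygenic_risk_scores`, the toy weights, and the choice to skip missing calls are all assumptions made for illustration; this is not `sgkit` API.

```python
def polygenic_risk_scores(dosages, weights):
    """Return one additive score per sample.

    dosages: list of per-sample lists of effect-allele counts (0, 1 or 2),
             with None marking a missing call.
    weights: per-variant effect sizes (hypothetical values in this sketch).
    """
    scores = []
    for sample in dosages:
        if len(sample) != len(weights):
            raise ValueError("each sample needs one dosage per variant")
        # Assumption: missing calls (None) simply contribute nothing.
        score = sum(w * g for w, g in zip(weights, sample) if g is not None)
        scores.append(score)
    return scores


# Toy data: 2 samples x 3 variants.
weights = [0.5, -0.25, 1.0]
dosages = [
    [0, 1, 2],
    [2, None, 0],
]
scores = polygenic_risk_scores(dosages, weights)
print(scores)
```

A real analysis would also need steps such as clumping, thresholding, and mean imputation of missing genotypes, none of which are shown here.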
https://github.com/statgenetics/statgen-courses from Columbia has several notebooks that we may want to consider porting to `sgkit`.
Filing for a future in which #279 is solved: https://choishingwan.github.io/PRS-Tutorial, as pointed out by @tomwhite, provides code to accompany [Tutorial: a guide to performing polygenic risk score analyses](https://pubmed.ncbi.nlm.nih.gov/32709988/) (2020). It...
@eric-czech mentioned on a [recent developer call](https://github.com/pystatgen/sgkit/discussions/553) that we use Numba rather than CuPy to target GPUs because CuPy does not have masked array support (see https://github.com/cupy/cupy/issues/2225).
Chris Chang recently published a [protocol](https://link.springer.com/protocol/10.1007/978-1-0716-0199-0_3) for PLINK with instructions on how to perform common data management operations. It might be useful to bring this protocol over to our documentation...
Given the presence of wheels for all 3 of our upstream IO libraries, I think it makes sense to favor convenience now and have `pip install sgkit` pull in the...
- [VCF 4.2 spec](https://samtools.github.io/hts-specs/VCFv4.2.pdf)
- Example VCF file: https://storage.googleapis.com/hail-tutorial/1kg.vcf.bgz
- [cyvcf2.pyx](https://github.com/brentp/cyvcf2/blob/master/cyvcf2/cyvcf2.pyx)
- Header types: 'CONTIG', 'FILTER', 'FORMAT', 'GENERIC', 'INFO'
- [vcf_reader.py](https://github.com/pystatgen/sgkit/blob/master/sgkit/io/vcf/vcf_reader.py)

### ##INFO

- These fields are (usually?) per variant...
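
To make the five header types listed above concrete, here is a small stdlib-only sketch that sorts VCF meta-information lines into those buckets. The mapping from line key to type (in particular, treating everything other than `INFO`, `FORMAT`, `FILTER` and `contig` as 'GENERIC') is an assumption for illustration, and `classify_header_line` is a hypothetical helper, not a cyvcf2 or sgkit function.

```python
import re

# Matches meta-information lines of the form ##key=value.
HEADER_RE = re.compile(r"^##(?P<key>[^=]+)=(?P<value>.*)$")

# Assumed mapping from the header key to the type names listed above.
STRUCTURED_KEYS = {
    "INFO": "INFO",
    "FORMAT": "FORMAT",
    "FILTER": "FILTER",
    "contig": "CONTIG",
}


def classify_header_line(line):
    """Return the header type name for one ##-prefixed VCF header line."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a VCF meta-information line: {line!r}")
    # Anything not explicitly mapped falls back to GENERIC in this sketch.
    return STRUCTURED_KEYS.get(match.group("key"), "GENERIC")


header_lines = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FILTER=<ID=q10,Description="Quality below 10">',
    "##contig=<ID=20,length=62435964>",
]
header_types = [classify_header_line(line) for line in header_lines]
print(header_types)
```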
For a variant `v`, it's nice to be able to say `v.FORMAT` and see the available field names. Would it be difficult to do the same for `v.INFO`?
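
For reference, the behaviour being asked for can be sketched on a raw VCF record with the standard library alone. This is only an illustration of the desired output, not how cyvcf2 implements or exposes `v.FORMAT` or `v.INFO`; `info_field_names` and `format_field_names` are hypothetical helpers, and the record is the style of example used in the VCF spec.

```python
def _columns(record_line):
    """Split one tab-delimited VCF data line into its columns."""
    columns = record_line.rstrip("\n").split("\t")
    if len(columns) < 8:
        raise ValueError("a VCF data line has at least 8 columns")
    return columns


def info_field_names(record_line):
    """Return the INFO keys present on this record, in file order."""
    info = _columns(record_line)[7]
    if info == ".":  # "." marks an empty INFO column
        return []
    # Flag fields such as DB have no "=value" part; split handles both forms.
    return [entry.split("=", 1)[0] for entry in info.split(";") if entry]


def format_field_names(record_line):
    """Return the FORMAT keys for this record, or [] if there is no FORMAT column."""
    columns = _columns(record_line)
    return columns[8].split(":") if len(columns) > 8 else []


record = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2\tGT:GQ:DP:HQ\t0|0:48:1:51,51"
info_names = info_field_names(record)
format_names = format_field_names(record)
print(info_names, format_names)
```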