snps
tools for reading, writing, merging, and remapping SNPs
Add ability to compute additional summary statistics:

* Percentage NaN (autosomal, X, Y, MT)
* Percentage homozygous and heterozygous (autosomal, X)
* GC content (autosomal, X, Y, MT)

Related to...
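A minimal sketch of how such statistics might be computed on a normalized dataframe, assuming the `chrom` and `genotype` columns described elsewhere on this page. The helper `summarize` and the sample data are hypothetical and not part of `snps`; "GC content" is taken here to mean the share of G and C among called alleles, which is only one possible definition.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data in the normalized four-column layout.
df = pd.DataFrame(
    {
        "rsid": ["rs1", "rs2", "rs3", "rs4", "rs5", "rs6"],
        "chrom": ["1", "1", "2", "2", "X", "X"],
        "pos": [100, 200, 300, 400, 500, 600],
        "genotype": ["AA", "AG", np.nan, "CC", "GG", "CT"],
    }
)

AUTOSOMES = {str(i) for i in range(1, 23)}


def summarize(frame):
    """Return percentage NaN, homozygous, heterozygous, and GC (assumed definitions)."""
    genotypes = frame["genotype"]
    called = genotypes.dropna()
    # Zygosity is only defined for two-allele genotypes.
    diploid = called[called.str.len() == 2]
    hom = int((diploid.str[0] == diploid.str[1]).sum())
    het = len(diploid) - hom
    # Assumption: GC content = share of G/C among all called alleles.
    alleles = "".join(called)
    gc = sum(base in "GC" for base in alleles)
    return {
        "pct_nan": 100 * genotypes.isna().mean(),
        "pct_hom": 100 * hom / len(diploid) if len(diploid) else float("nan"),
        "pct_het": 100 * het / len(diploid) if len(diploid) else float("nan"),
        "pct_gc": 100 * gc / len(alleles) if alleles else float("nan"),
    }


autosomal_stats = summarize(df[df["chrom"].isin(AUTOSOMES)])
x_stats = summarize(df[df["chrom"] == "X"])
```

The same helper can be applied to the Y and MT subsets by filtering on `chrom`; the zygosity percentages are only meaningful where genotypes have two alleles.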
Per the Ensembl README (ftp://ftp.ensembl.org/pub/grch37/current/fasta/homo_sapiens/dna/README):

> Human has sequenced Y chromosomes and the pseudoautosomal region (PAR) on the Y is annotated. By definition the PAR region is identical on the...
Some newer [Family Tree DNA](https://www.familytreedna.com) famfinder files seem to have SNPs assigned to chromosome 0. Similar to `SNPs._assign_par_snps`, use the [RefSNP](https://api.ncbi.nlm.nih.gov/variation/v0/) API to assign each of these SNPs to a...
Utilize NCBI's [Variation Services](https://api.ncbi.nlm.nih.gov/variation/v0/) API to populate missing RSIDs. An idea for how this could be performed... For all SNPs in a dataset, look up RSIDs in batches of 50k SNPs...
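The batching step alone could be sketched as follows. This shows only how a dataset might be split into groups of 50k SNPs; it makes no API call, and the helper name `batched` and the placeholder positions are assumptions for illustration.

```python
def batched(items, size=50_000):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder (chrom, pos) pairs standing in for SNPs that lack an RSID.
snps_without_rsid = [("1", pos) for pos in range(120_001)]

batch_sizes = [len(batch) for batch in batched(snps_without_rsid)]
```

Each batch would then be passed to whatever lookup routine is chosen; that routine is deliberately left out here.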
Currently, `snps` normalizes data into a dataframe with four columns named `rsid`, `chrom`, `pos`, and `genotype`. `genotype` can either be `np.nan` or a string of length 1 or 2. For...
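For illustration, a dataframe in this layout might look like the sketch below. The rows are made up, and `genotype_is_valid` is a hypothetical check written for this example, not a function provided by `snps`.

```python
import numpy as np
import pandas as pd

# Made-up rows in the four-column layout: rsid, chrom, pos, genotype.
df = pd.DataFrame(
    {
        "rsid": ["rs1", "rs2", "rs3", "rs4"],
        "chrom": ["1", "1", "X", "MT"],
        "pos": [100, 200, 300, 400],
        "genotype": ["AA", "AG", np.nan, "T"],
    }
)


def genotype_is_valid(value):
    """True if value is NaN or a string of length 1 or 2."""
    if isinstance(value, str):
        return len(value) in (1, 2)
    return bool(pd.isna(value))


valid = df["genotype"].map(genotype_is_valid)
```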
Hi, I ran into this issue when reading a VCF file: `ValueError: invalid literal for int() with base 10: '48169282.0': Error while type casting for column 'pos'` Version: snps...
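The error can be reproduced outside of `snps`, since Python's `int()` rejects strings that contain a decimal point. The sketch below shows the failure and one possible workaround, parsing the values as numbers before casting to integers. It assumes the `pos` values arrive as strings such as `'48169282.0'`; it is not a description of how `snps` handles this internally.

```python
import pandas as pd

# Positions as they might appear when read as text, some with a trailing ".0".
pos = pd.Series(["48169282.0", "752566", "1000.0"])

# int() cannot parse a string with a decimal point.
try:
    int(pos.iloc[0])
    raised = False
except ValueError:
    raised = True

# Possible workaround: parse as numeric first, then cast to a 64-bit integer.
pos_int = pd.to_numeric(pos).astype("int64")
```

Going through a float is exact for genomic positions, which are far below 2**53, but it would silently truncate a value with a real fractional part, so a stricter check may be preferable in practice.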