Specify value used to define missing genotypes.

Open JunpengShi opened this issue 6 years ago • 5 comments

Dear Nick,

I'm wondering whether your XPCLR can deal with missing genotypes.

The previous XPCLR (Chen et al., 2010) encodes missing genotypes as 9.

I used the same file format to run your XPCLR and found that the overwhelming majority of SNPs were reported as multiallelic, possibly due to the missing genotypes?

2019-06-03 09:38:15 : INFO : running xpclr v1.1.1
2019-06-03 09:38:15 : INFO : Loading TXT
2019-06-03 09:39:38 : INFO : TXT loading complete
2019-06-03 09:39:38 : INFO : 1,214,768 SNPs in total are in the provided input files
2019-06-03 09:39:39 : INFO : 1,212,857 SNPs excluded as multiallelic
2019-06-03 09:39:39 : INFO : 0 SNPs excluded as missing in all samples in a population
2019-06-03 09:39:39 : INFO : 605 SNPs excluded as invariant or singleton in population 2
2019-06-03 09:39:39 : INFO : 1,306/1,214,768 SNPs included in the analysis (0.11%)
2019-06-03 09:39:40 : INFO : Done dropping above SNPs from analysis. XP-CLR algorithm starting.
2019-06-03 09:39:40 : INFO : Omega estimated as : 0.236594
2019-06-03 09:40:12 : INFO : Analysis complete. Output file ./chr7.17parviglumis_23landraces

Sincerely, Junpeng

JunpengShi avatar Jun 03 '19 01:06 JunpengShi

My genotype file looks like this: 0 0 9 9 1 1 9 9 0 0 0 0 9 9 9 9 9 9 0 0 0 0 9 9 0 0 9 9 0 0 1 1 0 0 0 0 9 9 0 0 9 9 0 0 0 0 0 0 9 9 9 9 0 0 0 0 0 0 0 0 9 9 0 0 0 0 0 0 0 0 9 9 0 0 9 9 0 0 9 9 0 0 0 0 0 0 0 0 0 0 9 9 0 0 9 9 0 0 0 0 0 0 0 0 0 0 9 9 9 9 0 0 9 9 0 0 9 9 0 0 0 0 0 0 0 0 9 9 9 9 0 0 9 9 0 0 0 0 9 9 0 0 9 9 0 0 9 9 0 0 9 9 0 0 0 0 0 0 0 0 9 9 0 0 0 0 9 9 0 0 0 0 9 9 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 0 0 9 9 0 0 0 0 9 9 0 0 0 0 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 9 0 0 0 0 0 0 0 0 1 1 9 9 0 0 9 9 0 0 9 9 1 1 1 1 0 0 0 0 1 1 1 1 9 9 1 1 1 1 0 0 1 1 1 1 9 9 9 9 0 0 0 0 1 1 1 1 1 1 9 9 9 9 1 1 1 1 9 9 1 1 1 1 1 1 1 1 1 1 9 9 9 9 0 0 9 9 0 0 1 1 1 1 0 0 0 0 1 1 1 1 9 9 1 1 1 1 9 9 1 1

JunpengShi avatar Jun 03 '19 01:06 JunpengShi

Yes. Missing genotypes should be encoded as -1, following the convention of VCF.

hardingnj avatar Jun 03 '19 16:06 hardingnj
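
For anyone converting files from the older format, here is a minimal sketch of the recoding described in the reply above: it rewrites each whitespace-delimited genotype row so that the code 9 becomes -1. The function names `recode_missing` and `recode_file` are hypothetical helpers for illustration and are not part of xpclr. The sketch assumes one SNP per line and that 9 never appears as a genuine allele code.

```python
def recode_missing(line, old="9", new="-1"):
    """Replace whole-token occurrences of `old` with `new` in one genotype row.

    Matching is done per token, so a value such as "19" is left untouched.
    """
    return " ".join(new if tok == old else tok for tok in line.split())


def recode_file(src_path, dst_path, old="9", new="-1"):
    """Rewrite a whitespace-delimited genotype file (assumed: one SNP per line)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(recode_missing(line, old, new) + "\n")


# Example on the first few calls from the genotype row posted above.
row = "0 0 9 9 1 1 9 9 0 0"
print(recode_missing(row))  # prints: 0 0 -1 -1 1 1 -1 -1 0 0
```

Calling `recode_file("in.geno", "out.geno")` (both paths are placeholders) would apply the same replacement to a whole file, writing the result to a new file rather than editing in place.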

Thanks Nick. It works now!

JunpengShi avatar Jun 05 '19 01:06 JunpengShi

Leaving this open as a note to document this more fully. Generally I want to encourage users to give VCF or zarr as inputs though.

hardingnj avatar Jun 20 '19 08:06 hardingnj

Dear hardingnj, when I run xpclr I get this error: ValueError: arange: cannot compute length. How can I deal with it?

kizzhengwangshan avatar Dec 10 '19 03:12 kizzhengwangshan