grenedalf
How to understand the statistics of the screen output?
Dear @lczech,
I have four pooled sequencing libraries. I finished SNP calling with bwa + GATK, which identified a total of 9.8 M SNPs (after filtering for depth and missing rate). I found that grenedalf can work directly with BAM and VCF input, so I first calculated FST using the BAM files as input. At the end of the run, the following statistics were printed. How should I interpret each statistic, and which number best represents the number of SNPs? If 28568428 represents the number of SNPs, it is very different from 9.8 M.
Sample filter summary (summed up across all samples):
Passed: 4036079496
Empty (after counts): 32041114
Above max coverage: 263818
Total filter summary (after applying all sample filters):
Passed: 28568428
Below min coverage: 51188538
Not SNP: 937243195
Finished 2023-10-02 23:48:24
When using VCF as the input, the statistics look much more reasonable. It seems that 9795814 represents the number of SNPs after filtering by minor allele count > 0.
Processed 39 chromosomes with 9795814 (non-filtered) positions in 47245 windows.
Total filter summary (after applying all sample filters):
Passed: 9795814
Not SNP: 120324
Finished 2023-10-03 10:29:20
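To make the comparison between the two runs concrete, here is a rough Python sketch that adds up the "Total filter summary" counts from both outputs. It only does arithmetic on the numbers pasted above. The helper `parse_summary` is my own, not part of grenedalf, and I am assuming (without confirmation) that the categories are mutually exclusive, so that their sum is the number of positions examined.

```python
import re

def parse_summary(text):
    """Collect 'Label: count' lines into a dict (hypothetical helper, not a grenedalf API)."""
    counts = {}
    for line in text.splitlines():
        # Label must contain no digits or colons, so timestamp lines are skipped.
        m = re.match(r"\s*([^:\d]+):\s*(\d+)\s*$", line)
        if m:
            counts[m.group(1).strip()] = int(m.group(2))
    return counts

# Counts copied from the BAM run above.
bam = parse_summary("""
Passed: 28568428
Below min coverage: 51188538
Not SNP: 937243195
""")

# Counts copied from the VCF run above.
vcf = parse_summary("""
Passed: 9795814
Not SNP: 120324
""")

# Assumption: categories do not overlap, so the sum is the total positions seen.
bam_total = sum(bam.values())
vcf_total = sum(vcf.values())

for name, counts, total in (("BAM", bam, bam_total), ("VCF", vcf, vcf_total)):
    passed = counts["Passed"]
    print(f"{name}: {passed} passed of {total} positions ({passed / total:.2%})")
```

If that assumption holds, the BAM run looked at roughly a billion positions, while the VCF run only looked at the positions already present in my filtered VCF, which would explain why the two "Passed" numbers are not directly comparable.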
Sincerely, Zhuqing Zheng