F4-statistics from unlinked SNPs from a SNP array
I have a question about the usage of this tool: can it be used to calculate F4-statistics on my data, which consist of unlinked SNPs from a SNP array? Will the simulations run by fastsimcoal2 (as called by F4.py) be affected by this?
Hi, I guess that with SNP array data, D- or F4-statistics could be influenced by how the SNPs were originally selected for the array. I assume this was not done randomly, but with variability in the species/population in mind? Besides that, yes, the F4 tool can calculate the F4 statistic from such data. But the statistic itself shouldn't be any different if you calculate it with a tool like Dsuite, and the latter would be much faster. The F4 tool might only report a different p-value, because that is what the simulations are used for. In your case, I would probably just run Dsuite for now, and if you worry that the p-value might be affected by the jackknifing method, you could also run the F4 tool.
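To illustrate why the statistic itself is the same regardless of the tool while the significance can depend on the jackknife, here is a minimal Python sketch. It is not the code of Dsuite or of the F4 tool; the function names `f4` and `f4_block_jackknife`, the allele-frequency inputs, and the simple unweighted delete-one-block jackknife are all assumptions made for illustration.

```python
import numpy as np

def f4(pa, pb, pc, pd):
    """f4(A,B;C,D) as the mean over SNPs of (pA - pB) * (pC - pD)."""
    return float(np.mean((pa - pb) * (pc - pd)))

def f4_block_jackknife(pa, pb, pc, pd, n_blocks):
    """Return (estimate, standard error, Z) from a delete-one-block jackknife.

    Assumption: SNPs are in genomic order and are split into n_blocks
    contiguous blocks of (nearly) equal SNP count; no block weighting is used.
    """
    per_snp = (pa - pb) * (pc - pd)
    blocks = np.array_split(per_snp, n_blocks)
    total, n = per_snp.sum(), per_snp.size
    # Estimate recomputed with each block left out in turn
    loo = np.array([(total - b.sum()) / (n - b.size) for b in blocks])
    est = total / n
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    z = est / se if se > 0 else float("nan")
    return est, se, z

# Hypothetical data: random allele frequencies for four populations
rng = np.random.default_rng(1)
pa, pb, pc, pd = rng.random((4, 5000))
for n_blocks in (10, 50, 200):
    est, se, z = f4_block_jackknife(pa, pb, pc, pd, n_blocks)
    print(n_blocks, est, se, z)
```

In this sketch the point estimate is identical for every block count; only the standard error, and therefore Z, changes with the blocking.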
Thank you for the prompt response, even on a Sunday (I really appreciate it). Apart from the SNP array, I also have WGS SNPs, so there I could use random SNPs located relatively far apart along the genome. And yes, indeed, I was worried about how using different jackknife block sizes affects the Z-values, which is why I wanted to use this tool! My only worry was whether the simulation parameters used by fastsimcoal2 (in F4.py) are specific to RAD-seq (e.g. SNPs generated in blocks).
With WGS SNPs, I probably would not worry too much about linkage in the calculation of D or F4. You could of course try with and without thinning the dataset, but I wouldn't expect much difference (unless, for example, a gigantic inversion region has a large influence). But the F4 tool is definitely applicable to data other than RAD-seq data. In that sense, SNP array data should be fine.
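If you want to try the thinned version, a minimal Python sketch of distance-based thinning could look like the following. The function name `thin_snps` and the inputs are hypothetical, and it assumes that SNPs are already sorted by chromosome and position; it simply keeps a SNP whenever it is at least `min_dist` bp away from the previously kept SNP on the same chromosome.

```python
def thin_snps(chroms, positions, min_dist):
    """Return indices of SNPs kept after thinning by minimum distance.

    Assumption: input is sorted by chromosome, then by position.
    """
    keep = []
    last_chrom, last_pos = None, None
    for i, (chrom, pos) in enumerate(zip(chroms, positions)):
        # Always keep the first SNP on a new chromosome
        if chrom != last_chrom or pos - last_pos >= min_dist:
            keep.append(i)
            last_chrom, last_pos = chrom, pos
    return keep

# Hypothetical example: five SNPs on two chromosomes, thinned to >= 1000 bp
chroms = ["1", "1", "1", "2", "2"]
positions = [100, 500, 1200, 50, 2000]
kept = thin_snps(chroms, positions, 1000)
```

The statistic can then be calculated once on all SNPs and once on the kept subset to check whether thinning makes a difference.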
Okay, thank you again for the prompt response! I will try this tool as well as Dsuite on both my WGS and SNP array data.