LDhat

Setting a proper theta in filtered/subsampled SNP datasets

Open melop opened this issue 7 years ago • 2 comments

Dear Adam,

After reading the manual it sounded like a minor mis-specification of theta doesn't affect the estimate of rho that much. But I hope to understand the proper method for setting theta if I have quality-filtered my segregating sites (e.g. based on allele frequency, SNP quality and sequencing coverage). If I use the input file format that only shows segregating sites (format 2), wouldn't that lower the estimate of theta? Should I go ahead with this lowered theta or instead estimate theta by other means and use that to generate a look-up table?

Another related question: do the pre-generated lookup tables in lk_files/ work for genotype datasets?

Best Regards, Ray

melop avatar Oct 09 '17 09:10 melop

Depending on your exact situation, one suggestion would be to estimate recombination rates using a range of (reasonable) theta values. My expectation is that the results should be qualitatively similar.

Yes - the likelihood files should work with genotype data, although it is often better to provide phased data if possible.
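As a rough illustration of the "range of theta values" idea, the sketch below computes a per-site theta from a count of segregating sites and spreads a small grid of values around it. It assumes Watterson's estimator, which is one common choice and is not named anywhere in this thread; the function names and all the counts are hypothetical and exist only for this example. Comparing the filtered and unfiltered counts shows how dropping SNPs lowers the estimate, and the grid brackets both values.

```python
# Minimal sketch, not part of LDhat: all names and numbers are illustrative.

def harmonic_number(n_sequences):
    """Return a_n = sum_{i=1}^{n-1} 1/i for n sampled sequences."""
    return sum(1.0 / i for i in range(1, n_sequences))


def watterson_theta_per_site(n_segregating, n_sequences, n_sites):
    """Watterson's estimator per site: S / (a_n * L)."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return n_segregating / (harmonic_number(n_sequences) * n_sites)


def theta_grid(center, fold=4.0, n_points=5):
    """Geometrically spaced values from center/fold to center*fold."""
    if n_points < 2:
        return [center]
    step = 2.0 / (n_points - 1)
    return [center * fold ** (-1.0 + k * step) for k in range(n_points)]


# Made-up inputs: 60 diploid individuals give 120 sampled chromosomes.
n_chromosomes = 120
region_length = 1_000_000      # callable sites in the region (assumed)
snps_before_filtering = 9_000  # assumed
snps_after_filtering = 6_000   # assumed

theta_raw = watterson_theta_per_site(snps_before_filtering, n_chromosomes, region_length)
theta_filtered = watterson_theta_per_site(snps_after_filtering, n_chromosomes, region_length)

print("theta before filtering:", theta_raw)
print("theta after filtering: ", theta_filtered)
print("grid around filtered value:", theta_grid(theta_filtered))
```

Each value in the grid would then be used to build (or select) a lookup table, and the resulting rho estimates compared to check that they are qualitatively similar.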

-- Adam Auton

auton1 avatar Oct 09 '17 15:10 auton1

Hi Adam, thank you for your help. I will try with different thetas. We only have ~60 individuals for our fish, which may be too few for accurate phasing.

melop avatar Oct 09 '17 15:10 melop