bigsnpr

LDpred2-auto: SNP subsets or genome-wide fit?

Open szhan1000 opened this issue 1 year ago • 2 comments

Dear Florian,

I have a very general question about LDpred2-auto. I understand that you recommend running LDpred2-auto on all SNPs genome-wide. But I wonder whether I could run it separately on SNP subsets that have different proportions of causal variants (e.g., as indicated by different levels of heritability enrichment from stratified LDSC). Since each subset could have a different proportion of causal SNPs, I thought LDpred2-auto might give a better, customized fit for each subset. After running each subset, I would combine the estimated coefficients and compute the genome-wide PRS.

I would appreciate it if you could provide some further advice.

Best, Shizhong

szhan1000, Oct 09 '22 14:10

No, I would not recommend it.

Perhaps, if there is one chromosome (or one region) contributing a lot, you could try running it once for this region, and once genome-wide excluding just this region. I am not sure whether this would actually provide a better fit.
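
To make the bookkeeping of such a split concrete, here is a minimal Python sketch of the combining step only. It is not bigsnpr code and does not run LDpred2-auto: the function names `combine_effects` and `polygenic_score`, the variant indices, and all numbers are made up for illustration, and the per-set effect sizes are assumed to come from two separate fits.

```python
def combine_effects(n_variants, subset_fits):
    """Scatter per-subset effect sizes into one genome-wide vector.

    subset_fits: list of (indices, effects) pairs, one per separately
    fitted variant set; the index sets must not overlap.
    """
    beta = [0.0] * n_variants
    seen = set()
    for indices, effects in subset_fits:
        if len(indices) != len(effects):
            raise ValueError("indices and effects must have the same length")
        for j, b in zip(indices, effects):
            if j in seen:
                raise ValueError(f"variant {j} appears in more than one subset")
            seen.add(j)
            beta[j] = b
    return beta


def polygenic_score(genotypes, beta):
    """PRS per individual: dot product of allele counts and effect sizes."""
    return [sum(g * b for g, b in zip(row, beta)) for row in genotypes]


# Toy values, made up for illustration only.
region = ([1, 2], [0.5, -0.25])         # hypothetical fit on the region alone
rest = ([0, 3, 4], [0.125, 0.0, -0.5])  # hypothetical fit excluding the region
beta = combine_effects(5, [region, rest])

genotypes = [[0, 1, 2, 0, 1],
             [2, 0, 1, 1, 0]]
scores = polygenic_score(genotypes, beta)
```

Variants not covered by either set keep an effect of zero. Note that two separate fits would not account for LD between a variant in the region and a variant outside it.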

In future work, I will probably consider allowing for enrichment (e.g. in the prior). But I am working on something else right now.

privefl, Oct 10 '22 06:10

Also, I would not trust enrichments from S-LDSC; I think they are often overestimated.

privefl, Oct 10 '22 06:10

Maybe they are not after all: https://doi.org/10.1101/2022.10.12.510418.

Anyway, I'll probably work on using more variants and functional annotations. It will take a bit of time, so closing for now.

privefl, Nov 10 '22 07:11