Difference between results from predict.py and the website (HumanBase)

Open · zofieLin opened this issue 4 years ago • 1 comment

Hi,

I used the script "predict.py" to run predictions on a VCF file, and I found some differences between these results and the results from the ExPecto website. For many variants the website returns no result, only a warning like "No significant predictions for rs188098026 found". However, I do get a result for them from "predict.py", so I wonder whether there is any difference between the two methods.

Besides, I downloaded the full file of variation potential predictions for all 140 million mutations (~125G). Each tissue file (e.g. effects_pergene_mat_Whole_Blood.txt) contains 6003 columns, but I couldn't find any description of these columns or their names. Could you please tell me where I can find it? Thanks.

Zofie

zofieLin, Jun 19 '20 08:06

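The website only contained variants with a predicted log fold change greater than 0.3 in at least one tissue, while the code provides all predictions regardless of their effect size. A variant below that threshold in every tissue therefore shows "No significant predictions" on the website but still gets a result from predict.py.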
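To reproduce roughly what the website shows from predict.py output, the threshold described above could be applied as in this minimal sketch. It assumes the cutoff is on the absolute value of the predicted log fold change (the reply does not say whether the sign matters), and the function name, variant names, and numbers are made up for illustration; they are not part of ExPecto.

```python
# Hypothetical helper, not part of ExPecto: applies the >0.3 cutoff
# described above to one variant's per-tissue predictions.
def passes_website_filter(effects, threshold=0.3):
    """Return True if any tissue has |log fold change| above the threshold.

    Assumption: the website cutoff is on the absolute value.
    """
    return any(abs(effect) > threshold for effect in effects)


# Made-up predictions: one list of per-tissue log fold changes per variant.
toy_predictions = {
    "variant_A": [0.02, -0.01, 0.05],  # small effect in every tissue
    "variant_B": [0.10, -0.45, 0.00],  # exceeds the cutoff in one tissue
}

shown_on_website = [
    name
    for name, effects in toy_predictions.items()
    if passes_website_filter(effects)
]
```

Here only `variant_B` would pass the filter, while `variant_A` would still have predictions in the predict.py output.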

For your second question, every entry of the matrix in the file corresponds to a variant in the effects_coors.txt file (there are 6003 variants per gene, which is why each row has 6003 columns). The order of variants and genes is the same as in the effects_coors.txt file.
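One way to read this correspondence is sketched below. It is only an illustration under several assumptions that the reply does not confirm: that effects_coors.txt has one variant per line with no header, that it lists all 6003 variants of the first gene, then all 6003 of the second, and so on, and that rows, columns, and lines are counted from 0. The function name is hypothetical.

```python
VARIANTS_PER_GENE = 6003  # number of columns in each tissue file, per the reply


def coords_line_index(gene_row, column, variants_per_gene=VARIANTS_PER_GENE):
    """Map a matrix entry (gene_row, column) to a 0-based line index.

    Assumption: the coordinates file is ordered gene by gene, with each
    gene's variants in the same order as the matrix columns.
    """
    if gene_row < 0:
        raise IndexError("gene_row must be non-negative")
    if not 0 <= column < variants_per_gene:
        raise IndexError("column must be in [0, variants_per_gene)")
    return gene_row * variants_per_gene + column


# Entry in the third gene row (index 2), sixth column (index 5).
line_index = coords_line_index(2, 5)
```

If the coordinates file turns out to have a header or a different layout, the offset would need to be adjusted accordingly.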

Jian

jzthree, Jun 20 '20 03:06