basenji
Why does the link for "Test gene expression predictions" not work?
Hi there,
I'm interested in predicting gene expression using Basenji and found that the link for "Test gene expression predictions" in the tutorial does not work. I saw that you answered the same question several days ago and suggested using basenji_predict_bed.py. I still don't understand how to get the gene expression and why we should choose 256 bp as the input length. Would you please explain a bit more?
I appreciate your help.
Leah
The Basenji framework is generally gene-agnostic, so to get gene expression predictions you have to request predictions from the model at specific sites that you know correspond to your genes of interest. basenji_predict_bed.py lets you make predictions for any region described by a BED file. So if you have some genes of interest, you can take their TSSs and make a BED file from them. Because the output bins are 128 bp and the sequence lengths are even numbers, no single bin sits exactly at the center; the best you can do is place the TSS on the boundary between the two center bins and request the sum of the predictions for those two bins. That covers 256 bp, which I think ought to be a good range, since most TSS annotations are a little bit imprecise.
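As a rough illustration of the BED step described above, here is a minimal Python sketch that turns a list of TSS positions into 256 bp BED intervals, with 128 bp on each side of the TSS. The helper names (`tss_window`, `bed_line`), the gene names, and the coordinates are all made up for the example, and it assumes the TSS positions are already 0-based, as BED expects. It only builds the BED file; how you pass that file to basenji_predict_bed.py is not shown here.

```python
import sys

BIN_SIZE = 128  # output bin width mentioned in the answer above


def tss_window(chrom, tss, name=".", strand="+", half_width=BIN_SIZE):
    """Return a BED-style tuple spanning half_width bp on each side of a TSS.

    Assumes `tss` is a 0-based coordinate. BED intervals are 0-based and
    half-open, so the window is [tss - half_width, tss + half_width).
    """
    start = max(0, tss - half_width)  # clamp at the chromosome start
    end = tss + half_width
    return (chrom, start, end, name, 0, strand)


def bed_line(record):
    """Format one record as a tab-separated BED6 line."""
    return "\t".join(str(field) for field in record)


if __name__ == "__main__":
    # Hypothetical genes and coordinates, for illustration only.
    genes = [
        ("chr1", 1_000_000, "geneA", "+"),
        ("chr2", 2_500_000, "geneB", "-"),
    ]
    for chrom, tss, name, strand in genes:
        sys.stdout.write(bed_line(tss_window(chrom, tss, name, strand)) + "\n")
```

Redirect the output to a file (for example `python make_tss_bed.py > tss.bed`, where both file names are placeholders) to get a BED file with one 256 bp interval per gene. If your annotation gives 1-based TSS positions, as GTF files do, subtract 1 before calling `tss_window`.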