TOBIAS

TFs below the footprint threshold are in the bound bed file, and TFs that shouldn't be in the unbound file are in there?

Open nessj216 opened this issue 1 year ago • 3 comments

Hello, I understand that found motifs with a p-value smaller than the threshold p-value make it to the TF list, where they are assigned their footprint score. (TF motifs with too high a p-value never get a footprint score, right?) Then TF motifs that have a high enough footprint score make it to the bound.bed file, while the TF motifs that have a footprint score below the threshold go to the unbound.bed file. BUT I have found a potential issue, if my understanding above is correct. I am getting footprint scores in my bound.bed file that are too low (they fall below the established threshold value). Conversely, I am getting motifs with high enough footprint scores that remain in the unbound.bed file. For example, here you can see the 'bound', 'unbound' and 'all' bed files in IGV, where the footprint threshold from TOBIAS (for this particular ATAC replicate) is calculated to be ~3.5. BUT all the motifs, even though their scores are above that threshold, are still on the unbound list.

Screen Shot 2022-07-30 at 3 12 30 PM

(P.S. I am NOT doing differential analysis of TF footprints between two conditions. I am only doing one condition. The input to BINDetect is 1) the footprint file made from ScoreBigwig, 2) the SAME peak bed file I used in all the other steps!, 3) my PWM file, 4) the genome file (Drosophila in this case).)

any help would be greatly appreciated!!!! thank you

nessj216 avatar Jul 30 '22 19:07 nessj216

Here is the converse, where TFs with footprint scores that are too low are in the bound file:

Screen Shot 2022-08-02 at 11 17 22 AM

nessj216 avatar Aug 02 '22 15:08 nessj216

Hi @nessj216 ,

The score shown in IGV is the motif score (5th column of the .bed file), which measures how well the TF motif matches the genome sequence, and it is independent of footprinting. The footprint score is found in the column named <condition>_score in the bindetect_output txt-file. There is also an overview of the output columns here. I hope this clears up your question!
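To make the distinction concrete, here is a minimal sketch (not part of TOBIAS itself) that splits sites using the `<condition>_score` column instead of the 5th .bed column. The table is a made-up toy stand-in for a BINDetect output file: the condition name `rep1`, every column name other than `<condition>_score`, and all values are assumptions for illustration only. The rule "at or above the threshold counts as bound" is also an assumption, and 3.5 is simply the approximate threshold mentioned in this thread.

```python
import csv
import io

# Toy stand-in for a tab-separated BINDetect output table.
# Only the "<condition>_score" naming comes from this thread; the other
# column names, the condition name "rep1" and all values are invented.
SAMPLE = (
    "TFBS_chr\tTFBS_start\tTFBS_end\tTFBS_name\tTFBS_score\tTFBS_strand\trep1_score\n"
    "chr2L\t100\t110\tMOTIF_A\t12.4\t+\t1.2\n"
    "chr2L\t500\t510\tMOTIF_B\t7.9\t-\t4.8\n"
    "chr3R\t900\t910\tMOTIF_C\t3.1\t+\t5.0\n"
    "chrX\t40\t50\tMOTIF_D\t9.0\t+\t0.4\n"
)


def split_by_footprint(table_text, condition, threshold):
    """Split rows into (bound, unbound) by the <condition>_score column.

    Assumption: a site counts as bound when its footprint score is at or
    above the threshold. The motif score column is deliberately ignored.
    """
    score_col = f"{condition}_score"
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    if score_col not in (reader.fieldnames or []):
        raise KeyError(f"column {score_col!r} not found in table header")
    bound, unbound = [], []
    for row in reader:
        target = bound if float(row[score_col]) >= threshold else unbound
        target.append(row)
    return bound, unbound


bound, unbound = split_by_footprint(SAMPLE, condition="rep1", threshold=3.5)

# Print the motif score next to the footprint score for each group.
for label, rows in (("bound", bound), ("unbound", unbound)):
    for row in rows:
        print(label, row["TFBS_name"],
              "motif score:", row["TFBS_score"],
              "footprint score:", row["rep1_score"])
```

In this toy table, MOTIF_A has a motif score far above 3.5 but is unbound, and MOTIF_C has a motif score below 3.5 but is bound. That is the same pattern as in the IGV screenshots, where the displayed value is the motif score and not the footprint score.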

BR Mette

msbentsen avatar Aug 03 '22 08:08 msbentsen

thank you a million!! sorry about that!! this is very helpful

nessj216 avatar Aug 03 '22 16:08 nessj216