TOBIAS
Footprinting for multiple tissues.
Thanks for developing TOBIAS.
I now have multiple tissues (>15) that I want to compare using footprinting. Is TOBIAS capable of such an analysis? Can you give me some advice?
TOBIAS is capable, yes, but you are going to need a fair amount of computational power (mostly RAM) to do that. This is because there are parts of the pipeline where the scores of every tissue need to be kept in memory, so more conditions -> more memory needed.
Firstly, if you haven't already, I recommend that you set up the TOBIAS snakemake pipeline for automatic calculation of peaks, correction of Tn5 bias, etc. The link is here: https://github.molgen.mpg.de/loosolab/TOBIAS_snakemake. Plotting all footprints for all conditions is going to take a while, but the initial correction-footprinting-comparison calculation is made easier by this pipeline.
Secondly, you might have a look at the `--time-series` option of `TOBIAS BINDetect`. Normally, the analysis will do all-against-all comparisons of conditions, but since you have >15 tissues, you are going to get >100 combinations! That is probably going to take forever. Instead, you can set `--time-series` such that the comparisons are only made between tissue1-tissue2, tissue2-tissue3 etc. You lose the individual comparisons, but these can be easily estimated post-run.
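To make the difference in the number of comparisons concrete, here is a small sketch that only counts the pairs; it does not call TOBIAS. The tissue names are placeholders, and 15 tissues is taken as the lower bound from the question.

```python
from itertools import combinations

# Placeholder names for 15 tissues (hypothetical, for counting only)
tissues = [f"tissue{i}" for i in range(1, 16)]

# All-against-all: every unordered pair of tissues
all_pairs = list(combinations(tissues, 2))

# Consecutive only: tissue1-tissue2, tissue2-tissue3, ...
consecutive_pairs = list(zip(tissues, tissues[1:]))

print(len(all_pairs))          # 105
print(len(consecutive_pairs))  # 14
```

With n tissues the all-against-all count grows as n(n-1)/2, while the consecutive count is only n-1.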
I hope some of these tips help you out!
Thank you for your patience in answering my questions. I want to plot a figure similar to the following one, but I do not know how to construct the data frame from the TOBIAS results. Is it possible to construct such a data frame from the output of TOBIAS?
Hi, I am not exactly sure what this figure shows - probably motif enrichment? It is not possible to create that exact plot with TOBIAS, but the output of TOBIAS will give you a `n_transcription_factors` x `n_conditions` table of footprint scores, which might be useful for creating a similar plot. Then you can load the table with python or R, and create the plot colored by the footprint scores.
Thanks for your reply. Do you mean that the sample_footprints_mean_score of each sample is extracted in bindetect_results.txt? I took this information to construct a matrix of n_TF x n_footprint_score, but the difference between these scores is too small to compare the differences between different tissues.
Yes, I mean bindetect_results.txt. In order to compare the scores, you might have to normalize each row, e.g. with a Z-score. Otherwise, the plot will be dominated by the gap between TFs with generally high footprint scores and those with generally low footprint scores, rather than by the differences between tissues. So maybe that is why the differences seem so small.
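As a rough sketch of the row-wise Z-score idea, the snippet below normalizes each TF across conditions using only the Python standard library. It runs on a tiny made-up table. The column names (`name`, `<tissue>_mean_score`), the tab-separated layout, and the values are assumptions for illustration, so check them against your own bindetect_results.txt.

```python
import csv
import io
from statistics import mean, pstdev

def zscore_row(values):
    """Z-score one TF's scores across conditions (population std)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # No variation across conditions: return zeros instead of dividing by 0
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

# Made-up stand-in for the results table (assumed tab-separated);
# for a real file, open it and pass the file handle instead of StringIO.
toy_table = (
    "name\ttissue1_mean_score\ttissue2_mean_score\ttissue3_mean_score\n"
    "TF_A\t0.50\t0.52\t0.54\n"
    "TF_B\t0.10\t0.12\t0.14\n"
)

reader = csv.DictReader(io.StringIO(toy_table), delimiter="\t")
# Assumption: per-condition score columns end with "_mean_score"
score_cols = [c for c in reader.fieldnames if c.endswith("_mean_score")]

zscores = {
    row["name"]: zscore_row([float(row[c]) for c in score_cols])
    for row in reader
}

for tf, z in zscores.items():
    print(tf, [round(v, 2) for v in z])
```

In this toy example TF_A has much higher raw scores than TF_B, yet both end up with the same Z-score pattern across tissues, which is the within-TF comparison the normalization is meant to bring out. The resulting TF x condition values can then be used to color a heatmap or dot plot.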
But since the statistics behind TOBIAS are different, I cannot promise that it will turn out the same way as the original plot.