Question Regarding the Basis of Motif Logo Creation

Open Myrtle-bio opened this issue 5 months ago • 1 comments

Hi, thanks for the remarkable job!

I am currently exploring the creation of motif logos in the context of sequence motif analysis and have encountered some uncertainties.

My initial understanding was that motif logos are typically generated based on the information content of a sequence probability matrix (SPM). This seems to be the case for a subset of motifs I have worked with, where the motif logos align well with the SPM information content. However, I've noticed that for several other motifs, their corresponding logos significantly differ from what I would expect based on their SPM information content.
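
For reference, the information-content calculation described above can be sketched as follows. This is a minimal sketch of the standard approach, not the code that TOBIAS or bindetect uses: it assumes the SPM is a 4 x L array of per-position base probabilities for DNA (rows A, C, G, T), a uniform background, and no small-sample correction. The function names and the example matrix are hypothetical and purely illustrative.

```python
import numpy as np

def information_content(spm, eps=1e-9):
    """Per-position information content in bits for a 4 x L probability matrix.

    Assumes a uniform background, so the maximum is log2(4) = 2 bits.
    """
    spm = np.asarray(spm, dtype=float)
    # Normalise each column so it sums to 1 (tolerates count matrices too).
    spm = spm / spm.sum(axis=0, keepdims=True)
    # Shannon entropy per column; eps avoids log2(0) for zero probabilities.
    entropy = -np.sum(spm * np.log2(spm + eps), axis=0)
    return np.log2(spm.shape[0]) - entropy

def letter_heights(spm):
    """Letter heights for a bits-scaled logo: probability times column IC."""
    spm = np.asarray(spm, dtype=float)
    spm = spm / spm.sum(axis=0, keepdims=True)
    return spm * information_content(spm)

# Hypothetical 3-position matrix: strongly conserved, uniform, moderately conserved.
example = np.array([
    [0.97, 0.25, 0.10],  # A
    [0.01, 0.25, 0.10],  # C
    [0.01, 0.25, 0.70],  # G
    [0.01, 0.25, 0.10],  # T
])

ic = information_content(example)
heights = letter_heights(example)
print(np.round(ic, 3))
```

A logo drawn from `heights` has stacks whose total height equals the information content of each position, so a uniform column collapses to almost nothing, whereas a logo drawn directly from the probabilities gives every stack the same height of 1.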

To better illustrate my query, I would like to present a specific example that highlights these differences:

Here is the motif logo generated by the bindetect process: [image]

Based on this image, the motif appears somewhat unreliable or ambiguous. However, when I use the SPM of this same motif and plot a logo based on its information content, I obtain a noticeably different representation: [image]

I searched the GitHub repository and your papers using keywords like "logo" but did not find anything that explains this discrepancy.

Could you please help me understand whether this difference in motif representations points to an underlying issue, or to an aspect of the analysis that I am overlooking? If I have made an error or misunderstood something, could you point it out or direct me to resources that clarify it?

Thank you very much for your time and support.

Myrtle-bio, Jan 30 '24 15:01