BINDetect not giving out error when the motif file is "deformed"

Open johannesnicolaus opened this issue 6 months ago • 2 comments

Might be a continuation of issue #78. When I run BINDetect with a "pfm" motif file created by gimmemotifs, the motifs are not read correctly, but no error is raised.

The pfm file looks something like:

>GM.5.0.Sox.0001
0.7213  0.0793  0.1103  0.0891
0.9259  0.0072  0.0062  0.0607
0.0048  0.9203  0.0077  0.0672
0.9859  0.0030  0.0030  0.0081
0.9778  0.0043  0.0128  0.0051
0.1484  0.0050  0.0168  0.8299
>GM.5.0.Homeodomain.0001
0.8870  0.0000  0.0178  0.0951
0.1156  0.2033  0.6629  0.0181
0.0017  0.7452  0.0809  0.1722
0.0011  0.0003  0.0003  0.9983
0.0026  0.0141  0.9721  0.0111
0.0000  0.0189  0.0054  0.9758
0.0006  0.9983  0.0006  0.0006
0.9170  0.0140  0.0046  0.0644
0.2228  0.2421  0.3300  0.2051
0.3621  0.1054  0.2208  0.3116
0.5727  0.0104  0.1741  0.2428

For example, I have 1796 motifs in the pfm file, but BINDetect reports reading 5531 motifs and gives the following warning:

2023-12-16 10:23:46 (1569572) [INFO]	Reading motifs from file
2023-12-16 10:23:47 (1569572) [INFO]	- Read 5531 motifs
2023-12-16 10:23:47 (1569572) [WARNING]	The motif output names (as given by --naming) are not unique.
2023-12-16 10:23:47 (1569572) [WARNING]	The following names occur more than once: ['_']
2023-12-16 10:23:47 (1569572) [WARNING]	These motifs will be renamed with '_1', '_2' etc. To prevent this renaming, please make the names of the input --motifs unique

The resulting output directories are named like this:

__1     __1413  __1829  __2243  __2659  __3073  __3489  __541  __957

or

GM.5.0.Sox.0001_GM.5.0.Sox.0001
GM.5.0.Sox.0002_GM.5.0.Sox.0002
GM.5.0.Sox.0003_GM.5.0.Sox.0003
GM.5.0.Sox.0004_GM.5.0.Sox.0004
GM.5.0.Sox.0005_GM.5.0.Sox.0005
GM.5.0.Sox.0006_GM.5.0.Sox.0006
GM.5.0.Sox.0007_GM.5.0.Sox.0007
GM.5.0.Sox.0008_GM.5.0.Sox.0008
GM.5.0.Sox.0009_GM.5.0.Sox.0009

This may not be a standard pfm file, but it would be nice if BINDetect raised an error when the motif file is not in a format it recognizes.
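To illustrate the kind of check I mean, here is a minimal sketch in Python. It is not TOBIAS code; `read_header_row_pfm` is a hypothetical helper that assumes the layout shown above (a `>name` header followed by one row of four numbers per position) and fails loudly on anything else. Comparing its motif count with the number BINDetect reports would expose a mismatch like 1796 versus 5531.

```python
from typing import Dict, List, Optional


def read_header_row_pfm(text: str) -> Dict[str, List[List[float]]]:
    """Parse '>name' headers followed by rows of four numbers.

    Hypothetical helper: it assumes the layout pasted in this issue and
    raises ValueError instead of guessing when a line does not fit.
    """
    motifs: Dict[str, List[List[float]]] = {}
    current: Optional[str] = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current = line[1:].strip()
            if not current:
                raise ValueError(f"line {lineno}: empty motif name")
            if current in motifs:
                raise ValueError(f"line {lineno}: duplicate motif name {current!r}")
            motifs[current] = []
            continue
        if current is None:
            raise ValueError(f"line {lineno}: matrix row before any '>' header")
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected 4 columns, found {len(fields)}")
        try:
            motifs[current].append([float(f) for f in fields])
        except ValueError:
            raise ValueError(f"line {lineno}: non-numeric value in matrix row") from None
    empty = [name for name, rows in motifs.items() if not rows]
    if empty:
        raise ValueError(f"motifs without any matrix rows: {empty}")
    return motifs


# Small example taken from the first rows of the file shown above.
SAMPLE = """>GM.5.0.Sox.0001
0.7213  0.0793  0.1103  0.0891
0.9259  0.0072  0.0062  0.0607
>GM.5.0.Homeodomain.0001
0.8870  0.0000  0.0178  0.0951
"""

parsed = read_header_row_pfm(SAMPLE)
print(len(parsed), "motifs:", list(parsed))
```

A check like this only confirms that the file is internally consistent; it says nothing about which formats BINDetect itself accepts.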

My current workaround is to convert the file with chen2meme, since it appears to be in the chen motif format. After the conversion, BINDetect seems to work fine.
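For anyone without chen2meme at hand, the same idea can be sketched in Python. This is not a replacement for chen2meme and I have not compared its output with that tool; `iter_motifs` and `to_minimal_meme` are hypothetical names. The sketch assumes the four columns are in A, C, G, T order, uses a uniform background, and writes a placeholder `nsites= 20` because the pfm file carries no site counts.

```python
from typing import Iterator, List, Optional, Tuple


def iter_motifs(text: str) -> Iterator[Tuple[str, List[List[float]]]]:
    """Yield (name, rows) pairs from '>name' headers followed by 4-column rows."""
    name: Optional[str] = None
    rows: List[List[float]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, rows
            name, rows = line[1:].strip(), []
        else:
            if name is None:
                raise ValueError("matrix row found before any '>' header")
            values = [float(v) for v in line.split()]
            if len(values) != 4:
                raise ValueError(f"expected 4 columns, found {len(values)}")
            rows.append(values)
    if name is not None:
        yield name, rows


def to_minimal_meme(text: str) -> str:
    """Render the motifs as minimal MEME text (assumes A C G T column order)."""
    out = [
        "MEME version 4", "",
        "ALPHABET= ACGT", "",
        "strands: + -", "",
        "Background letter frequencies",
        "A 0.25 C 0.25 G 0.25 T 0.25",  # assumption: uniform background
        "",
    ]
    for name, rows in iter_motifs(text):
        out.append(f"MOTIF {name}")
        # nsites is a placeholder value; the input has no site counts.
        out.append(
            f"letter-probability matrix: alength= 4 w= {len(rows)} nsites= 20 E= 0"
        )
        for row in rows:
            total = sum(row)
            if total <= 0:
                raise ValueError(f"motif {name!r} has a row that sums to zero")
            # Renormalise so rounding in the input (e.g. a row sum of 1.0001)
            # does not leave a row that fails to sum to 1.
            out.append(" ".join(f"{v / total:.6f}" for v in row))
        out.append("")
    return "\n".join(out)


example = ">GM.5.0.Sox.0001\n0.7213  0.0793  0.1103  0.0891\n0.25 0.25 0.25 0.25\n"
print(to_minimal_meme(example))
```

The output can be written to a file and passed to `--motifs` in place of the original pfm file.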

johannesnicolaus, Dec 16 '23 02:12