create_cisTarget_databases

Can the input motifs be in the PWM format?

Open kerenzhou062 opened this issue 2 years ago • 3 comments

Hi, can the input motifs be in PWM format, like the output of the HOMER program (motif1.motif)? Please see the example below:

>TGCATG	1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647)	5.179177	-34261.033795	0	T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001	0.044	0.001	0.954
0.001	0.001	0.997	0.001
0.001	0.997	0.001	0.001
0.997	0.001	0.001	0.001
0.001	0.001	0.001	0.997
0.001	0.001	0.997	0.001

Best,

Keren

kerenzhou062 Mar 20 '22 04:03

The motifs need to be in Cluster-Buster format.

The following function will create them (put one HOMER motif per file; the input must be tab-separated).

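homer_to_clusterbuster () {
    local homer_motif_file="${1}";
    # Header line: keep only the motif name (first tab-separated field).
    # Matrix rows: scale the four probabilities (0-1) to percentages (0-100).
    awk -F '\t' -v 'OFS=\t' '{ if ($1 ~ /^>/) { print $1 } else if (NF == 4) { print $1 * 100, $2 * 100, $3 * 100, $4 * 100; } }' "${homer_motif_file}";
}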
$ cat /tmp/motif.homer 
>TGCATG	1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647)	5.179177	-34261.033795	0	T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001	0.044	0.001	0.954
0.001	0.001	0.997	0.001
0.001	0.997	0.001	0.001
0.997	0.001	0.001	0.001
0.001	0.001	0.001	0.997
0.001	0.001	0.997	0.001

$ homer_to_clusterbuster /tmp/motif.homer 
>TGCATG
0.1	4.4	0.1	95.4
0.1	0.1	99.7	0.1
0.1	99.7	0.1	0.1
99.7	0.1	0.1	0.1
0.1	0.1	0.1	99.7
0.1	0.1	99.7	0.1
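
If several HOMER motifs sit in a single file, the same conversion can be combined with the split into one file per motif. The following is only a minimal sketch, assuming tab-separated HOMER input. The function name `split_homer_to_clusterbuster`, the `motifN` output names, and the `.cb` extension are illustrative assumptions and are not confirmed anywhere in this thread.

```shell
# Hypothetical helper (not part of create_cisTarget_databases):
# split a multi-motif HOMER file into one Cluster-Buster style file per motif.
split_homer_to_clusterbuster () {
    local homer_motifs_file="${1}"
    local output_dir="${2}"

    mkdir -p "${output_dir}"

    awk -F '\t' -v 'OFS=\t' -v output_dir="${output_dir}" '
        /^>/ {
            # A new header starts a new motif: close the previous output file.
            if (out != "") { close(out) }
            motif_count += 1
            # Assumed naming scheme: motif1.cb, motif2.cb, ...
            out = output_dir "/motif" motif_count ".cb"
            # Keep only the motif name (first tab-separated field).
            print $1 > out
            next
        }
        NF == 4 && out != "" {
            # Scale the four probabilities (0-1) to percentages (0-100).
            print $1 * 100, $2 * 100, $3 * 100, $4 * 100 > out
        }
    ' "${homer_motifs_file}"
}
```

Called as `split_homer_to_clusterbuster all_motifs.homer motifs_cb/`, this would write `motifs_cb/motif1.cb`, `motifs_cb/motif2.cb`, and so on, in the order the motifs appear in the input. The files are numbered rather than named after the consensus sequence so that unusual characters in a header cannot end up in a file name.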

ghuls Mar 21 '22 12:03

Thank you for your explanation!

Best,

Keren

kerenzhou062 Mar 21 '22 16:03

Our SCENIC+ public motif collection is now available: https://resources.aertslab.org/cistarget/motif_collections/

ghuls Apr 18 '23 15:04