create_cisTarget_databases

Can I apply cisTarget to estimate the RBP-regulon?

Open kerenzhou062 opened this issue 3 years ago • 6 comments

Hi,

I have binding sites of RNA-binding proteins (RBPs) derived from CLIP-seq data, and I would like to run motif enrichment and RBP-regulon prediction on them to use as input for SCENIC. How can I do this? BTW, I also ran motif enrichment with HOMER, which reports motifs in PWM format like the one below:

>TGCATG	1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647)	5.179177	-34261.033795	0	T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001	0.044	0.001	0.954
0.001	0.001	0.997	0.001
0.001	0.997	0.001	0.001
0.997	0.001	0.001	0.001
0.001	0.001	0.001	0.997
0.001	0.001	0.997	0.001

Best, Keren

kerenzhou062 avatar Mar 20 '22 04:03 kerenzhou062

SCENIC is designed for doing motif enrichment of proteins that bind to DNA (transcription factors), so using it for RBPs probably does not make much sense.

ghuls avatar Mar 21 '22 12:03 ghuls

Thank you for your information!

Though SCENIC is designed for TFs, in my opinion TFs and RBPs are quite similar in how they regulate their targets: RBPs also act by recognizing their targets via motifs. Could you please explain in more detail why you think SCENIC is not suitable for RBPs?

Best,

Keren

kerenzhou062 avatar Mar 21 '22 16:03 kerenzhou062

You will need to replace at least the first step of pySCENIC (gene regulatory network inference) with something that makes sense for RBPs.

You are probably aware of it, but if not, CISBP-RNA (http://cisbp-rna.ccbr.utoronto.ca/) has motifs for a number of RBPs. You will need to rescale those motifs to count matrices of 100.
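As an illustration of the rescaling step, here is a minimal sketch in plain Python that turns a probability matrix (such as the HOMER motif posted above) into integer counts where every position sums to 100. The function names `parse_homer_motif` and `rescale_row` are made up for this example, the HOMER header is shortened to its first field, and the exact output file format expected by the cisTarget tooling is not covered here.

```python
def parse_homer_motif(text):
    """Return (motif_id, rows) for a single HOMER motif record.

    The motif ID is taken as the first tab-separated header field
    (without the leading '>'); each row holds A, C, G, T probabilities.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    motif_id = lines[0][1:].split("\t")[0]
    rows = [[float(x) for x in ln.split()] for ln in lines[1:]]
    return motif_id, rows


def rescale_row(probs, total=100):
    """Scale one row of probabilities to integer counts summing to `total`."""
    row_sum = sum(probs)
    scaled = [p / row_sum * total for p in probs]
    counts = [int(s) for s in scaled]  # round down first
    # Give the units lost by rounding down to the largest fractional parts,
    # so that the row sums to exactly `total`.
    leftover = total - sum(counts)
    by_fraction = sorted(range(len(scaled)),
                         key=lambda i: scaled[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


# Motif from the original post (header shortened to the motif ID).
homer_record = (
    ">TGCATG\n"
    "0.001\t0.044\t0.001\t0.954\n"
    "0.001\t0.001\t0.997\t0.001\n"
    "0.001\t0.997\t0.001\t0.001\n"
    "0.997\t0.001\t0.001\t0.001\n"
    "0.001\t0.001\t0.001\t0.997\n"
    "0.001\t0.001\t0.997\t0.001\n"
)

motif_id, prob_rows = parse_homer_motif(homer_record)
count_rows = [rescale_row(row) for row in prob_rows]

print(motif_id)
for row in count_rows:
    print("\t".join(str(c) for c in row))
```

Rounding each value independently can leave a position summing to 99 or 101, which is why the sketch distributes the remainder explicitly.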

ghuls avatar Mar 22 '22 09:03 ghuls

Thank you for your suggestions!

Do you mean that I need to filter the GRNs with a more reasonable cutoff for the Pearson product-moment correlation (default ρ ≥ +0.03 for positive and ρ ≤ −0.03 for negative)? Since RBPs usually bind their targets directly and influence their stability, might it be acceptable to run pySCENIC with a more stringent cutoff, such as 0.1 or higher?
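To make the idea concrete, here is a rough sketch in plain Python of what such a cutoff does: compute the Pearson correlation between each regulator and target, drop pairs below the threshold, and label the rest as positive or negative. This is not pySCENIC code; the gene names, expression values, and the helper names `pearson` and `filter_by_correlation` are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def filter_by_correlation(pairs, expression, threshold=0.1):
    """Keep (regulator, target) pairs with |rho| >= threshold.

    Returns a list of (regulator, target, rho, sign) tuples, where sign is
    "positive" or "negative".
    """
    kept = []
    for regulator, target in pairs:
        rho = pearson(expression[regulator], expression[target])
        if abs(rho) >= threshold:
            sign = "positive" if rho > 0 else "negative"
            kept.append((regulator, target, rho, sign))
    return kept


# Toy expression profiles across four cells (hypothetical values).
expression = {
    "RBP1": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.0],   # rises with RBP1
    "GENE_B": [8.0, 6.0, 4.0, 2.0],   # falls as RBP1 rises
    "GENE_C": [2.0, 3.0, 3.0, 2.0],   # uncorrelated with RBP1
}
pairs = [("RBP1", "GENE_A"), ("RBP1", "GENE_B"), ("RBP1", "GENE_C")]

kept_pairs = filter_by_correlation(pairs, expression, threshold=0.1)
for regulator, target, rho, sign in kept_pairs:
    print(regulator, target, round(rho, 3), sign)
```

With real data, the threshold would be varied and the resulting regulon sizes compared before settling on a value.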

Best,

Keren

kerenzhou062 avatar Mar 22 '22 16:03 kerenzhou062

If you know which RBPs bind to which targets, you won't need the GRN-related code of pySCENIC: there, the TF-to-target-gene relations are inferred rather than taken from known TF-to-target-gene relations.

Playing with the cutoff will likely be necessary.

ghuls avatar Mar 24 '22 08:03 ghuls

Yeah, we can actually get the RBP-target relationships, but it's really hard to rank them, and a ranking is required for Module Generation (Step 6). Also, like TFs, RBPs can repress their targets, which is hard to infer from binding information alone. So, in my opinion, constructing the GRNs is still necessary.

To improve prediction accuracy, might filtering the GRNs by the known RBP-target relationships before the Module Generation step be a good strategy?
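A minimal sketch of that filtering idea in plain Python: keep only the inferred edges whose (RBP, target) pair is also supported by binding evidence, while preserving the importance scores so the remaining edges can still be ranked. The edge tuples, gene names, scores, and the function name `filter_edges_by_binding` are all hypothetical and do not reflect pySCENIC's internal data structures.

```python
def filter_edges_by_binding(edges, bound_pairs):
    """Keep inferred edges that are supported by known binding.

    edges:       iterable of (regulator, target, importance) tuples
    bound_pairs: set of (regulator, target) pairs, e.g. from CLIP-seq
    Returns the supported edges sorted by importance, highest first.
    """
    supported = [e for e in edges if (e[0], e[1]) in bound_pairs]
    return sorted(supported, key=lambda e: e[2], reverse=True)


# Hypothetical inferred edges with importance scores.
inferred_edges = [
    ("RBP1", "GENE_A", 12.5),
    ("RBP1", "GENE_B", 30.1),
    ("RBP1", "GENE_C", 8.0),
    ("RBP2", "GENE_A", 4.2),
]

# Hypothetical binding evidence (RBP, target) derived from CLIP-seq peaks.
clip_supported = {
    ("RBP1", "GENE_A"),
    ("RBP1", "GENE_B"),
    ("RBP2", "GENE_D"),
}

filtered_edges = filter_edges_by_binding(inferred_edges, clip_supported)
for regulator, target, importance in filtered_edges:
    print(regulator, target, importance)
```

Edges that have binding support but were never inferred (here RBP2 to GENE_D) are simply absent from the result, so this acts as an intersection rather than a union of the two evidence sources.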

Best,

Keren

kerenzhou062 avatar Mar 24 '22 16:03 kerenzhou062