How to create zebrafish Motif2TF / motif annotation database for pyscenic ctx

Open stanaka6 opened this issue 3 years ago • 0 comments

Hi team,

Thank you for providing a great tool. Could you please explain in more detail how to generate a zebrafish motif annotation database (Motif2TF format)? A sample script for that purpose would also be very helpful.

I want to run pyscenic ctx on my zebrafish single-cell data. I have created a gene vs. motif ranking database (.feather) by following the create_cisTarget_databases repository.

According to this comment, to create a Motif2TF database I should follow the "PWM-based whole-genome rankings across species" section in the Materials and Methods of Janky et al. (2014).

I have a FASTA file containing the zebrafish sequences 10 kb upstream/downstream of the 5' UTR, and the JASPAR motif database in Cluster-Buster format, both of which I used to generate my cisTarget ranking database. If I understand correctly, the potential workflow is:

1. Get orthologous regions for my zebrafish FASTA file from 7-10 other vertebrates.
2. Run Cluster-Buster using the JASPAR motifs and a FASTA file (I am not sure which species' FASTA file I should use).
3. Evaluate the q-value.

Is that right?
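To make the target format concrete, here is a minimal sketch of a Motif2TF table that contains only direct annotations, that is, motifs whose TF is already known. This is not the cross-species procedure from Janky et al.; it skips motif-similarity and orthology inference entirely. The column layout is an assumption modeled on the published motif annotation `.tbl` files, and the motif IDs, gene pairings, source name, and helper function names are all hypothetical placeholders. The motif IDs would need to match the motif names used in the ranking database.

```python
import csv
import io

# ASSUMPTION: column layout modeled on the published motif annotation .tbl files.
COLUMNS = [
    "#motif_id", "motif_name", "motif_description", "source_name",
    "source_version", "gene_name", "motif_similarity_qvalue",
    "similar_motif_id", "similar_motif_description",
    "orthologous_identity", "orthologous_gene_name",
    "orthologous_species", "description",
]


def direct_annotation_rows(motif_to_tfs, source_name, source_version):
    """Yield one row per (motif, TF) pair, marked as a direct annotation."""
    for motif_id, (motif_name, tfs) in sorted(motif_to_tfs.items()):
        for tf in tfs:
            yield {
                "#motif_id": motif_id,
                "motif_name": motif_name,
                "motif_description": motif_name,
                "source_name": source_name,
                "source_version": source_version,
                "gene_name": tf,
                "motif_similarity_qvalue": "0.0",  # no similarity inference
                "similar_motif_id": "None",
                "similar_motif_description": "None",
                "orthologous_identity": "1.0",  # no orthology inference
                "orthologous_gene_name": "None",
                "orthologous_species": "None",
                "description": "gene is directly annotated",
            }


def write_motif2tf(rows, handle):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical placeholder data: motif ID -> (motif name, list of TF gene symbols).
motif_to_tfs = {
    "motif_A": ("motif_A", ["sox2"]),
    "motif_B": ("motif_B", ["pou5f3", "nanog"]),
}

buf = io.StringIO()
write_motif2tf(direct_annotation_rows(motif_to_tfs, "custom", "1.0"), buf)
lines = buf.getvalue().splitlines()
print(lines[0])        # header line
print(len(lines) - 1)  # number of (motif, TF) rows
```

A table like this would only cover motifs with a known zebrafish TF. Whether it is an acceptable substitute for the full similarity and orthology pipeline is exactly the open question here.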

I may be misunderstanding something, but I would be really grateful if you could give me more details on creating a motif annotation database for species other than human, mouse, and Drosophila. Any suggestions would be appreciated.

Thank you!

stanaka6, Aug 15 '21 21:08