pySCENIC

What's the meaning of the index in *genes_vs_motifs.rankings.feather?

Open shangguandong1996 opened this issue 2 years ago • 0 comments

Hi, I ran pyscenic ctx and it worked.

pyscenic ctx adj.csv ../related_info/cisTarget_Ath/Athaliana.genes_vs_motifs.rankings.feather --annotations_fname ../related_info/cisTarget_Ath/JASPAR_plantTFDB_Ath.tbl --expression_mtx_fname result/processed_result/CIM1d_rep2.loom --mask_dropouts --num_workers 40 --output reg.csv

But I have a question about the index of genes_vs_motifs.rankings.feather

Here is what my Athaliana.genes_vs_motifs.rankings.feather looks like:

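import pyarrow.feather as pf  # assumption: `pf` is pyarrow.feather

genes_vs_motifs_ctx_db = './Athaliana.genes_vs_motifs.rankings.feather'
# Read the rankings database into a pandas DataFrame
genes_vs_motifs_ctx_df = pf.read_feather(genes_vs_motifs_ctx_db)
genes_vs_motifs_ctx_df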

image

I also tried reading the feather file in R:

a <- read_feather("~/newReference/annoation/Athaliana/motif/cisTarget_Ath/Athaliana.genes_vs_motifs.rankings.feather")
> a
# A tibble: 661 × 37,331
   AT1G01010 AT1G01020 AT1G01030 AT1G01040 AT1G01050 AT1G01060 AT1G01070 AT1G01080
       <int>     <int>     <int>     <int>     <int>     <int>     <int>     <int>
 1      1755      2209     13393     20635     36963     33947      1978      1691
 2       455       564     26683     25197     11146     35123     35260     34320
 3     24111      2042      3707      2610     15719      9523      8855     17028
 4     36829      7147     16038     12354     31349     15739     16870     30670
 5     22355     35182     34154     14973     15130     26412     23529     32193
 6       642       766     27995     20114     32063     33263      8411      7344
 7      7210      9964      5864     33130     15641     12865     11679     27762
 8     26136     25670     13449     23649     21001     26306     24560     29765
 9     25319       912       590     19959      9574      8489      8000     17006
10      9084      6357      2270     33386      8095      6164      5606     21834
# … with 651 more rows, and 37,323 more variables: AT1G01090 <int>, AT1G01100 <int>,
#   AT1G01110 <int>, AT1G01120 <int>, AT1G01130 <int>, AT1G01140 <int>, AT1G01150 <int>,
#   AT1G01160 <int>, AT1G01170 <int>, AT1G01180 <int>, AT1G01190 <int>, AT1G01200 <int>,
#   AT1G01210 <int>, AT1G01220 <int>, AT1G01225 <int>, AT1G01230 <int>, AT1G01240 <int>,
#   AT1G01250 <int>, AT1G01260 <int>, AT1G01270 <int>, AT1G01280 <int>, AT1G01290 <int>,
#   AT1G01300 <int>, AT1G01305 <int>, AT1G01310 <int>, AT1G01320 <int>, AT1G01335 <int>,
#   AT1G01340 <int>, AT1G01350 <int>, AT1G01355 <int>, AT1G01360 <int>, …

However, there seems to be no motif_id in the feather file, only a 0, 1, 2, 3, ... index. How does pyscenic ctx know which motif each row corresponds to?
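One way to check this is to look for columns that do not hold integer ranks, since the R output above only shows the first 8 of 37,331 columns. The snippet below is a minimal sketch of that check on a toy table. It assumes the motif IDs, if present, are stored as an ordinary text column rather than as the index; the column name "motifs", the helper name find_non_numeric_columns, and the motif IDs are made up for illustration and are not taken from the real file or from pySCENIC.

```python
import pandas as pd


def find_non_numeric_columns(df: pd.DataFrame) -> list:
    """Return the names of columns whose dtype is not numeric.

    In a rankings table the gene columns are integers, so any remaining
    column is a candidate for holding the motif identifiers.
    """
    return df.select_dtypes(exclude="number").columns.tolist()


# Toy stand-in for the rankings table: gene columns hold integer ranks,
# and one extra column (hypothetical name "motifs") holds made-up motif IDs.
toy = pd.DataFrame({
    "AT1G01010": [1755, 455, 24111],
    "AT1G01020": [2209, 564, 2042],
    "motifs": ["motif_a", "motif_b", "motif_c"],
})

id_columns = find_non_numeric_columns(toy)
print(id_columns)
```

On the real data, the same helper could be called as find_non_numeric_columns(genes_vs_motifs_ctx_df). An empty result would mean the file really contains only rank columns.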

Please forgive me if I misunderstand something :)

Best wishes, Guandong Shang

shangguandong1996, Oct 10 '22 05:10