Error in picking out modules for iRegulon: print( f'{tf} regulon: {len(regulons[tf])} genes' ) KeyError: 'TF name'

Open · stanaka6 opened this issue 2 years ago · 1 comment

Hi,

Thank you very much for the wonderful tool. I would like to create module lists for iRegulon analysis, but I'm getting an error during the step that picks out the modules. Could you please help me solve the issue? I'm not sure why, but the output loom file I get from the Singularity image aertslab-pyscenic-0.12.1.sif doesn't contain the regulon information or au_mtx, so I load the regulons with load_signatures and convert them to a dictionary.

# path to loom output from pyscenic aucell
f_final_loom = 'mySCENIC.loom'

# Extract expression matrix
lf = lp.connect(f_final_loom, mode='r', validate=False)
exprMat = pd.DataFrame(lf[:,:], index=lf.ra.Gene, columns=lf.ca.CellID).T
lf.close()

# Load regulons
regulons = load_signatures('regulons_filtered.csv')

# Convert list of regulons to dictionary
def Convert(a):
    it = iter(a)
    res_dct = dict(zip(it, it))
    return res_dct
regulons = Convert(regulons)

# Load adjacencies 
adjacencies = pd.read_csv("expr_mat.adjacencies_filtered.tsv", index_col=False, sep='\t')

# Create modules
from pyscenic.utils import modules_from_adjacencies
modules = list(modules_from_adjacencies(adjacencies, exprMat))

tf = 'Sox5'
tf_mods = [ x for x in modules if x.transcription_factor==tf ]

# pick out modules for the selected TF (Sox5 here):
for i,mod in enumerate( tf_mods ):
    print( f'{tf} module {str(i)}: {len(mod.genes)} genes' )
print( f'{tf} regulon: {len(regulons[tf])} genes' )

# write these modules, and the regulon to files:
for i,mod in enumerate( tf_mods ):
    with open( tf+'_module_'+str(i)+'.txt', 'w') as f:
        for item in mod.genes:
            f.write("%s\n" % item)
            
with open( tf+'_regulon.txt', 'w') as f:
    for item in regulons[tf]:
        f.write("%s\n" % item)

Session information updated at 2022-12-29 1:25
Create regulons from a dataframe of enriched features.
Additional columns saved: []
Sox5 module 0: 2204 genes
Sox5 module 1: 1791 genes
Sox5 module 2: 51 genes
Sox5 module 3: 641 genes
Sox5 module 4: 1045 genes
Sox5 module 5: 2236 genes
Traceback (most recent call last):
  File "/MyScript_pyscenic_for_iRegulon.py", line 49, in <module>
    print( f'{tf} regulon: {len(regulons[tf])} genes' )
KeyError: 'Sox5'

When I changed regulons[tf] to regulons[tf+"(+)"], I received a similar error: KeyError: 'Sox5(+)'. I also tried other genes and got the same error. I assume the way I load the regulon information is wrong, but I couldn't figure out a solution.

  • pySCENIC version: 0.12.1
  • Installation method: conda
  • Run environment: CLI
  • OS: Linux
  • Python: 3.10.8
  • Package versions: pandas 1.5.2

Any suggestions & comments would be very much appreciated.

Thank you!

stanaka6 · Dec 30 '22 02:12
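
A possible cause of the KeyError above, offered as a guess rather than a confirmed diagnosis: `dict(zip(it, it))` pairs consecutive elements of the list, so if load_signatures returns a flat list of regulon objects, the resulting dictionary maps every other regulon object to the one after it and has no string keys at all. The sketch below reproduces that behaviour with a hypothetical `FakeRegulon` stand-in (not a pySCENIC class), then builds a dictionary keyed by TF name instead. The attribute names `name`, `transcription_factor` and `genes` are assumed to match the real regulon objects.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a pySCENIC regulon, used only so this sketch runs
# without pySCENIC installed. The attribute names are an assumption.
@dataclass(frozen=True)
class FakeRegulon:
    name: str
    transcription_factor: str
    genes: tuple

regulon_list = [
    FakeRegulon("Sox5(+)", "Sox5", ("GeneA", "GeneB", "GeneC")),
    FakeRegulon("Esr1(+)", "Esr1", ("GeneD", "GeneE")),
    FakeRegulon("Atf3(+)", "Atf3", ("GeneF",)),
    FakeRegulon("Jun(+)", "Jun", ("GeneG", "GeneH")),
]

def convert(a):
    # Same logic as Convert() in the post: pairs element 0 with 1, 2 with 3, ...
    it = iter(a)
    return dict(zip(it, it))

paired = convert(regulon_list)

# The keys are regulon objects, so no string lookup can succeed.
print([type(k).__name__ for k in paired])
print("Sox5" in paired, "Sox5(+)" in paired)

# Alternative: key the dictionary by TF name explicitly.
by_tf = {r.transcription_factor: r for r in regulon_list}
print(f"Sox5 regulon: {len(by_tf['Sox5'].genes)} genes")
```

With the real list from load_signatures, the same dictionary comprehension would replace the `Convert` call, provided the objects do expose `transcription_factor`.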

Has anyone solved this? I got the same error too. I also changed regulons[tf+_"(+)"] to regulons[tf+"(+)"], but still got the same error.

Moonju411 · Jul 20 '23 06:07
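
Since both `'Sox5'` and `'Sox5(+)'` fail, it may help to check which keys the dictionary actually holds before guessing at the suffix. Below is a minimal sketch of a tolerant lookup. The helper name `find_regulon_key` is hypothetical, and the three name variants it tries (`TF`, `TF(+)`, `TF_(+)`) are assumptions about how regulon names might be spelled, not a documented pySCENIC convention.

```python
def find_regulon_key(regulons, tf):
    """Return the key in `regulons` that matches `tf`, trying a few name variants.

    `regulons` is assumed to be a dict keyed by regulon-name strings.
    """
    for candidate in (tf, f"{tf}(+)", f"{tf}_(+)"):
        if candidate in regulons:
            return candidate
    # Show a few real keys so the mismatch is visible in the error message.
    sample = list(regulons)[:5]
    raise KeyError(f"No regulon found for {tf!r}; first keys are: {sample}")


# Toy dictionary standing in for the real regulons (gene names are made up).
toy_regulons = {"Sox5(+)": ["GeneA", "GeneB"], "Esr1(+)": ["GeneC"]}

key = find_regulon_key(toy_regulons, "Sox5")
print(key, len(toy_regulons[key]))
```

If the error message lists objects rather than strings as the first keys, the dictionary was not built with names as keys, and the lookup cannot work with any suffix.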