Error: infer_grn()

Open wxpbioinfo opened this issue 1 year ago • 5 comments

Hi, when I ran this function I encountered the following error. I checked my motif matrix and gene names, but I could not find any name beginning with a number. I am very confused; can you help me? [image] This is my code:

scARC <- readRDS("./Data/scARC_celltype.rds")
DefaultAssay(scARC) <- "peaks"
seqlevelsStyle(BSgenome.Mmulatta.UCSC.rheMac10) <- 'Ensembl'
scARC <- initiate_grn(scARC, rna_assay = 'RNA', peak_assay = 'peaks')
pwm_set <- getMatrixSet(x = JASPAR2022, opts = list(species = 9606, all_versions = FALSE))

plan("multisession", workers = 20)
# Find TF binding sites
scARC <- find_motifs(scARC, pfm = pwm_set, genome = BSgenome.Mmulatta.UCSC.rheMac10)
# Infer GRN
genes <- scARC@assays[["RNA"]]@var.features
filtered_text <- grep("1_.", x, value=TRUE)
genes <- genes[!grepl("^ID3.", genes)]
scARC <- infer_grn(scARC, genes = genes, peak_to_gene_method = 'Signac', method = 'glm')
plan("sequential")
coef(scARC)

wxpbioinfo avatar Jun 24 '23 08:06 wxpbioinfo

Hello @joschif, thank you for the detailed tutorials! I have an issue similar to the one reported above. I followed the tutorials, and at this step I got an error when trying to infer the GRN for highly variable genes. Removing the genes argument shown below did not help.

Package versions:

print(packageVersion("Seurat"))
[1] '5.0.3'
print(packageVersion("SeuratObject"))
[1] '5.0.1'
print(packageVersion("Pando"))
[1] '1.1.1'

library(doParallel)
registerDoParallel(4)
muo_data <- infer_grn(
  muo_data,
  peak_to_gene_method = 'GREAT',
  genes=top_variable_genes,
  verbose=2,
  tf_cor=0,
  #genes = patterning_genes$symbol
  parallel = T
)

Here is my error:

Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { : task 3 failed - "x and y should have the same number of rows"

I have tried many possible ways to solve this but I have not succeeded. Would you please help?

> muo_data
An object of class "GRNData"
Slot "grn":
A RegulatoryNetwork object based on 1136 transcription factors


No network has been inferred

Slot "data":
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA

I have my RNA and ATAC data as follows:

> coembed <- merge(x = pbmc_atac_filtered, y = rna_seurat)
> print(coembed)
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA
> coembed[['RNA']]
Assay (v5) data with 36601 features for 579 cells
Top 10 variable features:
 CXCL8, HIST1H2AC, AFF3, NRG1, PDE4D, IL1B, EREG, AL163541.1, ADGRB3, NEGR1 
Layers:
 counts, data 
> coembed[['peaks']]
ChromatinAssay data with 91492 features for 557 cells
Variable features: 0 
Genome: 
Annotation present: TRUE 
Motifs present: FALSE 
Fragment files: 0 

> muo_data <- initiate_grn(
  coembed,
  rna_assay = 'RNA',
  peak_assay = 'peaks',
  regions = phastConsElements20Mammals.UCSC.hg38 
)

I see that I have 579 cells in RNA but 557 in ATAC. I troubleshot this and posted an update in another comment below.

Thank you very much. Elham

elhaam avatar Apr 18 '24 03:04 elhaam

Hello @joschif

I am updating this issue. I kept only the cells common to both assays, so now both my RNA and ATAC data have 557 cells. The error changed as follows.

> registerDoParallel(4)
> muo_data <- infer_grn(
+   muo_data,
+   peak_to_gene_method = 'Signac', #GREAT',
+   genes=top_variable_genes,
+   verbose=2,
+   tf_cor=0,
+   #genes = patterning_genes$symbol
+   parallel = T
+ )

Loaded glmnet 4.1-8
Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { : task 3 failed - ""CRsparse_colSums" not resolved from current namespace (Matrix)"

Would you please let me know if you have any suggestions? Thank you so much in advance.

elhaam avatar Apr 18 '24 14:04 elhaam
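For readers following along, the "keep common cells" step described in this comment could be sketched roughly as below. This is a minimal base R illustration, not the poster's actual code: the toy matrices and the barcodes (c1 to c5) are hypothetical stand-ins for the RNA and ATAC assays.

```r
# Toy count matrices standing in for the two assays (hypothetical data).
rna <- matrix(1:12, nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("c1", "c2", "c3", "c4")))
atac <- matrix(1:6, nrow = 2,
               dimnames = list(c("peak1", "peak2"), c("c2", "c3", "c5")))

# Cell barcodes present in both modalities.
common_cells <- intersect(colnames(rna), colnames(atac))

# Restrict both matrices to the shared cells, in the same order.
rna_common <- rna[, common_cells, drop = FALSE]
atac_common <- atac[, common_cells, drop = FALSE]

# Both matrices now describe exactly the same cells.
stopifnot(identical(colnames(rna_common), colnames(atac_common)))
```

On a Seurat object, the same idea would presumably be applied by subsetting the object to `common_cells` before calling `initiate_grn()`; how the poster did this is not shown in the thread.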

Hi @elhaam, unfortunately it's very hard to tell what the exact problem is here. However, it seems to stem not from the Pando code itself but from the Matrix package. Maybe you can try updating it or installing a different version.

joschif avatar Apr 18 '24 15:04 joschif

Thanks @joschif! Yes, you are right that the Matrix package was the problem. Following this solution and this one worked for me, in case anyone faces this issue in the future. I also made sure I had the correct version of Bioconductor, based on this issue on Seurat.

elhaam avatar Apr 18 '24 21:04 elhaam
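Since the error message names the Matrix package, a first diagnostic step might be to check which versions are installed. The snippet below is only a sketch: the list of packages to inspect is an assumption, and it does not reproduce the linked solutions, which are not quoted in this thread.

```r
# Packages that appear in the error message or in the posts above
# (the choice of which to inspect is an assumption).
pkgs <- c("Matrix", "SeuratObject", "Seurat", "Pando", "glmnet")

# Report the installed version of each package, or NA if it is not installed.
versions <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(versions)

# Assumption: if Matrix turns out to be outdated or mismatched, reinstalling
# it in a fresh R session is one thing to try, e.g.:
# install.packages("Matrix")
```

Comparing these versions against the ones listed earlier in the thread may help narrow down whether a package mismatch is involved.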

Same issue as @wxpbioinfo. I have tried with a few genes and I still get the error with genes other than ‘RSPO4’, but always with the same ‘20_’. The only difference from the tutorial is that I use NCBI-style peak names. Any suggestions for avoiding having to redo the preprocessing with the UCSC style (since it cannot be changed in the Seurat object)? Very nice package, by the way!

> grn_object <- infer_grn(grn_object, peak_to_gene_method = 'Signac', method = 'glm', verbose = T) 
Selecting candidate regulatory regions near genes 
Preparing model input 
Fitting models for 1278 target genes  
|+++                                               | 4 % ~01m 10s
Error in str2lang(x): <text>:1:11: unexpected input
1: RSPO4 ~ 20_
              ^

damouzo avatar Sep 04 '24 14:09 damouzo
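The parse error above can be reproduced in plain R, independent of Pando: a formula term that begins with a digit, such as `20_`, is not a syntactically valid name, so `str2lang()` rejects it. The sketch below uses hypothetical peak names; the "chr" prefix is only an illustration of a syntactically valid alternative, not a fix confirmed in this thread.

```r
# Hypothetical peak names: one with an NCBI/Ensembl-style sequence name
# (starts with a digit) and one with a UCSC-style name (starts with "chr").
peaks <- c("20_100_200", "chr20_100_200")

# Check whether "RSPO4 ~ <peak>" can be parsed as an R expression.
parses <- vapply(peaks, function(p) {
  res <- try(str2lang(paste("RSPO4 ~", p)), silent = TRUE)
  !inherits(res, "try-error")
}, logical(1))

print(parses)

# Illustration only: prefixing digit-leading names yields parseable terms.
fixed <- ifelse(grepl("^[0-9]", peaks), paste0("chr", peaks), peaks)
fixed_ok <- all(vapply(fixed, function(p) {
  !inherits(try(str2lang(paste("RSPO4 ~", p)), silent = TRUE), "try-error")
}, logical(1)))
```

This suggests that the failure is tied to peak names starting with a chromosome number, which would be consistent with both reports using non-UCSC sequence naming. Whether renaming the peaks inside an existing Seurat object is practical is not answered in the thread.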