monocle-release <simpleError in checkwz(wz, M = M, trace = trace, wzepsilon = control$wzepsilon): NAs found in the working weights variable 'wz'>

Hi Cole and Xiaojie, I was running plot_pseudotime_heatmap. And I got that error message although it indeed gave me a figure. The error message is: <simpleError in checkwz(wz, M = M, trace = trace, wzepsilon = control$wzepsilon): NAs found in the working weights variable 'wz'>

It looks like some of your demonstration examples also have that.

And I also noticed that not running estimateDispersion() can generate the same error message without giving me a heatmap.

Any ideas?

I've been running Monocle2

Apr 28 '19 14:04 brianpenghe

same error to me.

Jun 16 '19 08:06 MichaelPeibo

Hi @brianpenghe I have been using integrated data from seurat, which I guess is the cause of the error. I replaced exprs(cds) with raw counts, then I got through the function.

However, I don't know if raw counts are ok to plot the heatmap.

Jun 16 '19 09:06 MichaelPeibo

@MichaelPeibo, do you have both integrated and original RNA counts within your single CDS?

Jun 19 '19 03:06 dwucsf

@dwucsf No, I don't find slot for storing log normalized data or corrected data, so I only replace it.

Jun 19 '19 04:06 MichaelPeibo

Do you use the corrected, integrated data to plot your pseudotime trajectory and then replace the slot with original RNA count data to do downstream analyses?

Jun 19 '19 04:06 dwucsf

Yes,I use the corrected, integrated data to plot your pseudotime trajectory(different normalization method produce different trajectory, monocle2's lognorm and vst, so I use Seurat's corrected data).

Jun 19 '19 04:06 MichaelPeibo

@MichaelPeibo, can I see your code for how you are constructing your CDS with integrated data, running monocle to get your pseudotime trajectory, and then replacing the integrated counts with original RNA counts within your CDS?

Jun 19 '19 19:06 dwucsf

expr_matrix=object@[email protected] # you can use scale.data or data or counts slot, use counts for original RNA counts
expr_matrix=as.matrix(expr_matrix)
### due to different order of meta info, we need subset dataset from ori, then filter
[email protected] 
gene_annotations=as.data.frame(rownames(object@[email protected]))
colnames(gene_annotations)='gene_short_name'
rownames(gene_annotations)=rownames(object@[email protected])
pd <- new("AnnotatedDataFrame", data = meta_info)
fd <- new("AnnotatedDataFrame", data = gene_annotations)
identical(colnames(expr_matrix), rownames(pd))
cds<- newCellDataSet(as.matrix(expr_matrix),
    phenoData = pd, featureData = fd)

cds <- estimateSizeFactors(cds)
cds <- estimateDispersions(cds)
cds <- setOrderingFilter(cds, gene_set) #gene_set can be used from you seurat FindMarker()
cds <- reduceDimension(cds, max_components = 2, method = 'DDRTree', norm_method = 'none',pseudo_expr=0) # this step not using monocle's normalization method
cds <- orderCells(cds)

NOTE: I am really the fresh to this pseudotime analysis and I am not sure how to refine some parameters to get a nice trajectory, please take this code critically and share me with your idea if you would like.

Jun 20 '19 00:06 MichaelPeibo

@MichaelPeibo Hi, thank you for sharing your code. I also used similar code to import the integrated data to monocle to do pseudotime analysis.And I also found that the integrated normalized data can cause the error.But how do you replace exprs(cds) with raw counts and which function do you use?I got raw counts from seurat object and tried to resign it to exprs(monocle cds),but I failed.Below is my error.

exprs(cds)[,] <- S4@assays$RNA@data[S4@[email protected],][,]
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘exprs<-’ for signature ‘"CellDataSet", "dgCMatrix"’

Could you show me your code?Thanks a lot.

Jul 07 '19 14:07 Sophia409

expr_matrix=object@[email protected] # you can use scale.data or data or counts slot, use counts for original RNA counts
expr_matrix=as.matrix(expr_matrix)
### due to different order of meta info, we need subset dataset from ori, then filter
[email protected] 
gene_annotations=as.data.frame(rownames(object@[email protected]))
colnames(gene_annotations)='gene_short_name'
rownames(gene_annotations)=rownames(object@[email protected])
pd <- new("AnnotatedDataFrame", data = meta_info)
fd <- new("AnnotatedDataFrame", data = gene_annotations)
identical(colnames(expr_matrix), rownames(pd))
cds<- newCellDataSet(as.matrix(expr_matrix),
    phenoData = pd, featureData = fd)

cds <- estimateSizeFactors(cds)
cds <- estimateDispersions(cds)
cds <- setOrderingFilter(cds, gene_set) #gene_set can be used from you seurat FindMarker()
cds <- reduceDimension(cds, max_components = 2, method = 'DDRTree', norm_method = 'none',pseudo_expr=0) # this step not using monocle's normalization method
cds <- orderCells(cds)

NOTE: I am really the fresh to this pseudotime analysis and I am not sure how to refine some parameters to get a nice trajectory, please take this code critically and share me with your idea if you would like.

@Sophia409 just the first line of code, expr_matrix=object@[email protected] for scale.data expr_matrix=object@assays$integrated@counts for raw, or you may try to extract raw counts from RNA slot

Jul 08 '19 00:07 MichaelPeibo

@MichaelPeibo Hi, maybe I didn't express it clearly.I can use raw counts as expr_matrix to make a new celldataset.But this can't solve the problem. I need to import Seurat3 integrated data to monocle to do pseudotime analysis.But the integrated normalized data can only be used to do pseudotime analysis ,not available for any comparison in different conditons.(satijalab/seurat#1674). Besides,as you said,the normalized data cause some errors in some function(such as plot_pseudotime_heatmap),because it onle accept the raw counts. So I want to replace exprs(cds) with raw counts after ordercells by using normalized data.Then I can plot genes pseudotime dynamics among different branches and states without altering these branches and states information built on the integrated data. I got raw counts from seurat object and tried to replace exprs(monocle cds),but it failed.You said you replaced exprs(cds) with raw counts,so I mistakenly assume that you replaced the matrix midway instead of importing a newcelldataset. In fact,two assay can solve this problem,but monocle2 has no this function and they don't plan to add it at present. I'm glad to hear your idea.

Jul 08 '19 14:07 Sophia409

@MichaelPeibo Hi, maybe I didn't express it clearly.I can use raw counts as expr_matrix to make a new celldataset.But this can't solve the problem. I need to import Seurat3 integrated data to monocle to do pseudotime analysis.But the integrated normalized data can only be used to do pseudotime analysis ,not available for any comparison in different conditons.(satijalab/seurat#1674). Besides,as you said,the normalized data cause some errors in some function(such as plot_pseudotime_heatmap),because it onle accept the raw counts. So I want to replace exprs(cds) with raw counts after ordercells by using normalized data.Then I can plot genes pseudotime dynamics among different branches and states without altering these branches and states information built on the integrated data. I got raw counts from seurat object and tried to replace exprs(monocle cds),but it failed.You said you replaced exprs(cds) with raw counts,so I mistakenly assume that you replaced the matrix midway instead of importing a newcelldataset. In fact,two assay can solve this problem,but monocle2 has no this function and they don't plan to add it at present. I'm glad to hear your idea.

Now I understand what you described, I remembered I tried at first time in the midway of analysis, but somehow it did not work.

Jul 10 '19 06:07 MichaelPeibo

Have you solved this problem? And I meet the same one

Aug 07 '21 14:08 ChaoXu1997

I solve this problem by decreasing the gene used for doing heatmap from 377(all of the DEG) to Top 50. Maybe the number of the genes, or some specific gene cause this problem.

Oct 21 '23 12:10 Wangzy00

I found the same issue while working with spatial datasets. However, I realized that a lot of the genes, for some reason, in the differentialGeneTest function had the num_cells_expressed equals to 0. For example:

diff_test_res <- differentialGeneTest(HSMM_filtered, fullModelFormulaStr="~sm.ns(Pseudotime)")
sort(diff_test_res$num_cells_expressed)


 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [58] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [115] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [172] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [229] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [286] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [343] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [400] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [457] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [514] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [571] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [628] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [685] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [742] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [799] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [856] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [913] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [970] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [ reached getOption("max.print") -- omitted 16943 entries ]

For some reason, the removal of genes expressed in an arbitrary number of cells were not removed in early steps. So, I just did an easy filtering by stablishing a minimum of 10 cells.

library(tidyverse)

diff_test_res <- diff_test_res %>%
  filter(num_cells_expressed > 10)

Then, proceeded with some pvalue filtering and the plot ran without errors:

diff <- new[,c("gene_short_name", "pval", "qval","use_for_ordering")]
diff <- diff[diff$pval < 0.05, ]
diff <- diff[order(diff$qval),]
genelist <- row.names(diff)

heatmap <- plot_pseudotime_heatmap(HSMM[genelist], num_clusters=4, show_rownames=T, return_heatmap=T)

I hope it helps!

Nov 15 '23 21:11 dimitrisokolowskei

I found the same issue while working with spatial datasets. However, I realized that a lot of the genes, for some reason, in the differentialGeneTest function had the num_cells_expressed equals to 0. For example:

diff_test_res <- differentialGeneTest(HSMM_filtered, fullModelFormulaStr="~sm.ns(Pseudotime)")
sort(diff_test_res$num_cells_expressed)


 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [58] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [115] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [172] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [229] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [286] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [343] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [400] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [457] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [514] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [571] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [628] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [685] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [742] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [799] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [856] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [913] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [970] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [ reached getOption("max.print") -- omitted 16943 entries ]

For some reason, the removal of genes expressed in an arbitrary number of cells were not removed in early steps. So, I just did an easy filtering by stablishing a minimum of 10 cells.

library(tidyverse)

diff_test_res <- diff_test_res %>%
  filter(num_cells_expressed > 10)

Then, proceeded with some pvalue filtering and the plot ran without errors:

diff <- new[,c("gene_short_name", "pval", "qval","use_for_ordering")]
diff <- diff[diff$pval < 0.05, ]
diff <- diff[order(diff$qval),]
genelist <- row.names(diff)

heatmap <- plot_pseudotime_heatmap(HSMM[genelist], num_clusters=4, show_rownames=T, return_heatmap=T)

I hope it helps!

thanks you. I solved the same issue through this way

Apr 23 '24 06:04 NGUaqr

monocle-release monocle-release copied to clipboard

<simpleError in checkwz(wz, M = M, trace = trace, wzepsilon = control$wzepsilon): NAs found in the working weights variable 'wz'>

monocle-release
monocle-release copied to clipboard