seurat
seurat copied to clipboard
viral custom reference, gene disappears during integration
I ran two samples, one was infected and one was not infected. I used the same reference for both of them, however when I try to look up the viral genes in the integrated object, I get the error:
Error in if (unique.feature.exp == 0) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: Could not find MKPX31222-32181 in the default search locations, found in ‘RNA’ assay instead
2: Removing 2969 cells missing data for features requested
3: All cells have the same value (NA) of “rna_MKPX31222-32181”
if I run the two samples individually and make individual UMAPs, the infected is able to produce a featurePlot of the viral genes. The mock is not able. However if I look at the H5 file, I see the genes in the mock h5 file so I know they are there- how can I make sure that they stay for later downstream analysis? And why would it remove genes that are not shown in one sample but are shown in the other during integration- as these are likely the most important to differentiate between the two subsets.
Here is my code
path<-"path-mock-to-filtered_feature_bc_matrix"
mock.data <- Read10X(data.dir = path)
mock <- CreateSeuratObject(counts = mock.data, project = "mock", min.cells = 3, min.features = 200)
mock[["percent.mt"]] <- PercentageFeatureSet(mock, pattern = "^MT-")
mock <- subset(mock, subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20)
mock <- NormalizeData(mock, normalization.method = "LogNormalize", scale.factor = 10000)
all.genes <- rownames(mock)
mock <- ScaleData(mock, features = all.genes)
mock <- FindVariableFeatures(object = mock)
[email protected]$batch = "mock"
path<-"path-viral-to-filtered_feature_bc_matrix"
viral.data <- Read10X(data.dir = path)
viral <- CreateSeuratObject(counts = viral.data, project = "viral", min.cells = 3, min.features = 200)
viral[["percent.mt"]] <- PercentageFeatureSet(viral, pattern = "^MT-")
viral <- subset(viral, subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20)
viral <- NormalizeData(viral, normalization.method = "LogNormalize", scale.factor = 10000)
all.genes <- rownames(viral)
viral <- ScaleData(viral, features = all.genes)
viral <- FindVariableFeatures(object = viral)
[email protected]$batch = "viral"
combined <- merge(mock, y = c(viral), project = "august22")
combined.list <- SplitObject(object = combined, split.by = "batch")
reference.list <- combined.list[c('mock','viral')]
combined.anchors <- FindIntegrationAnchors(object.list = reference.list, dims = 1:30)
combined.integrated <- IntegrateData(anchorset = combined.anchors, dims = 1:30)
library(ggplot2)
library(cowplot)
DefaultAssay(combined.integrated) <- "integrated"
combined.integrated <- ScaleData(combined.integrated, verbose = FALSE)
combined.integrated <- RunPCA(combined.integrated, npcs = 30, verbose = FALSE)
combined.integrated <- RunUMAP(combined.integrated, reduction = "pca", dims = 1:30)
DimPlot(combined.integrated, reduction = "umap")
FeaturePlot(combined.integrated, features = c("MKPX31222-32181"), split.by = "batch")
#error here^
Thank you!
Hello,
Can you please confirm that your gene of interest is present in your mock dataset? I'm not sure what you mean by "The mock is not able". What is the output of the following?
sum(LayerData(mock, features = "MKPX31222-32181", layer="counts"))
If this feature isn't present in the mock dataset, you may trying reducing the min.cells
parameter of CreateSeuratObject
, in case this gene has less than 3 counts in your mock dataset.
mock <- CreateSeuratObject(counts = mock.data, project = "mock", min.cells = 0, min.features = 200)
Hi @jimena2
By setting the min.cells = 0 as suggested by mhkowalski you should be able to keep these low-ct genes in your mock object (since the viral genes should have 0 count in your mock sample).
We are closing this post for now and please feel free to re-open it if the issue persists or there is any update.