
Mismatched results between Seurat objects using ENSEMBL gene IDs and mouse gene symbols

Open · karouro opened this issue 2 years ago · 3 comments

Hi. I'm a beginner at scRNA-seq analysis. I have four scRNA-seq count matrices of mouse cells generated with the ICell8 system; each contains 55,367 unique gene features. Using ENSEMBL gene IDs and mouse gene symbols respectively as feature names, I created Seurat objects, integrated the data, and performed the integrated analysis according to the vignette.

The two Seurat objects (ENSEMBL gene IDs vs. mouse gene symbols) are identical except for the feature names, yet the results of the integrated analysis do not match. Could you please give me some help?

```r
seurat_Tcell.combined <- ScaleData(seurat_Tcell.combined, verbose = FALSE)
seurat_Tcell.combined <- RunPCA(seurat_Tcell.combined, npcs = 50, verbose = FALSE)
seurat_Tcell.combined <- RunUMAP(seurat_Tcell.combined, reduction = "pca", dims = 1:15)
seurat_Tcell.combined <- RunTSNE(seurat_Tcell.combined, reduction = "pca", dims = 1:15, seed.use = 1,
                                 perplexity = 30, max_iter = 1000, theta = 0.5, eta = 200, num_threads = 0)
seurat_Tcell.combined <- FindNeighbors(seurat_Tcell.combined, reduction = "pca", dims = 1:15)
seurat_Tcell.combined <- FindClusters(seurat_Tcell.combined, resolution = 1.0)
```

Key output (progress bars and verbose log trimmed):

```
RunUMAP: a = 0.9922, b = 1.112; 4545 rows, 15 columns; 192558 positive edges
FindClusters (Louvain):
  Number of nodes: 4545
  Number of edges: 175092
  Maximum modularity in 10 random starts: 0.7511
  Number of communities: 11
  Elapsed time: 0 seconds
```

```r
seurat_Tcell2.combined <- ScaleData(seurat_Tcell2.combined, verbose = FALSE)
seurat_Tcell2.combined <- RunPCA(seurat_Tcell2.combined, npcs = 50, verbose = FALSE)
seurat_Tcell2.combined <- RunUMAP(seurat_Tcell2.combined, reduction = "pca", dims = 1:15)
seurat_Tcell2.combined <- RunTSNE(seurat_Tcell2.combined, reduction = "pca", dims = 1:15, seed.use = 1,
                                  perplexity = 30, max_iter = 1000, theta = 0.5, eta = 200, num_threads = 0)
seurat_Tcell2.combined <- FindNeighbors(seurat_Tcell2.combined, reduction = "pca", dims = 1:15)
seurat_Tcell2.combined <- FindClusters(seurat_Tcell2.combined, resolution = 1.0)
```

Key output (progress bars and verbose log trimmed):

```
RunUMAP: a = 0.9922, b = 1.112; 4545 rows, 15 columns; 192538 positive edges
FindClusters (Louvain):
  Number of nodes: 4545
  Number of edges: 176550
  Maximum modularity in 10 random starts: 0.7521
  Number of communities: 10
  Elapsed time: 0 seconds
```

seurat_Tcell.combined is the Seurat object with ENSEMBL IDs, and seurat_Tcell2.combined is the one with gene symbols. I confirmed that there are no duplicated gene features.
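For example, a check along these lines returns FALSE for both objects (just a sketch of the kind of check, not necessarily the exact code I ran):

```r
# Check the full RNA assay of each object for duplicated feature names
any(duplicated(rownames(seurat_Tcell.combined[["RNA"]])))   # expected: FALSE
any(duplicated(rownames(seurat_Tcell2.combined[["RNA"]])))  # expected: FALSE
```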

Thanks!

karouro · Apr 04 '22 15:04

Are the input matrices exactly the same? Multiple gene IDs can generally be associated with a single gene symbol. Also, the only difference I can see above is in the number of clusters (which can arise for multiple reasons, including starting from a different number of input features). If you can provide a more complete code example (with plots), that would help.
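Something along these lines would let you verify both points (just a sketch; `obj_id` and `obj_symbol` are placeholders for your two objects, and depending on your Seurat version the `slot` argument may be called `layer`):

```r
library(Seurat)

# Raw counts from each object; only the row names should differ
m_id     <- GetAssayData(obj_id,     assay = "RNA", slot = "counts")
m_symbol <- GetAssayData(obj_symbol, assay = "RNA", slot = "counts")

# Align row names so only the values are compared
identical(dim(m_id), dim(m_symbol))      # expected: TRUE
rownames(m_symbol) <- rownames(m_id)
max(abs(m_id - m_symbol)) == 0           # expected: TRUE if every count matches exactly

# Any symbol that collapses several gene IDs would show up as a duplicate
any(duplicated(rownames(obj_symbol)))    # expected: FALSE
```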

saketkc · Apr 22 '22 16:04

Thank you very much for your reply. The input matrices are exactly the same; one has ENSEMBL gene IDs and the other has gene symbols. This is my code.

```r
seurat_1
## An object of class Seurat
## 55367 features across 4545 samples within 1 assay
## Active assay: RNA (55367 features, 0 variable features)
rownames(seurat_1)
## [1] "ENSMUSG00000102693" "ENSMUSG00000064842" "ENSMUSG00000051951" "ENSMUSG00000102851" "ENSMUSG00000103377" "ENSMUSG00000104017"

seurat_2
## An object of class Seurat
## 55367 features across 4545 samples within 1 assay
## Active assay: RNA (55367 features, 0 variable features)
rownames(seurat_2)
## [1] "4933401J01Rik" "Gm26206" "Xkr4" "Gm18956" "Gm37180" "Gm37363" "Gm37686" "Gm1992"
```

```r
mito.genes  <- rownames(seurat_1)[c(55255, 55257, 55259, 55263, 55269, 55272, 55274:55276, 55278, 55280, 55281, 55285, 55286, 55288)]
seurat_1[["percent.mt"]] <- PercentageFeatureSet(seurat_1, features = mito.genes, assay = "RNA")
mito.genes2 <- rownames(seurat_2)[c(55255, 55257, 55259, 55263, 55269, 55272, 55274:55276, 55278, 55280, 55281, 55285, 55286, 55288)]
seurat_2[["percent.mt"]] <- PercentageFeatureSet(seurat_2, features = mito.genes2, assay = "RNA")

seurat_1 <- subset(seurat_1, subset = nFeature_RNA > 200 & nFeature_RNA < 4000 & percent.mt < 5)
seurat_2 <- subset(seurat_2, subset = nFeature_RNA > 200 & nFeature_RNA < 4000 & percent.mt < 5)

seurat_1.list <- SplitObject(seurat_1, split.by = "Chip")
seurat_2.list <- SplitObject(seurat_2, split.by = "Chip")
```

Normalize and identify variable features for each dataset independently:

```r
seurat_1.list <- lapply(X = seurat_1.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 3000)
})
```

(Log-normalization and variance-calculation progress messages for each dataset omitted.)

```r
seurat_2.list <- lapply(X = seurat_2.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 3000)
})
```

(Log-normalization and variance-calculation progress messages for each dataset omitted.)

Select features that are repeatedly variable across datasets for integration, and run PCA on each dataset using these features:

```r
seurat_1.features <- SelectIntegrationFeatures(object.list = seurat_1.list)
seurat_1.list <- lapply(X = seurat_1.list, FUN = function(x) {
  x <- ScaleData(x, features = seurat_1.features, verbose = FALSE)
  x <- RunPCA(x, features = seurat_1.features, verbose = FALSE)
})
```

```r
seurat_1.anchors <- FindIntegrationAnchors(object.list = seurat_1.list,
                                           anchor.features = seurat_1.features, reduction = "rpca")
```

Anchors found per dataset pair (progress output trimmed): 185, 187, 140, 149, 137, 225, 154, 155, 172, 174, 170, 148, 167, 159, 281.

```r
seurat_1.combined <- IntegrateData(anchorset = seurat_1.anchors)
```

Merge order (progress output trimmed): dataset 3 into 4, 6 into 5, 2 into 1, 4 3 into 5 6, 1 2 into 5 6 4 3.

```r
seurat_2.features <- SelectIntegrationFeatures(object.list = seurat_2.list)
seurat_2.list <- lapply(X = seurat_2.list, FUN = function(x) {
  x <- ScaleData(x, features = seurat_2.features, verbose = FALSE)
  x <- RunPCA(x, features = seurat_2.features, verbose = FALSE)
})
```

```r
seurat_2.anchors <- FindIntegrationAnchors(object.list = seurat_2.list,
                                           anchor.features = seurat_2.features, reduction = "rpca")
```

Anchors found per dataset pair (progress output trimmed): 185, 187, 140, 149, 137, 225, 153, 155, 172, 174, 171, 148, 167, 159, 281.

```r
seurat_2.combined <- IntegrateData(anchorset = seurat_2.anchors)
```

Merge order (progress output trimmed): dataset 3 into 4, 6 into 5, 2 into 1, 4 3 into 5 6, 1 2 into 5 6 4 3.

The Seurat objects now include two assays, "RNA" and "integrated"; "integrated" is used for the subsequent analysis:

```r
DefaultAssay(seurat_1.combined) <- "integrated"
DefaultAssay(seurat_2.combined) <- "integrated"
```

Standard workflow:

```r
seurat_1.combined <- ScaleData(seurat_1.combined, verbose = FALSE)
seurat_1.combined <- RunPCA(seurat_1.combined, npcs = 50, verbose = FALSE)

seurat_2.combined <- ScaleData(seurat_2.combined, verbose = FALSE)
seurat_2.combined <- RunPCA(seurat_2.combined, npcs = 50, verbose = FALSE)
```

Run non-linear dimensional reduction and create clusters:

```r
seurat_1.combined <- RunUMAP(seurat_1.combined, reduction = "pca", dims = 1:15)
seurat_1.combined <- RunTSNE(seurat_1.combined, reduction = "pca", dims = 1:15, seed.use = 1,
                             perplexity = 30, max_iter = 1000, theta = 0.5, eta = 200, num_threads = 0)
seurat_1.combined <- FindNeighbors(seurat_1.combined, reduction = "pca", dims = 1:15)
seurat_1.combined <- FindClusters(seurat_1.combined, resolution = 1.0)
```

Key output (progress bars and verbose log trimmed):

```
RunUMAP: a = 0.9922, b = 1.112; 4545 rows, 15 columns; 193112 positive edges
FindClusters (Louvain):
  Number of nodes: 4545
  Number of edges: 180873
  Maximum modularity in 10 random starts: 0.8500
  Number of communities: 14
  Elapsed time: 0 seconds
```

```r
seurat_2.combined <- RunUMAP(seurat_2.combined, reduction = "pca", dims = 1:15)
seurat_2.combined <- RunTSNE(seurat_2.combined, reduction = "pca", dims = 1:15, seed.use = 1,
                             perplexity = 30, max_iter = 1000, theta = 0.5, eta = 200, num_threads = 0)
seurat_2.combined <- FindNeighbors(seurat_2.combined, reduction = "pca", dims = 1:15)
seurat_2.combined <- FindClusters(seurat_2.combined, resolution = 1.0)
```

Key output (progress bars and verbose log trimmed):

```
RunUMAP: a = 0.9922, b = 1.112; 4545 rows, 15 columns; 192840 positive edges
FindClusters (Louvain):
  Number of nodes: 4545
  Number of edges: 181213
  Maximum modularity in 10 random starts: 0.8510
  Number of communities: 14
  Elapsed time: 0 seconds
```

```r
a <- DimPlot(seurat_1.combined, reduction = "tsne", label = TRUE, pt.size = 1.0)
b <- DimPlot(seurat_2.combined, reduction = "tsne", label = TRUE, pt.size = 1.0)
a + b
```
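To compare the two clusterings more directly than by eye, I could also cross-tabulate the cluster assignments (just a sketch, not something I have run yet; cells are matched by barcode in case the orderings differ):

```r
# Cross-tabulate cluster labels from the ENSEMBL-ID object vs. the gene-symbol object
shared.cells <- intersect(colnames(seurat_1.combined), colnames(seurat_2.combined))
table(
  ensembl_id  = Idents(seurat_1.combined)[shared.cells],
  gene_symbol = Idents(seurat_2.combined)[shared.cells]
)
```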

karouro · Apr 29 '22 10:04

Rplot01.pdf

karouro · Apr 29 '22 10:04