Getting dataset ID in the object's metadata

Open GinaCommin opened this issue 3 years ago • 1 comments

I am trying to use Harmony in Seurat v3, integrating 8 10x datasets (4 case, 4 control). I am having issues ensuring that the datasetID is in the object metadata. My script is below:

#read in 10x data

sample1.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample2.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample3.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample4.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample5.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample6.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample7.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")
sample8.data <- Read10X(data.dir = "path/filtered_feature_bc_matrix")

#Initialize a single Seurat object with the raw (non-normalized) data, keeping cells with a minimum of 200 genes

All.combined <- CreateSeuratObject(
  counts = cbind(sample1.data, sample2.data, sample3.data, sample4.data,
                 sample5.data, sample6.data, sample7.data, sample8.data),
  project = "project1", min.cells = 3, min.features = 200)

#Add dataset ID to the object's metadata

All.combined@meta.data$case <- c(
  rep("case", ncol(sample1.data)), rep("control", ncol(sample2.data)),
  rep("case", ncol(sample3.data)), rep("control", ncol(sample4.data)),
  rep("case", ncol(sample5.data)), rep("control", ncol(sample6.data)),
  rep("case", ncol(sample7.data)), rep("control", ncol(sample8.data)))

When I run that code I get the following error:

Error in $<-.data.frame(*tmp*, case, value = c("case", "case", : replacement has 8690 rows, data has 8685

Any help with how to resolve this issue would be much appreciated!

GinaCommin, Oct 20 '21 14:10

When you create your Seurat object, you specify min.features = 200, so any cell in your count matrices with fewer than 200 detected features is filtered out. The error arises because you build the metadata vector from the unfiltered count matrices (8690 cells in total), while the Seurat object contains only the 8685 cells that passed the filter. The lengths differ, so the assignment fails. If you remove min.features = 200, your line of code will work fine.
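If the min.features filter should stay, one possible alternative is to name the label vector by cell barcode before filtering and then look up only the barcodes that survive. The following is a minimal base-R sketch of that idea, not a tested Seurat workflow: the two toy matrices, the barcode prefixes (s1_, s2_), the threshold of 2 features (standing in for 200), and the `condition` and `meta` variables are all hypothetical. It also assumes barcodes are unique across samples, which may require prefixing them with a sample ID.

```r
# Toy stand-ins for two 10x count matrices (genes x cells); values are invented.
genes <- paste0("gene", 1:4)
sample1.data <- matrix(c(1, 2, 0, 3,   0, 0, 0, 1,   5, 1, 1, 0), nrow = 4,
                       dimnames = list(genes, c("s1_AAA", "s1_CCC", "s1_GGG")))
sample2.data <- matrix(c(0, 1, 0, 0,   2, 2, 2, 2), nrow = 4,
                       dimnames = list(genes, c("s2_AAA", "s2_TTT")))

counts <- cbind(sample1.data, sample2.data)

# Build one label per unfiltered cell and name each label by its barcode.
condition <- c(rep("case", ncol(sample1.data)),
               rep("control", ncol(sample2.data)))
names(condition) <- colnames(counts)

# Mimic a min.features-style filter: keep cells with at least 2 detected genes.
keep <- colSums(counts > 0) >= 2
filtered <- counts[, keep, drop = FALSE]

# Look labels up by the surviving barcodes, so the lengths always match.
meta <- data.frame(case = unname(condition[colnames(filtered)]),
                   row.names = colnames(filtered),
                   stringsAsFactors = FALSE)

# With a Seurat object, the analogous lookup would presumably be:
#   All.combined$case <- condition[colnames(All.combined)]
```

Because the lookup is by name rather than by position, it does not matter how many cells the filter removes or which samples they came from.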

cswoboda, Nov 08 '21 19:11