
Classify cells using known marker genes

Open sagarutturkar opened this issue 2 years ago • 1 comments

I have subset my Seurat object to select the cells from specific clusters (3 and 14). Then I exported the matrix and metadata as required by clustifyr.

See the code and its output below:

ref_mat = readRDS(file = "ref_mat_3.14.rds")
ref_mat = as.matrix(ref_mat)

metadata = readRDS(file = "metadata_3.14.rds")


dim(ref_mat)
# [1] 20326  4143
dim(metadata)
# [1] 4143    9

ref_mat[1:5,1:5]
#       AAACGAAAGCTGGTGA-1_1 AAACGCTAGAGCAGTC-1_1 AAACGCTCAAATAGCA-1_1 AAACGCTCACCCGTAG-1_1 AAAGGTAGTTGCATTG-1_1
#Sox17                     0                    0                    0                    0                    0
#Mrpl15                    0                    0                    0                    0                    0
#Lypla1                    0                    0                    0                    0                    0
#Tcea1                     0                    0                    1                    1                    1
#Rgs20                     0                    0                    0                    0                    0

head(dplyr::select(metadata, all_of(c("seurat_clusters"))))

#                     seurat_clusters
#AAACGAAAGCTGGTGA-1_1               3
#AAACGCTAGAGCAGTC-1_1               3
#AAACGCTCAAATAGCA-1_1               3
#AAACGCTCACCCGTAG-1_1               3
#AAAGGTAGTTGCATTG-1_1               3
#AAAGGTAGTTTACGAC-1_1               3


t = dplyr::select(metadata, all_of(c("seurat_clusters")))

table(t$seurat_clusters)
#   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18 
#   0    0    0 3140    0    0    0    0    0    0    0    0    0    0 1003    0    0    0    0 

Next, I defined my table of marker genes, but the call below fails with an error:

list_res <- clustify_lists(
  input = ref_mat,             # matrix of normalized single-cell RNA-seq counts
  metadata = metadata,            # meta.data table containing cell clusters
  cluster_col = "seurat_clusters", # name of column in meta.data containing cell clusters
  marker = m_list,                 # list of known marker genes
  metric = "pct"                   # test to use for assigning cell types
)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 17, 14, 10, 11, 7

Can you please help with this?

sagarutturkar avatar Jan 14 '22 22:01 sagarutturkar

Hi, two things I can think of:

  1. What does m_list look like?
  2. Pass metadata = as.character(metadata$seurat_clusters) directly and leave out cluster_col =. We have not tested what happens when factor levels have zero counts.
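
Both points can be checked on toy data first. The sketch below is base R only and does not use the objects from this issue: `clusters`, `m_list_toy`, and the gene names are hypothetical placeholders. It shows that a subset factor keeps its empty levels, and that `data.frame()` rejects a list whose elements have different lengths, which is one plausible (unconfirmed) source of the "differing number of rows" message if m_list holds five marker sets of unequal size.

```r
# Toy stand-in for metadata$seurat_clusters after subsetting (hypothetical data):
# the factor still carries levels 0-18, but only clusters 3 and 14 have cells.
clusters <- factor(c(rep("3", 4), rep("14", 2)), levels = as.character(0:18))
table(clusters)  # empty levels are still listed, each with a count of 0

# Option A: a plain character vector has no levels, so nothing is empty.
clusters_chr <- as.character(clusters)

# Option B: keep the factor but drop the unused levels.
clusters_fct <- droplevels(clusters)

# Hypothetical marker list whose elements have different lengths.
m_list_toy <- list(typeA = c("g1", "g2", "g3"), typeB = c("g4", "g5"))

# data.frame() cannot build columns of unequal length; capture the error
# message instead of stopping the script.
err <- tryCatch(data.frame(m_list_toy), error = function(e) conditionMessage(e))

# One way to get a rectangular table: pad each vector with NA to a common length.
n_max <- max(lengths(m_list_toy))
m_df_toy <- as.data.frame(
  lapply(m_list_toy, function(g) c(g, rep(NA, n_max - length(g))))
)
```

With the real objects, the second suggestion would read clustify_lists(input = ref_mat, metadata = as.character(metadata$seurat_clusters), marker = m_list, metric = "pct"). Whether clustify_lists accepts an NA-padded marker table like `m_df_toy` is an assumption here and should be checked against the package documentation.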

raysinensis avatar Feb 03 '22 16:02 raysinensis