Downsample from each cluster
Hi @evanbiederstedt ,
Is there an equivalent in conos to this Seurat function: https://github.com/satijalab/seurat/issues/3116#issuecomment-640440296 ?
Hi @hiraksarkar
You'll have to write a method. There are more details here: https://github.com/satijalab/seurat/issues/3033
Note that subset() is an S3 method, which you could modify for the package. For an example of this, see https://github.com/kharchenkolab/conos/blob/main/R/access_wrappers.R
But yes, this will require a PR from you.
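For readers unfamiliar with S3 dispatch, here is a minimal sketch of what defining a subset() method for a class looks like. The `cellset` class and its constructor are hypothetical stand-ins, not conos classes, and this is not how conos itself implements anything:

```r
# Hypothetical toy class holding a cells-by-genes matrix and a named
# vector of cluster labels. 'cellset' is NOT a conos class.
new_cellset <- function(counts, clusters) {
  stopifnot(identical(rownames(counts), names(clusters)))
  structure(list(counts = counts, clusters = clusters), class = "cellset")
}

# S3 method: a call to subset() on a 'cellset' object dispatches here.
subset.cellset <- function(x, cells, ...) {
  new_cellset(x$counts[cells, , drop = FALSE], x$clusters[cells])
}

set.seed(1)
cell.ids <- paste0("cell", 1:5)
counts <- matrix(rpois(20, 5), nrow = 5,
                 dimnames = list(cell.ids, paste0("gene", 1:4)))
clusters <- setNames(factor(c("a", "a", "b", "b", "b")), cell.ids)
cs <- new_cellset(counts, clusters)

# Keep two cells by name
small <- subset(cs, cells = c("cell1", "cell4"))
```

A real method for a Conos object would also need to keep every per-sample slot consistent, which is why it requires a proper PR.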
Thanks, Evan
Hi Evan,
Thanks for your input
Hirak
@hiraksarkar
I started writing up a function to do this, beginning with the normalized matrices (not cells in the cluster):
#' Applies downsampling uniformly to all samples in a valid Conos object.
#' Specify the number of cells you'd like to remain via downsampling for the samples within the Conos object.
#'
#' @param con conos object
#' @param number.of.cells numeric Number of cells to keep in each sample after downsampling. (Note: this is not the number of cells you'd like to remove, but the number of cells you'd like to have remaining.)
#' @return conos object with number of cells downsampled
#' @export
downsampleInputCells <- function(con, number.of.cells=NULL) {
  if (!inherits(con, 'Conos')) {
    stop("Input 'con' is not a valid Conos object.")
  }
  if (length(con$samples)==0) {
    stop("There are no samples in this Conos object to downsample.")
  }
  if (is.null(number.of.cells)) {
    message("Number of cells not specified, returning Conos object without downsampling.")
    return(con)
  }
  if (!is.numeric(number.of.cells) || length(number.of.cells) != 1 || number.of.cells != as.integer(number.of.cells)) {
    stop("Parameter 'number.of.cells' must be a single integer.")
  }
  ## Check that a sufficient number of cells exist in each sample before removing
  ## Iterate through list of samples. Check if Pagoda2 or Seurat.
  for (i in seq_along(con$samples)) {
    sample.obj <- con$samples[[i]]
    ## Label for error messages: the sample name if there is one, otherwise its index
    sample.name <- if (is.null(names(con$samples))) i else names(con$samples)[i]
    if (inherits(sample.obj, 'Pagoda2')) {
      ## Pagoda2 stores cells as rows of 'counts'
      cells.in.sample <- nrow(sample.obj$counts)
      ## If 'number.of.cells' is greater than the number of cells in the sample, throw an error
      if (number.of.cells > cells.in.sample) {
        stop("Sample ", sample.name, " has ", cells.in.sample, " cells, which is fewer than the 'number.of.cells' specified. Please correct this.")
      }
      subsample <- sample.int(cells.in.sample, number.of.cells)
      con$samples[[i]]$counts <- sample.obj$counts[subsample, ]
    } else if (inherits(sample.obj, 'Seurat')) {
      message("Note: this function creates a new Seurat object with downsampled cells")
      message("First checking that object is most recent version of Seurat")
      sample.obj <- Seurat::UpdateSeuratObject(sample.obj)
      assay.data <- Seurat::GetAssayData(sample.obj)
      ## Seurat stores cells as columns of the assay data
      cells.in.sample <- ncol(assay.data)
      ## If 'number.of.cells' is greater than the number of cells in the sample, throw an error
      if (number.of.cells > cells.in.sample) {
        stop("Sample ", sample.name, " has ", cells.in.sample, " cells, which is fewer than the 'number.of.cells' specified. Please correct this.")
      }
      subsample <- sample.int(cells.in.sample, number.of.cells)
      assay.data <- assay.data[, subsample]
      ##new.seurat.object <- SetAssayData(object = sample.obj, slot = "counts", new.data = assay.data)
      ## investigate how to update Seurat object...
      con$samples[[i]] <- Seurat::CreateSeuratObject(assay.data)
    }
  }
  ## Return the modified object, as documented in @return
  return(con)
}
But I've realized this is really not what you want. I think downsampling is always a bad idea. If you're removing cells for reasons other than QC, then I think it's a mistake.
On reflection, I think what you're trying to do is remove cells in the clusters for the heatmaps, correct? In that case, I think it's best to write a function modifying the heatmap for your purposes. Play around with this:
https://github.com/jokergoo/ComplexHeatmap
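If the goal is only to thin out each cluster before plotting, a rough sketch in base R could look like the following. The helper name `downsamplePerCluster` and its arguments are hypothetical and not part of conos or ComplexHeatmap. It assumes you already have a named vector of cluster labels whose names are cell IDs:

```r
# Hypothetical helper (not part of conos): keep at most 'max.cells' cells
# from each cluster. 'clusters' is a named factor or character vector
# whose names are cell IDs.
downsamplePerCluster <- function(clusters, max.cells) {
  stopifnot(!is.null(names(clusters)), max.cells >= 1)
  cells.by.cluster <- split(names(clusters), clusters)
  kept <- lapply(cells.by.cluster, function(cells) {
    # Small clusters are kept whole
    if (length(cells) <= max.cells) {
      return(cells)
    }
    # Sample positions, then index, so a length-1 vector is never
    # misread by sample() as a range
    cells[sample.int(length(cells), max.cells)]
  })
  unlist(kept, use.names = FALSE)
}

set.seed(42)
clusters <- setNames(rep(c("T", "B", "NK"), times = c(50, 8, 3)),
                     paste0("cell", 1:61))
keep <- downsamplePerCluster(clusters, max.cells = 10)
table(clusters[keep])
```

The returned cell IDs can then be used to subset the columns (or rows) of whatever matrix is passed to the heatmap, leaving the underlying samples untouched.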
(Also, the above function really shouldn't use a for-loop, which is discouraged in R. Try sccore::plapply():
https://www.rdocumentation.org/packages/sccore/versions/0.1.1/topics/plapply )
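As a sketch of that refactor, the per-sample loop body can be moved into a function and mapped over the samples. The example below uses base lapply() on toy matrices (cells in rows) rather than real conos samples. Swapping in sccore::plapply() assumes it takes the list and function as its first two arguments the way lapply() does, so check its documentation first:

```r
# Toy stand-ins for per-sample count matrices (cells in rows);
# these are not real conos samples.
set.seed(7)
samples <- list(
  s1 = matrix(rpois(60, 3), nrow = 15),
  s2 = matrix(rpois(80, 3), nrow = 20)
)
number.of.cells <- 10

# One function call per sample replaces the body of the for-loop.
downsampled <- lapply(samples, function(m) {
  stopifnot(number.of.cells <= nrow(m))
  m[sample.int(nrow(m), number.of.cells), , drop = FALSE]
})

# Number of cells remaining in each sample
sapply(downsampled, nrow)
```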