Downsample from each cluster

Open hiraksarkar opened this issue 2 years ago • 3 comments

Hi @evanbiederstedt ,

Is there an equivalent within conos to this Seurat functionality? https://github.com/satijalab/seurat/issues/3116#issuecomment-640440296

hiraksarkar avatar Dec 23 '21 17:12 hiraksarkar

Hi @hiraksarkar

You'll have to write a method. There are more details here: https://github.com/satijalab/seurat/issues/3033

Note that subset() is an S3 generic, so you could add a method for it to the package. For an example of this, see https://github.com/kharchenkolab/conos/blob/main/R/access_wrappers.R

But yes, this will require a PR from you.
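To illustrate what writing such a method could look like, here is a minimal base R sketch of S3 dispatch on subset(). The class name "SampleSet", the constructor newSampleSet() and the 'cells' argument are all hypothetical stand-ins for illustration; this is not the conos API, and a real method would need to handle the actual sample objects.

```r
# Toy container standing in for a multi-sample object (hypothetical class "SampleSet")
newSampleSet <- function(samples) {
  structure(list(samples = samples), class = "SampleSet")
}

# S3 method: subset() dispatches here for objects of class "SampleSet".
# 'cells' is a hypothetical argument: a character vector of cell names to keep.
subset.SampleSet <- function(x, cells, ...) {
  x$samples <- lapply(x$samples, function(m) {
    m[rownames(m) %in% cells, , drop = FALSE]
  })
  x
}

# Two tiny cells-by-genes matrices
m1 <- matrix(1:6, nrow = 3, dimnames = list(c("a1", "a2", "a3"), c("g1", "g2")))
m2 <- matrix(1:4, nrow = 2, dimnames = list(c("b1", "b2"), c("g1", "g2")))
ss <- newSampleSet(list(s1 = m1, s2 = m2))

kept <- subset(ss, cells = c("a1", "a3", "b2"))
sapply(kept$samples, nrow)
```

The same dispatch mechanism applies to any object whose class vector contains the class name used in the method, which is why a package can add its own subset() behaviour without touching the generic.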

Thanks, Evan

evanbiederstedt avatar Dec 23 '21 18:12 evanbiederstedt

Hi Evan,

Thanks for your input

Hirak

hiraksarkar avatar Dec 24 '21 04:12 hiraksarkar

@hiraksarkar

I started writing up a function to do this, beginning with the normalized matrices (not cells in the cluster):

#' Applies downsampling uniformly to all samples in a valid Conos object. 
#' Specify the number of cells you'd like to remain via downsampling for the samples within the Conos object. 
#'
#' @param con conos object
#' @param number.of.cells numeric Number of cells to keep in each sample after downsampling. (Note: this is the number of cells that remain, not the number of cells removed.)
#' @return conos object with number of cells downsampled
#' @export
downsampleInputCells <- function(con, number.of.cells=NULL) {
  '%ni%' <- Negate('%in%')
  if ('Conos' %ni% class(con)) {
    stop("Input 'con' not a valid Conos object. ")
  }
  if (length(con$samples)==0) {
    stop("There are no samples in this Conos object to apply downsampling. ")
  }
  if (is.null(number.of.cells)) {
    message("Number of cells not specified, returning Conos object without downsampling. ")
    return(con)
  }
  if (!is.numeric(number.of.cells) || length(number.of.cells) != 1 || number.of.cells != as.integer(number.of.cells)) {
    stop("Parameter 'number.of.cells' must be a single integer value. ")
  }
  ## Check that a sufficient number of cells exist in each sample before removing
  ## Iterate through list of samples. Check if Pagoda2 or Seurat. 
  ## If Pagoda2, then access counts
  for (i in seq_along(con$samples)) {
    if ('Pagoda2' %in% class(con$samples[[i]])) {
      ## Use a name other than 'sample' so the base function sample() is not shadowed
      p2 = con$samples[[i]]
      ## number of cells in sample (cells are rows of the Pagoda2 counts matrix)
      cells_in_sample = nrow(p2$counts)
      ## Check that 'number.of.cells' does not exceed the number of cells in the sample
      ## If 'number.of.cells' is greater, then throw an error
      if (number.of.cells > cells_in_sample) {
        stop(paste0("The sample ", names(con$samples)[i], " has ", cells_in_sample, " cells. The parameter 'number.of.cells' specified is larger than the number of cells within the sample. Please correct this."))
      }
      subsample = sample.int(cells_in_sample, number.of.cells, replace=FALSE)
      ## Note: only the counts matrix is subset here; other fields of the Pagoda2 object are left untouched
      con$samples[[i]]$counts = p2$counts[subsample, , drop=FALSE]
    } else if ('Seurat' %in% class(con$samples[[i]])) {
      message("Note: this function creates a new Seurat object with downsampled cells ")
      ## Use a name other than 'sample' so the base function sample() is not shadowed
      so = con$samples[[i]]
      message("First checking that object is most recent version of Seurat")
      so = UpdateSeuratObject(so)
      assay_data = GetAssayData(so)
      ## number of cells in sample (cells are columns of the Seurat assay matrix)
      cells_in_sample = ncol(assay_data)
      ## Check that 'number.of.cells' does not exceed the number of cells in the sample
      ## If 'number.of.cells' is greater, then throw an error
      if (number.of.cells > cells_in_sample) {
        stop(paste0("The sample ", names(con$samples)[i], " has ", cells_in_sample, " cells. The parameter 'number.of.cells' specified is larger than the number of cells within the sample. Please correct this."))
      }
      subsample = sample.int(cells_in_sample, number.of.cells, replace=FALSE)
      assay_data = assay_data[, subsample, drop=FALSE]
      ##new.seurat.object <- SetAssayData(object = so , slot = "counts", new.data = assay_data)
      ## investigate how to update Seurat object...
      con$samples[[i]] = CreateSeuratObject(assay_data)
    }
  }
  ## Return the Conos object with the downsampled samples
  return(con)
}

But I've realized this is really not what you want. I think downsampling is always a bad idea. If you're removing cells for reasons other than QC, then I think it's a mistake.

On reflection, I think what you're trying to do is remove cells in the clusters for the heatmaps, correct? In that case, I think it's best to write a function modifying the heatmap for your purposes---play around with this:

https://github.com/jokergoo/ComplexHeatmap
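If the goal is only to thin out the cells shown per cluster in a plot, one option is to pick the cell names first and leave the object itself untouched. Below is a minimal base R sketch under that assumption. The helper downsampleCellsPerCluster() and its arguments are hypothetical, not part of conos; it expects a named vector or factor mapping cell IDs to cluster labels (in conos this might be something like con$clusters$leiden$groups, but treat that as an assumption about your object).

```r
# Hypothetical helper: keep at most 'max.cells' cell names per cluster.
# 'clusters' is a named vector/factor: names are cell IDs, values are cluster labels.
downsampleCellsPerCluster <- function(clusters, max.cells, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  cells.by.cluster <- split(names(clusters), clusters)
  kept <- lapply(cells.by.cluster, function(cells) {
    # Small clusters are kept whole
    if (length(cells) <= max.cells) return(cells)
    # Index via sample.int() to avoid sample()'s special handling of length-1 input
    cells[sample.int(length(cells), max.cells)]
  })
  unlist(kept, use.names = FALSE)
}

# Toy data: three clusters with 50, 5 and 20 cells
clusters <- setNames(rep(c("A", "B", "C"), times = c(50, 5, 20)), paste0("cell", 1:75))
keep <- downsampleCellsPerCluster(clusters, max.cells = 10, seed = 1)
table(clusters[keep])  # A: 10, B: 5, C: 10
```

The resulting vector of cell names can then be used to select columns or rows of whatever matrix is passed to the heatmap function, so no cells are removed from the analysis itself.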

(Also, the above function really shouldn't use a for-loop, which is generally discouraged in R. Try sccore::plapply() https://www.rdocumentation.org/packages/sccore/versions/0.1.1/topics/plapply )
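As a rough sketch of the apply-style alternative, the per-sample step can be written as a small function and mapped over the list of samples with base lapply(). The helper downsampleRows() and the toy matrices are hypothetical and stand in for real sample objects; the parallel helper linked above could presumably be substituted for lapply(), though that is not tested here.

```r
# Hypothetical helper: keep 'n' randomly chosen rows (cells) of one matrix
downsampleRows <- function(m, n) {
  if (n > nrow(m)) {
    stop("Requested ", n, " cells but the sample has only ", nrow(m))
  }
  m[sample.int(nrow(m), n), , drop = FALSE]
}

# Toy cells-by-genes matrices standing in for the samples
samples <- list(
  s1 = matrix(rnorm(40), nrow = 20, dimnames = list(paste0("a", 1:20), c("g1", "g2"))),
  s2 = matrix(rnorm(30), nrow = 15, dimnames = list(paste0("b", 1:15), c("g1", "g2")))
)

set.seed(42)
# lapply() applies the helper to every sample and preserves the list names
downsampled <- lapply(samples, downsampleRows, n = 10)
sapply(downsampled, nrow)
```

Keeping the per-sample logic in its own function also makes the size check and the subsetting easy to test in isolation.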

evanbiederstedt avatar Dec 25 '21 16:12 evanbiederstedt