Xenium support

Open pmarks opened this issue 2 years ago • 2 comments

Preliminary Xenium support for Seurat. Opening this as a draft to get the discussion going. We are very open to suggestions about how to approach this. I'm also curious what general work is required in the feat/imaging branch before it can merge to develop. We should have some bandwidth to help with that.

Open questions:

  • [ ] transcripts.csv.gz loading speed - this seems to be the bottleneck for most datasets. We currently use read.csv, but could we try readr::read_csv or data.table::fread?
  • [ ] transcripts.csv.gz - we probably want to make loading this optional, since it can get very large.
  • [ ] How to handle "Negative Control Probe" and "Negative Control Padlock" feature types. These are encoded as a separate feature_type in the matrix, but they are really technical controls of the in-situ RNA assay. They aren't useful for biological analysis, but we emit them to help users assess the specificity of the assay. Right now we're just dropping them in LoadXenium - should we provide an optional way to include them?
  • [ ] Morphology image - do we want to have the DAPI image available as a background, similar to how we have the H&E image for Visium?
  • [ ] refine version requirements: I don't think SeuratObject 4.1.1 is actually required. It's possible that sp 1.5 is required, but that version is specified in SeuratObject.

pmarks avatar Sep 19 '22 17:09 pmarks

Hi Pat, thanks for starting this! I'll try to respond to your questions as best I can.

  • loading speed: data.table::fread is the fastest of the three and I would suggest using that over read.csv; we currently don't depend on data.table, but leave it as a suggested package. I would recommend the same for Read/LoadXenium. One thing to note is that data.table::fread doesn't natively support gzipped files, so you'll also need R.utils to handle that (most people use a shell command to handle gzipped files in data.table::fread, but that is not portable to Windows).
  • transcripts.csv.gz being optional: all of our imaging Read/Load* functions support partial loads. For example, ReadVitessce can load the counts, coordinates, molecules, or any combination of the three. I would recommend a similar approach for Read/LoadXenium
  • negative probes: pinging @AustinHartman
  • morphology images: the FOV structure does not currently support backing images. This is on our list of things to support, but we do not have a timeline for this
  • version requirements: it's our practice that every new release of {Seurat} is pinned to the latest version of {SeuratObject}; when this goes to CRAN, the minimum version of {SeuratObject} will be updated to the latest CRAN version (currently 4.1.2)
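
To make the loading-speed point concrete, here is a minimal sketch of reading a gzipped CSV with data.table::fread when data.table and R.utils are both installed, and falling back to base R otherwise. The helper name `read_transcripts` and the column names in the usage example are hypothetical, chosen only for illustration; this is not the code in the PR.

```r
# Hypothetical helper, not part of Seurat: read a gzipped CSV as a data.frame.
read_transcripts <- function(path) {
  has_fast_reader <- requireNamespace("data.table", quietly = TRUE) &&
    requireNamespace("R.utils", quietly = TRUE)
  if (has_fast_reader) {
    # fread can decompress .gz input itself when R.utils is available
    return(data.table::fread(path, data.table = FALSE))
  }
  # Base R fallback: slower, but needs no suggested packages
  read.csv(gzfile(path), stringsAsFactors = FALSE)
}

# Usage on a tiny made-up file (column names are illustrative only)
tmp <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp, "w")
write.csv(
  data.frame(feature_name = c("GeneA", "GeneB"), x_location = c(1.5, 2.5)),
  con,
  row.names = FALSE
)
close(con)
transcripts <- read_transcripts(tmp)
```

Checking with `requireNamespace` keeps data.table and R.utils as suggested rather than required packages, and avoids the shell-based decompression that is not portable to Windows.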

mojaveazure avatar Sep 22 '22 16:09 mojaveazure

Regarding the negative probes, I think an optional way to include them would be beneficial for QC. Negative control probe counts and negative control padlock counts can be stored as additional assays in the Seurat object
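
As a rough sketch of that idea, the control rows can be separated from the gene rows by feature type before any assays are built. The toy matrix, the feature names, and the helper `split_by_feature_type` below are made up for illustration and assume the feature types are available as a vector aligned with the matrix rows.

```r
# Toy feature-by-cell count matrix; names and values are made up.
counts <- matrix(
  1:12,
  nrow = 4,
  dimnames = list(
    c("GeneA", "GeneB", "NegControlProbe_1", "NegControlPadlock_1"),
    c("cell1", "cell2", "cell3")
  )
)
# One feature type per row of `counts` (assumed to come from the feature table)
feature_type <- c(
  "Gene Expression", "Gene Expression",
  "Negative Control Probe", "Negative Control Padlock"
)

# Hypothetical helper: return one sub-matrix per feature type.
split_by_feature_type <- function(counts, feature_type) {
  stopifnot(length(feature_type) == nrow(counts))
  row_groups <- split(seq_len(nrow(counts)), feature_type)
  # drop = FALSE keeps single-row groups as matrices
  lapply(row_groups, function(i) counts[i, , drop = FALSE])
}

by_type <- split_by_feature_type(counts, feature_type)
```

Each element of `by_type` could then be turned into its own assay (for example with `CreateAssayObject(counts = ...)`), with the control assays added only when the user opts in.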

AustinHartman avatar Sep 22 '22 20:09 AustinHartman

@mojaveazure @AustinHartman OK, I think this is ready for a closer review now. @jsicherman from 10x may also help get this in shape.

pmarks avatar Nov 09 '22 20:11 pmarks