feature: updates for cellranger 9.0
This PR adds support for Cell Ranger 9.0, which performs automatic cell type annotation. It adds a simple loading function for 10x single-cell data, similar to `Load10X_Spatial`, that also puts the cell annotations in the `meta.data`.
~~Changes to other .Rd files result from the roxygenize() function~~
Decided to let @dcollins15 roxygenize on his end to get formatting and documentation that match the rest of the repo.
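For readers following along, here is a minimal base-R sketch of the metadata step described above: aligning a per-barcode annotation table to the per-cell metadata. It is not the code in this PR. The barcodes, the `coarse_cell_type` column name, and the `nCount_RNA` values are all made up for illustration, and the data frame `meta` only stands in for a Seurat object's `meta.data`.

```r
# Stand-in for a Seurat object's meta.data: one row per cell, keyed by barcode.
# All values here are invented for illustration.
meta <- data.frame(
  nCount_RNA = c(1200, 850, 4300),
  row.names = c("AAAC-1", "AAAG-1", "AACT-1")
)

# Hypothetical annotation table. The column names are assumptions and may not
# match what Cell Ranger 9.0 actually writes.
annotations <- data.frame(
  barcode = c("AACT-1", "AAAC-1"),
  coarse_cell_type = c("T cell", "B cell"),
  stringsAsFactors = FALSE
)

# Align annotations to the metadata row order by barcode.
# Cells without an annotation get NA rather than being dropped.
idx <- match(rownames(meta), annotations$barcode)
meta$coarse_cell_type <- annotations$coarse_cell_type[idx]

print(meta)
```

Matching on barcodes rather than relying on row order keeps the result correct when the annotation file lists cells in a different order or omits some of them. With a real Seurat object, `AddMetaData()` is the usual way to attach such a column.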
Thanks for this! I'm excited to try the automatic cell type calling in Cellranger 9.0. I'm not part of the developer team, but would you be open to some comments on this addition from a user's perspective?
100%. Comments welcome.
@dcollins15 sorry for the little formatting edits. Not really sure where those were introduced. Feel free to fix, roxygenise, and push directly to my branch, as it seems what I'm doing on my side is introducing little diffs.
@evolvedmicrobe I wanted to give you a heads up on this PR
This is looking great 🎉 Big thanks to @rharao for the code review 🙌
For these types of loader functions, I think the two main priorities should be:
- Backward compatibility
- Consistency
You've done a great job emulating the existing `Load10X_Spatial`, but I wonder if the two functions can or should share more logic 🤔
I’d also appreciate a bit more clarity on how `Load10X` is intended to relate to `Read10X`. Specifically:
- What versions of Cell Ranger output are compatible with each?
- As far as I can tell, Read10X hasn't been updated since just before the release of Cell Ranger v7.1.0.
I suspect that we could be doing a better job of making `Load10X_Spatial` and `LoadXenium` more ~mutually coherent~ cohesive, but I'm not familiar enough with the datatypes or output structure to propose specific suggestions.
On a related note, what's the easiest way to access 10x datasets in their "canonical" structure? I typically download datasets from the 10x Portal but end up having to manually configure the `data.dir` for `Seurat`. I assume there’s a better method that I’m just ignorant of.
This leads nicely into my other concern: testing. Creating small, representative datasets for testing was one of the more tedious parts of the Visium HD updates. Any thoughts on streamlining this process would be great.
@stephenwilliams22, now that I’ve had a thorough look, I think it would be best to hold off on merging this until after the next release (v5.2.0).
I agree with most of this. We can definitely take the time to share code between all the 10x functions. Please let me know if you want to do that for this release.
As for the specific code here:
- What versions of Cell Ranger output are compatible with each?
  - Both functions should be compatible with all versions of cellranger. The differences are:
    - `Read10X` will not put cell types in the metadata
- `Read10X`
  - As you mentioned, this is a very old function which doesn't take advantage of the "packaged" h5 matrix. Personally, I think that we might think of an end of life for `Read10X`. Not that 10x is going to get rid of the `filtered_feature_bc_matrix` directory any time soon, but new features will most certainly be focused on the h5 matrix.
With regard to testing datasets, we are in talks internally about this. I'm going to try to make each release of the rangers come with a tiny dataset with the correct structure for testing.
Datasets on our public data portal should be in the structure of any normal ranger run. If they are not, I'd like to hear about the issue so we can provide instructions or correct things on our side.
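In case it helps when checking a downloaded dataset, here is a rough base-R sketch of a layout check. `check_outs()` is a hypothetical helper, not part of Seurat or of this PR. It only looks for the two count-matrix inputs mentioned in this thread (the `filtered_feature_bc_matrix` directory and the h5 matrix), and the demo directory is built on the fly rather than taken from a real run.

```r
# Hypothetical helper: report which count-matrix inputs exist under a
# Cell Ranger-style output directory.
check_outs <- function(data.dir) {
  c(
    h5 = file.exists(file.path(data.dir, "filtered_feature_bc_matrix.h5")),
    mtx_dir = dir.exists(file.path(data.dir, "filtered_feature_bc_matrix"))
  )
}

# Build a throwaway directory containing only an (empty) h5 file,
# purely to exercise the helper.
demo <- file.path(tempdir(), "demo_outs")
dir.create(demo, showWarnings = FALSE)
invisible(file.create(file.path(demo, "filtered_feature_bc_matrix.h5")))

res <- check_outs(demo)
print(res)
```

A check like this could run before any loader is called, so that a missing or renamed file produces a clear message instead of a failure deeper in the reader.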