SpatialExperiment
imgData slot restrictions
Hi, I was wondering why the imgData slot of a SpatialExperiment is restricted to the columns
‘sample_id’, ‘image_id’, ‘data’, ‘scaleFactor’.
I can imagine many scenarios where one might want to annotate additional image metadata, such as the number of frames, number of channels, staining used, etc. (I would also say that scaleFactor is not necessarily something one always wants to annotate.)
> imgData(spe)
DataFrame with 2 rows and 4 columns
    sample_id    image_id   data scaleFactor
  <character> <character> <list>   <numeric>
1       ileum        dapi   ####          NA
2       ileum    membrane   ####          NA
> imgData(spe)$nr.frames <- 7
Error in `imgData<-`(`*tmp*`, value = new("DFrame", rownames = NULL, nrows = 2L, :
'imgData' field in 'int_metadata' should have columns: ‘sample_id’, ‘image_id’, ‘data’, ‘scaleFactor’
Hi Ludwig @lgeistlinger ,
when we designed this class, we only had 10x Visium data in mind, which didn't need any additional columns.
Of course, by now there are plenty of other technologies and datasets to test and extend our class with...
Could you point to some datasets to play with, so we can better understand which columns would be useful to add?
Hi @drighelli - here are a couple of example datasets from technologies other than 10x Visium:
spatial transcript profiling
- seqFISH: https://bioconductor.org/packages/release/MouseGastrulationData
- MERFISH: https://bioconductor.org/packages/MerfishData
spatial protein profiling
- IMC data: https://bioconductor.org/packages/imcdatasets
- CyCIF data: https://github.com/ccb-hms/CyCIFData
But why restrict to specific columns at all, rather than allowing arbitrary image metadata columns (as, e.g., for a SummarizedExperiment's colData, rowData, and metadata)? Alternatively, one could decide on a set of core annotation columns that always need to be there, and allow arbitrary metadata columns on top of that (as, e.g., for a GRanges, which needs a chromosome, start and end position, and strand, but then allows arbitrary metadata columns in addition). I think that would make for a flexible and extensible design, as opposed to locking the slot to an exclusive set of image annotations, since we don't know what a user might want to annotate in the future.
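The second option above (required core columns plus arbitrary extras) could be sketched in base R roughly as follows. This is not SpatialExperiment's actual validity code: a plain data.frame stands in for the imgData DataFrame, the "####" strings stand in for the image objects, and `core_cols` and `check_img_data` are hypothetical names used only for illustration.

```r
# Hypothetical set of core columns that must always be present.
core_cols <- c("sample_id", "image_id", "data", "scaleFactor")

# Hypothetical validator: fails only if a core column is missing;
# any additional columns are accepted as-is.
check_img_data <- function(df) {
  missing_cols <- setdiff(core_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("'imgData' is missing required columns: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Plain data.frame standing in for imgData (placeholder values).
img <- data.frame(
  sample_id = c("ileum", "ileum"),
  image_id = c("dapi", "membrane"),
  data = c("####", "####"),
  scaleFactor = NA_real_,
  stringsAsFactors = FALSE
)

# An extra annotation column passes validation ...
img$nr.frames <- 7
check_img_data(img)

# ... whereas dropping a core column is rejected.
res <- try(check_img_data(img[, c("sample_id", "image_id")]), silent = TRUE)
inherits(res, "try-error")
```

With a check of this shape, the core columns stay guaranteed for downstream code, while users remain free to add whatever annotation their technology needs.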
I'm sorry @lgeistlinger, maybe I'm not getting what you mean ...
The imgData is designed to store images at the moment, so we designed it for storing images and metadata associated with them.
Of course, I get the idea to extend the DataFrame more flexibly, but what other kind of data, other than images, are you thinking of storing in the imgData?
If you're thinking about processed seqFISH and MERFISH data, you already have the BumpyMatrix, accessible through the molecules accessor, to store that kind of data, and of course you can use rowRanges instead of rowData to store GRanges-like information.
The only thing that comes to mind is another column, named imgMetadata (or something similar), where you could store a DataFrame/list with additional information to keep it flexible.
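As a rough illustration of that idea, here is a minimal base R sketch in which a single list column holds free-form annotation per image. A plain data.frame stands in for imgData, the `imgMetadata` column does not exist in SpatialExperiment, and all values are made up for illustration.

```r
# Plain data.frame standing in for imgData; values are illustrative only.
img <- data.frame(
  sample_id = c("ileum", "ileum"),
  image_id = c("dapi", "membrane"),
  stringsAsFactors = FALSE
)

# Hypothetical 'imgMetadata' list column: one free-form list per image.
img$imgMetadata <- list(
  list(nr.frames = 7, stain = "DAPI"),
  list(nr.frames = 7, stain = "membrane marker")
)

# Look up the annotation of a single image by its image_id.
dapi_meta <- img$imgMetadata[[which(img$image_id == "dapi")]]
dapi_meta$stain
```

The trade-off is that the set of top-level columns stays fixed, but the annotation is nested and therefore harder to filter or subset on than regular columns would be.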
"The imgData is designed to store images at the moment, so we designed it for storing images and metadata associated with them."
Right, but as my example above shows, the annotation of image metadata is restricted to sample_id, image_id, and scaleFactor. For my applications of interest, I cannot annotate the number of frames/channels of an image, the type of image (e.g. mask or raw image), the type of mask (nuclei or cell mask), the marker used for staining (e.g. DAPI, PolyA, or a cell membrane marker), the segmentation algorithm used, etc. This could be solved by allowing an arbitrary number of additional columns in the imgData DataFrame (if it is indeed represented as a DataFrame internally and that is not just a result of the show method).
okok, thanks for the clarification! :)