Create a QFeatures object from any set of feature/peptide/protein tables
Hi,
I'm sharing a simple example that shows how to use the data structure and functionality of QFeatures starting from any pre-defined set of feature/peptide/protein tables. This is useful when you want to import into R the summarised and aggregated output of any software, e.g. MaxQuant, PD, etc. I wanted to test this as an alternative to starting from a PSM table and then generating the subsequent aggregations, as explained in the QFeatures documentation.
Below I create the sample peptide and protein tables (I'll leave out the PSMs for simplicity). For both peptides and proteins I generate a matrix of "intensities" and the corresponding row data with the respective peptide/protein IDs. The peptide matrix has 10 peptides mapped to 3 proteins.
library(QFeatures)
wide_peptides <- matrix(c(rnorm(10), rnorm(10), rnorm(10)), nrow=10)
rownames(wide_peptides) <- paste0("Peptide",1:10)
colnames(wide_peptides) <- c("sample1", "sample2", "sample3")
rowdata_peptides <- DataFrame(PeptideID = rownames(wide_peptides),
                              Protein.id = c(rep("Protein1", 2), rep("Protein2", 4), rep("Protein3", 4)))
wide_proteins <- matrix(c(rnorm(3), rnorm(3), rnorm(3)), nrow = 3)
rownames(wide_proteins) <- paste0("Protein",1:3)
colnames(wide_proteins) <- c("sample1", "sample2", "sample3")
rowdata_proteins <- DataFrame(ProteinID = rownames(wide_proteins))
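In practice the tables usually arrive as a data frame exported by the software rather than as simulated matrices. Here is a minimal sketch of splitting such a table into the intensity matrix and the row data used above. The data frame `prot_df`, its values and the column names in `quant_cols` are made up for illustration; a real export will have different column names.

```r
library(QFeatures)

## Hypothetical protein-level export: one ID column plus one
## intensity column per sample (values are invented)
prot_df <- data.frame(ProteinID = paste0("Protein", 1:3),
                      sample1 = c(10.1, 11.3, 9.8),
                      sample2 = c(10.4, 11.0, 9.9),
                      sample3 = c(10.2, 11.5, 9.5))

## Columns that hold the quantitative values (assumed names)
quant_cols <- c("sample1", "sample2", "sample3")
annot_cols <- setdiff(colnames(prot_df), quant_cols)

## Numeric matrix of intensities, with the feature IDs as row names
m <- as.matrix(prot_df[, quant_cols])
rownames(m) <- prot_df$ProteinID

## Every remaining column becomes row annotation
rd <- DataFrame(prot_df[, annot_cols, drop = FALSE])
rownames(rd) <- prot_df$ProteinID

se <- SummarizedExperiment(m, rowData = rd)
se
```

The same split works for a peptide table, as long as it keeps a column with the protein ID of each peptide.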
We can now create separate SummarizedExperiment objects for the peptides and proteins. This is the minimal information needed to create a QFeatures object that contains 2 unlinked assays. Unlinked means that I cannot, for example, directly subset both assays by requesting all features that come from a particular protein ID.
se1 <- SummarizedExperiment(wide_peptides, rowData = rowdata_peptides)
se2 <- SummarizedExperiment(wide_proteins, rowData = rowdata_proteins)
## Sample annotation (colData)
cd <- DataFrame(row.names = colnames(wide_proteins))
el <- list(peptides = se1, proteins = se2)
hl <- QFeatures(el, colData = cd)
hl
An instance of class QFeatures containing 2 assays:
[1] peptides: SummarizedExperiment with 10 rows and 3 columns
[2] proteins: SummarizedExperiment with 3 rows and 3 columns
However, we can easily create a link between the assays using the protein IDs with QFeatures::addAssayLink, which is applied automatically under the hood when creating aggregations with QFeatures::aggregateFeatures.
hl_linked <- addAssayLink(hl,
                          from = "peptides",
                          to = "proteins",
                          varFrom = "Protein.id",
                          varTo = "ProteinID")
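Before linking, it can help to check that every protein ID used in the peptide row data also exists in the protein table, and afterwards to look at the link that was stored. This is a sketch on a smaller stand-in dataset (4 peptides, 2 proteins, random values); the object names `pep`, `prot`, `qf` and `unmatched` are only for illustration.

```r
library(QFeatures)

## Small stand-in tables with the same layout as in the post
pep <- matrix(rnorm(12), nrow = 4,
              dimnames = list(paste0("Peptide", 1:4), paste0("sample", 1:3)))
prot <- matrix(rnorm(6), nrow = 2,
               dimnames = list(paste0("Protein", 1:2), paste0("sample", 1:3)))

se_pep <- SummarizedExperiment(pep,
    rowData = DataFrame(Protein.id = rep(c("Protein1", "Protein2"), each = 2)))
se_prot <- SummarizedExperiment(prot,
    rowData = DataFrame(ProteinID = rownames(prot)))

qf <- QFeatures(list(peptides = se_pep, proteins = se_prot),
                colData = DataFrame(row.names = colnames(prot)))

## Protein IDs in the peptide table with no counterpart in the protein table
unmatched <- setdiff(rowData(qf[["peptides"]])$Protein.id,
                     rowData(qf[["proteins"]])$ProteinID)
stopifnot(length(unmatched) == 0)

## Link the two assays through their protein ID columns
qf <- addAssayLink(qf, from = "peptides", to = "proteins",
                   varFrom = "Protein.id", varTo = "ProteinID")

## Show the link stored for the "proteins" assay
assayLink(qf, "proteins")
```

If `unmatched` is not empty, those peptides have no parent protein in the protein table, so it is worth resolving the IDs before relying on the link.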
Now the two assays are linked by the protein ID. I can, for example, subset both assays simply by querying for one protein ID, and I can use all the other functionality of QFeatures.
protein_example <- hl_linked["Protein1", , ]
protein_example
An instance of class QFeatures containing 2 assays:
[1] peptides: SummarizedExperiment with 2 rows and 3 columns
[2] proteins: SummarizedExperiment with 1 rows and 3 columns
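To confirm that a subset like this really holds only the peptides of the requested protein, the row names and row data of each assay can be inspected. The sketch below rebuilds a smaller linked object (4 peptides, 2 proteins, random values) so that it runs on its own; the names `qf` and `sub` are only for illustration.

```r
library(QFeatures)

## Small stand-in tables with the same layout as in the post
pep <- matrix(rnorm(12), nrow = 4,
              dimnames = list(paste0("Peptide", 1:4), paste0("sample", 1:3)))
prot <- matrix(rnorm(6), nrow = 2,
               dimnames = list(paste0("Protein", 1:2), paste0("sample", 1:3)))

se_pep <- SummarizedExperiment(pep,
    rowData = DataFrame(Protein.id = rep(c("Protein1", "Protein2"), each = 2)))
se_prot <- SummarizedExperiment(prot,
    rowData = DataFrame(ProteinID = rownames(prot)))

qf <- QFeatures(list(peptides = se_pep, proteins = se_prot),
                colData = DataFrame(row.names = colnames(prot)))
qf <- addAssayLink(qf, from = "peptides", to = "proteins",
                   varFrom = "Protein.id", varTo = "ProteinID")

## Subset both assays by one protein ID
sub <- qf["Protein1", , ]

## Peptides kept for that protein, and the protein they map to
rownames(sub[["peptides"]])
rowData(sub[["peptides"]])$Protein.id

## The matching row of the protein assay
rownames(sub[["proteins"]])
```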
Thanks a lot to @lgatto for providing support and the initial simple example that got me started, and to all the developers of the package!
Anna