xcms
Analyze MRM data
I'm working with MRM data acquired on a Waters instrument and exported to .mzML format. I tried to import them with xcms using the readMSData() function, with no luck:
fls <- list.files(path = "quattroMicro/", pattern = ".mzML$",
full.names = TRUE)
data <- readMSData(fls, mode = "onDisk")
> chr <- chromatogram(data)
Warning message:
In .extractMultipleChromatograms(object, rt = rt, mz = mz, aggregationFun = aggregationFun, :
No MS 1 data present.
> chr
Chromatograms with 0 rows and 18 columns
phenoData with 0 variables
featureData with 0 variables
> chr <- chromatogram(data, mz = 823)
Warning message:
In .extractMultipleChromatograms(object, rt = rt, mz = mz, aggregationFun = aggregationFun, :
No MS 1 data present.
>
It seems that readMSData() is not able to import MRM data correctly. I read in the MSnbase manual about the readSRMData() function for importing MRM/SRM data. It seems to work and imports my data correctly:
## import data
mrm <- readSRMData(fls)
> mrm
Chromatograms with 8 rows and 18 columns
1 2 3 4 5
<Chromatogram> <Chromatogram> <Chromatogram> <Chromatogram> <Chromatogram>
[1,] length: 495 length: 495 length: 495 length: 495 length: 495
[2,] length: 495 length: 495 length: 495 length: 495 length: 495
... ... ... ... ... ...
[7,] length: 284 length: 284 length: 284 length: 284 length: 284
[8,] length: 284 length: 284 length: 284 length: 284 length: 284
phenoData with 1 variables
featureData with 10 variables
I would like to process the data (integrate, align) using xcms, but it seems the object returned by readSRMData() is not compatible with xcms functions:
chrs <- chromatograms(mrm)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘chromatograms’ for signature
‘"Chromatograms"’
> class(mrm)
[1] "Chromatograms"
attr(,"package")
[1] "MSnbase"
> class(data)
[1] "OnDiskMSnExp"
attr(,"package")
[1] "MSnbase"
Is there a way to use the output of readSRMData() with xcms?
You are on the right track @rromoli. readSRMData actually returns a Chromatograms object (the same type of object you would get by calling chromatogram on an MSnExp/OnDiskMSnExp object containing spectra data).
You should be able to call findChromPeaks directly on the mrm object you have. This returns an XChromatograms object (defined in xcms) that also contains the identified chromatographic peaks (which you can access with chromPeaks). Note that I've also implemented a groupChromPeaks method for XChromatograms, but no adjustRtime method.
Regarding alignment, I've only implemented a method that aligns a single Chromatogram object against another one, but nothing yet for Chromatograms (note the s).
findChromPeaks() works fine using MatchedFilterParam(), but I noticed that the fwhm parameter behaves strangely: I need to divide the value by 10. So the value I used is fwhm/10 (0.4); otherwise it integrates too much baseline...
Furthermore I do not understand how to extract data. I mean:
> featureValues(peaks, value = "into")
1 2 3 4 5 6 7
FT01 NA NA NA 271.50840 225.47668 201.51727 862.14525
FT02 NA NA NA NA NA NA 215.32279
FT03 140.49033 107.46131 87.04114 143.76471 118.99295 113.02374 186.81566
FT04 53.75822 48.75521 42.22618 58.85781 55.24559 46.26336 109.39015
FT05 NA NA NA 30.84873 NA NA 58.27840
FT06 NA NA NA NA NA NA NA
FT07 NA NA NA 46.28275 68.67077 63.43596 143.92853
FT08 NA NA NA NA NA NA NA
FT09 NA NA NA NA NA NA NA
FT10 NA NA NA NA NA NA 68.55487
This way I can extract the integrated signals, but I have no idea what FTXX stands for.
If I use the precursorMz() and productMz() functions I see that I have 8 SRM transitions in my dataset. Why do I have 10 features in the results? I tried to use featureDefinitions():
> featureDefinitions(peaks)
DataFrame with 10 rows and 15 columns
mzmed mzmin mzmax rtmed rtmin
<numeric> <numeric> <numeric> <numeric> <numeric>
FT01 NA NA NA 1.64795005321503 1.62013328075409
FT02 NA NA NA 1.66193330287933 1.64795005321503
FT03 NA NA NA 4.00483322143555 3.76771664619446
FT04 NA NA NA 11.7689828872681 11.7496662139893
FT05 NA NA NA 9.60551643371582 9.58619976043701
FT06 NA NA NA 10.1463832855225 10.1463832855225
FT07 NA NA NA 9.60551643371582 9.58619976043701
FT08 NA NA NA 10.1463832855225 10.1463832855225
FT09 NA NA NA 10.1463832855225 10.1270666122437
FT10 NA NA NA 10.1270666122437 10.1270666122437
but the function returns no m/z values.
How can I interpret the results?
Actually, you're the first user of this functionality! I've never analyzed MRM data (or had any MRM files available for testing). The FTXX is just an arbitrary feature identifier. The whole functionality works in a similar way as if you had LC-MS data: it first performs chromatographic peak detection separately for each chromatogram (MRM) and then uses the chromPeaks matrix to group peaks across samples. I could imagine that you have more features than MRM because maybe in some of the chromatograms more than one peak was identified?
Would it be possible for you to share some files with me so that I could look into what's happening?
I could imagine that you have more features than MRM because maybe in some of the chromatograms more than one peak was identified?
Yes, it seems that I have two interfering ions...
would it be possible for you to share some files with me so that I could look into what's happening?
Yes of course, how can we share? If you give me your email address I will share them via gdrive.
Thanks for the data! To get the information about the transitions for the individual features you can do the following (variable peaks is your Chromatograms object after peak detection and correspondence analysis):
fdev <- featureDefinitions(peaks)
fdev <- fdev[, colnames(fdev) != "peakidx"]
fdev
DataFrame with 10 rows and 14 columns
mzmed mzmin mzmax rtmed rtmin
<numeric> <numeric> <numeric> <numeric> <numeric>
FT01 NA NA NA 1.64795005321503 1.62013328075409
FT02 NA NA NA 1.66193330287933 1.64795005321503
... ... ... ... ... ...
FT09 NA NA NA 10.1463832855225 10.1270666122437
FT10 NA NA NA 10.1270666122437 10.1270666122437
rtmax npeaks P0 P1 P2 P3
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
FT01 1.67591667175293 15 0 3 3 3
FT02 1.68990004062653 12 0 0 3 3
... ... ... ... ... ... ...
FT09 10.1463832855225 7 0 0 1 3
FT10 10.1463832855225 12 0 0 3 3
P4 P5 row
<numeric> <numeric> <integer>
FT01 3 3 1
FT02 3 3 2
... ... ... ...
FT09 0 3 7
FT10 3 3 8
In the featureDefinitions there is a column "row" that tells you in which of the rows (transitions) the feature was defined. You can add the actual precursor and product m/z with:
## Add the precursorMz and productMz to the annotation.
fdev$precursorMz <- rowMeans(precursorMz(peaks))[fdev$row]
fdev$productMz <- rowMeans(productMz(peaks))[fdev$row]
And to get the feature intensities:
fvals <- featureValues(peaks, value = "into")
Each row in fdev provides now the feature annotations for the corresponding row in fvals.
Hope it is a little clearer now. Let me know if not.
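The row-based lookup above can be tried out without any MRM files. The following is only a minimal base R sketch: the m/z values, retention times and feature table are made up for illustration, and plain matrices stand in for what precursorMz(peaks) and productMz(peaks) would return (one row per transition, one column per sample).

```r
# Hypothetical precursor and product m/z: 4 transitions (rows) x 3 samples (columns).
precursor <- matrix(rep(c(823.5, 823.5, 760.6, 496.3), times = 3), nrow = 4)
product   <- matrix(rep(c(184.1, 86.1, 184.1, 184.1), times = 3), nrow = 4)

# Toy feature table: FT01 and FT02 were both detected in transition (row) 1,
# e.g. because two peaks were found in the same chromatogram.
fdev <- data.frame(rtmed = c(1.65, 1.66, 4.00, 11.77),
                   row = c(1L, 1L, 2L, 4L),
                   row.names = c("FT01", "FT02", "FT03", "FT04"))

# Same indexing idea as in the thread: look up each feature's transition by row.
fdev$precursorMz <- rowMeans(precursor)[fdev$row]
fdev$productMz   <- rowMeans(product)[fdev$row]

# Count features per transition; a count above 1 means several peaks in one MRM.
features_per_transition <- table(factor(fdev$row, levels = seq_len(nrow(precursor))))

print(fdev)
print(features_per_transition)
```

Counting features per row like this is one way to see why there can be more features than transitions: a transition with two detected peaks contributes two features, and a transition without any peak contributes none.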
Hi @rromoli , if Johannes' suggestion works for you, it would be great if you could turn that into an MRM vignette. For that we'll need representative data (but could also be measurements of QC samples, no science required), and the script plus some explanations with it. Would that make sense ? Yours, Steffen
Ok @sneumann I will try to write a vignette about the use of xcms with MRM data!
Hi @rromoli, may I know if you have solved the MRM data import issue by now?
Hi @jorainer, I wonder if there are MRM data processing functions inside xcms now?
There's nothing specifically for MRM data, except that you can read the data as a MChromatograms object and then perform chromatographic peak detection in each chromatogram (using the findChromPeaks function), you can also perform a correspondence analysis (using groupChromPeaks). In addition there is functionality to filter, plot and subset the chromatographic data.
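Put together, the steps named above could look roughly like the sketch below. It is not an official xcms workflow: process_mrm is a hypothetical helper name, the default peakwidth and bw values are placeholders that assume retention times in minutes, and all samples are assumed to belong to a single sample group. It only uses functions mentioned in this thread (readSRMData, findChromPeaks, groupChromPeaks, featureDefinitions, featureValues) plus the CentWaveParam and PeakDensityParam parameter classes.

```r
# Hypothetical helper wrapping the steps from this thread; nothing is run
# until the function is called with real mzML file names.
process_mrm <- function(files, peakwidth = c(0.017, 0.17), bw = 0.5) {
  stopifnot(requireNamespace("xcms", quietly = TRUE),
            requireNamespace("MSnbase", quietly = TRUE))

  # Import the SRM/MRM chromatograms (rows = transitions, columns = files).
  mrm <- MSnbase::readSRMData(files)

  # Chromatographic peak detection in each chromatogram.
  # ASSUMPTION: peakwidth is given in the same unit as rtime(mrm).
  cwp <- xcms::CentWaveParam(peakwidth = peakwidth)
  xchrs <- xcms::findChromPeaks(mrm, param = cwp)

  # Correspondence: group peaks across samples within each transition.
  # ASSUMPTION: all samples are treated as one sample group.
  pdp <- xcms::PeakDensityParam(sampleGroups = rep(1L, ncol(xchrs)), bw = bw)
  xchrs <- xcms::groupChromPeaks(xchrs, param = pdp)

  list(definitions = xcms::featureDefinitions(xchrs),
       values = xcms::featureValues(xchrs, value = "into"))
}

# Example call (file names are placeholders):
# res <- process_mrm(list.files("quattroMicro/", pattern = ".mzML$", full.names = TRUE))
```

The parameter values almost certainly need tuning per assay, so it is worth plotting a few chromatograms with the detected peaks before trusting the feature table.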
@jorainer @rromoli @sneumann hi, I am trying to export MRM data from a Waters QQQ. However, the mzML file converted by msconvert cannot be read by readSRMData(). Is there any way to work around this problem? Please find my error message below.
Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Missing binary data type.
Best regards, Junjie
That is actually a problem with mzR and more recent versions of proteowizard. Maybe try with the suggestions from this issue https://github.com/lgatto/MSnbase/issues/551 . In the longer run we hope to manage updating mzR to include a newer version of proteowizard, but at present the workaround is to skip some data in the msconvert conversion to mzML files.
Hi, thanks a lot for your reply. I tried the command line, but it is not working. Please see the command line and the R output below.
D:\proteowizard>msconvert test.RAW --chromatogramFilter "index [2,]"
format: mzML
    m/z: Compression-None, 64-bit
    intensity: Compression-None, 32-bit
    rt: Compression-None, 64-bit
    ByteOrder_LittleEndian
    indexed="true"
outputPath: .
extension: .mzML
contactFilename:
runIndexSet:
spectrum list filters:
chromatogram list filters:
    index [2,]
filenames:
    test.raw
processing file: test.raw
calculating source file checksums
writing output file: .\test.mzML
mrm <- readSRMData(fls2)
Error: Can not open file D:\zhengjie_project\MRM_pipline\test.mzML! Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Unknown binary data type.
mrm_cmd <- readMSData(fls2)
Error: Can not open file D:\zhengjie_project\MRM_pipline\test.mzML! Original error was: Error in pwizModule$open(filename): [IO::HandlerBinaryDataArray] Unknown binary data type.
And this is my converted .mzML file: test.zip
Regards, Junjie
Seems that the converted file only contains a single chromatogram entry - which is the TIC (with the "non-standard data array" in it) - I guess the original file contains more chromatograms?
We'll try to update the mzR package to include the new proteowizard code base - that should solve all problems but I can not guarantee when it will be available.
@jorainer yes. The original ".raw" file contains several MRM transitions. Please find the txt file converted by msconvert. I can find the chromatograms inside the text, so I am wondering if we could find a workaround using this text format.
The developmental mzR version with an updated proteowizard code is available. With this version it should be possible to read the mzML files. It might take some time until this version becomes "stable" because we had to remove the ramp backend and hence mzData support. To install:
BiocManager::install("sneumann/mzR", ref = "feature/updatePwiz_3_0_21263")
Noted with thanks!
@jorainer hi, I am also curious about how to align peaks for an MChromatograms object. My MRM data was imported with readSRMData, which returns an MChromatograms object, so I was not able to align my data.
There is no alignment method available for chromatographic data like the one we have for XCMSnExp (i.e. spectra data). What is available is the findChromPeaks method to identify chromatographic peaks and the groupChromPeaks method to group chromatographic peaks across samples (have a look at the XChromatograms help for more details: ?XChromatograms).
The only alignment method which is available for MChromatograms is alignRt which allows to align an MChromatograms (i.e. chromatographic data across multiple samples) against a single Chromatogram object. But I'm not sure if that's what you're looking for.
Hi Jorainer, I found some issues after peak picking on the chromatogram object of MRM data read by readSRMData.
- After the alignment function alignRt, peaks(y) ended up as a copy of the example chromatogram (x).
- findChromPeaks with MatchedFilterParam was not able to detect peaks correctly on my data and failed to pick up two peaks in one chromatogram object.
Please find my example data here: E4-1.zip
Could you please add here also the R code you used to perform this analysis. Without that it's impossible to replicate and find out what your problems might be.
Hi, thanks a lot for your reply! Please kindly find the attached code herein:
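library(xcms)
## Import the SRM data and select the first transition.
std <- "E4-1.mzML"
std1 <- readSRMData(std)
chr1 <- std1[1, ]
## Peak detection with matchedFilter.
mfp <- MatchedFilterParam(binSize = 0.1, snthresh = 0)
xchr1 <- findChromPeaks(chr1, mfp)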
I found a way to pick up small side peaks by adjusting the fwhm value (from 0~5) for my first question. I am still trying to find a good way to solve the second question.
Since the peaks are quite different (the first one broader the second quite narrow) I would suggest to use centWave instead of matchedFilter:
cwp <- CentWaveParam(peakwidth = c(1, 4))
tmp <- findChromPeaks(chr1, param = cwp)
plot(tmp)
this identifies both peaks:
Thanks a lot!!
Hi @jorainer , regarding my first question about the retention time correction. May I know if there is any way to get a modified function for retention time correction for mrm data?
At present we don't have a dedicated function to do a retention time alignment on MRM data (similar to what is available for spectra-based LC-MS data). For chromatograms with a single peak it should in theory also suffice to use a rather large bw parameter in groupChromPeaks with PeakDensityParam which will then also group chromatographic peaks into the same feature even if their retention times are different.
We might implement some functionality, but at present we unfortunately don't have the capacity/manpower to do that. What would however help later is to get hands on example MRM data files with peaks that need to be aligned...
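To illustrate the idea behind a large bw, here is a small base R sketch. It is not the xcms implementation: it assumes that bw acts as the standard deviation of a Gaussian kernel smoothing the peak retention times, that each density maximum would become one feature, and it uses made-up apex retention times in seconds. kernel_density and count_modes are hypothetical helper names.

```r
# Hypothetical apex retention times (seconds) of the same compound in four
# samples, drifting by a few seconds from run to run.
rts <- c(100, 104, 109, 113)

# Sum of Gaussian kernels with standard deviation `bw`, evaluated at each x.
kernel_density <- function(x, centers, bw) {
  vapply(x, function(xi) sum(dnorm(xi, mean = centers, sd = bw)), numeric(1))
}

# Count local maxima: every switch from rising to falling is one mode.
count_modes <- function(y) {
  s <- sign(diff(y))
  s <- s[s != 0]      # ignore flat steps
  sum(diff(s) < 0)
}

grid <- seq(80, 135, by = 0.1)
modes_narrow <- count_modes(kernel_density(grid, rts, bw = 1))
modes_wide   <- count_modes(kernel_density(grid, rts, bw = 10))

cat("bw = 1: ", modes_narrow, "density maxima\n")
cat("bw = 10:", modes_wide, "density maxima\n")
```

With the narrow bandwidth each sample's peak forms its own maximum, while the wide bandwidth merges all four into one, which is the behaviour described above for chromatograms with a single peak. The bandwidth has to be in the same unit as the retention times, and with several peaks per chromatogram a large value may merge distinct compounds.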
My chromatograms come with multiple peaks. I wish to make an alignment across samples before I group any peaks and continue with the downstream analysis. Currently, I try to find a workaround for this issue. Thanks for your help too!
Hi all,
I want to share my experience with SRM data and xcms:
An assay on a QqQ creates SRM data with 30 transitions. Two of them detect two isobaric, closely eluting compounds. The attached ZIP file contains an RDS file of those two transitions as MChromatograms.
If I plot this I get:
graph.pdf
Then I do
xdata<-findChromPeaks(srm_selected[8,6], param = cwp)
and
chromPeaks(xdata)
            rt     rtmin     rtmax     into      intb       maxo   sn
[1,]  7.302067  6.424583  8.231167  126.534   31.3895   207.1527   32
[2,] 14.941333 13.547683 15.818817 3968.052 3849.5452 14944.2097 5002
shows that the two large peaks have been detected as one wide peak at 14.94 min.
Peak detection with MatchedFilterParam shows the same behaviour. I've tried various settings but cannot find any for either method that would detect the two peaks individually.
Now if I use do_findPeaks_MSW I get both peaks as individuals:
int <- intensity(srm[2,])
rt <- rtime(srm[2,])
do_findPeaks_MSW(rt, int, snthresh = 1, scales = 1:10)
           mz    mzmin    mzmax rt rtmin rtmax     into     maxo       sn intf     maxf
[1,] 14.27032 14.11547 14.37355 -1    -1    -1 28765.37 12087.74 35.86637   NA 12216.63
[2,] 14.94133 14.78648 15.04457 -1    -1    -1 40612.61 14944.21 49.98661   NA 17026.19
Peak apex and boundaries are well enough defined.
I am wondering now: MSW and centWave both use the MassSpecWavelet functionality. Why do they deliver such different results? Using findChromPeaks with centWave would be so much more comfortable on MChromatograms, but I think I can make do with do_findPeaks_MSW.
Cheers Andreas
Forget what I wrote above. Reading through some other issues I realized that my SRM data is loaded with rtime in minutes. Thus peakwidth(cwp)<-c(1,10) is way too large. With peakwidth(cwp)<-c(0.017,0.17) I do get individual peak detection.
My bad :-)
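For anyone hitting the same unit mismatch, the conversion is just a division by 60. A minimal sketch in base R, assuming the peak widths were originally chosen in seconds and that rtime() of the imported data is in minutes:

```r
# Expected minimum and maximum peak width, as one would set them in seconds.
peakwidth_sec <- c(1, 10)

# The same widths expressed in minutes, for data whose rtime() is in minutes.
peakwidth_min <- peakwidth_sec / 60
print(round(peakwidth_min, 3))

# With xcms loaded and a CentWaveParam object `cwp`, this would then be applied as:
# peakwidth(cwp) <- peakwidth_min
```

This gives about 0.017 and 0.167 minutes, in line with the c(0.017, 0.17) reported above. The same reasoning would apply to other retention-time based settings such as fwhm for matchedFilter or bw for peak grouping.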