readMSData generates scan error on Mac but not Windows

Open IloveAphid opened this issue 2 years ago • 8 comments

I am trying to run xcms on a Mac, but I am stuck at the first step, importing my data. The error does not occur with the example data faahKO, and the same code runs without problems on a Windows system. The error message is shown below:

Error in object@backend$getScanHeaderInfo(scans) : [SpectrumList_mzXML::spectrum()] Error seeking to .

The code I am using:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("faahKO")
BiocManager::install("pheatmap")

library(xcms)
library(faahKO)
library(RColorBrewer)
library(pander)
library(magrittr)
library(pheatmap)
library(SummarizedExperiment)

# specify raw data folders and working directory

path_to_raw_data <- "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/" ### For your own raw data files, specify the folder
all_raw_files <- list.files(path_to_raw_data, recursive=T, full.names=T)
setwd("/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/") # set up the working directory

# get file path and names

pd <- data.frame(sample_name=sub(basename(all_raw_files), ## Create a phenodata data.frame
                                 pattern = ".mzXML",
                                 replacement = "", fixed = TRUE),
                 sample_group = c(rep("Ctrl-Flower", 8), rep("JA-Flower", 8)),
                 stringsAsFactors = FALSE)

raw_data <- readMSData(files=all_raw_files, # read raw data
                       pdata=new("NAnnotatedDataFrame", pd),
                       mode="onDisk")

The sessionInfo():

R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] magrittr_2.0.3 xcms_3.17.6 MSnbase_2.21.6 ProtGenerics_1.27.2 S4Vectors_0.32.4 mzR_2.29.5
[7] Rcpp_1.0.8.3 Biobase_2.54.0 BiocGenerics_0.40.0 BiocParallel_1.28.3

loaded via a namespace (and not attached):
[1] lattice_0.20-45 digest_0.6.29 foreach_1.5.2 utf8_1.2.2
[5] R6_2.5.1 GenomeInfoDb_1.30.1 plyr_1.8.7 mzID_1.32.0
[9] ggplot2_3.3.5 pillar_1.7.0 zlibbioc_1.40.0 rlang_1.0.2
[13] Matrix_1.4-1 preprocessCore_1.56.0 RCurl_1.98-1.6 munsell_0.5.0
[17] DelayedArray_0.20.0 compiler_4.1.3 MsFeatures_1.2.0 pkgconfig_2.0.3
[21] pcaMethods_1.86.0 SummarizedExperiment_1.24.0 tibble_3.1.6 GenomeInfoDbData_1.2.7
[25] RANN_2.6.1 IRanges_2.28.0 codetools_0.2-18 matrixStats_0.62.0
[29] XML_3.99-0.9 fansi_1.0.3 crayon_1.5.1 MASS_7.3-56
[33] bitops_1.0-7 MassSpecWavelet_1.60.1 grid_4.1.3 gtable_0.3.0
[37] lifecycle_1.0.1 affy_1.72.0 MsCoreUtils_1.6.2 scales_1.2.0
[41] ncdf4_1.19 cli_3.2.0 impute_1.68.0 XVector_0.34.0
[45] affyio_1.64.0 doParallel_1.0.17 limma_3.50.3 robustbase_0.95-0
[49] ellipsis_0.3.2 vctrs_0.4.1 RColorBrewer_1.1-3 iterators_1.0.14
[53] tools_4.1.3 glue_1.6.2 DEoptimR_1.0-11 MatrixGenerics_1.6.0
[57] parallel_4.1.3 clue_0.3-60 colorspace_2.0-3 cluster_2.1.3
[61] BiocManager_1.30.16 vsn_3.62.0 GenomicRanges_1.46.1 MALDIquant_1.21

Thank you very much. I would greatly appreciate any suggestions or help you may have.

IloveAphid avatar Apr 20 '22 02:04 IloveAphid

Hi, which files do you get in all_raw_files? What does your pd look like? Yours, Steffen

sneumann avatar Apr 20 '22 07:04 sneumann

Hi Steffen,

all_raw_files contains 16 mzXML files: 8 are Ctrl and 8 are JA-treated samples:

 [1] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P1.mzXML"
 [2] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P2.mzXML"
 [3] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P3.mzXML"
 [4] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P4.mzXML"
 [5] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P5.mzXML"
 [6] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P6.mzXML"
 [7] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P7.mzXML"
 [8] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/Ctrl-Flower-P8.mzXML"
 [9] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P1.mzXML"
[10] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P2.mzXML"
[11] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P3.mzXML"
[12] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P4.mzXML"
[13] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P5.mzXML"
[14] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P6.mzXML"
[15] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P7.mzXML"
[16] "/Users/honglin/Desktop/LCMS_data/JAvsCtrl/Sample/Flower/JA-Flower-P8.mzXML"

Below is what pd looks like:

      sample_name sample_group
1  Ctrl-Flower-P1  Ctrl-Flower
2  Ctrl-Flower-P2  Ctrl-Flower
3  Ctrl-Flower-P3  Ctrl-Flower
4  Ctrl-Flower-P4  Ctrl-Flower
5  Ctrl-Flower-P5  Ctrl-Flower
6  Ctrl-Flower-P6  Ctrl-Flower
7  Ctrl-Flower-P7  Ctrl-Flower
8  Ctrl-Flower-P8  Ctrl-Flower
9    JA-Flower-P1    JA-Flower
10   JA-Flower-P2    JA-Flower
11   JA-Flower-P3    JA-Flower
12   JA-Flower-P4    JA-Flower
13   JA-Flower-P5    JA-Flower
14   JA-Flower-P6    JA-Flower
15   JA-Flower-P7    JA-Flower
16   JA-Flower-P8    JA-Flower

Thank you very much. Best, Honglin

IloveAphid avatar Apr 20 '22 12:04 IloveAphid

The error seems to come from ProteoWizard, so there might be an issue with one of your files.

Could you try the following:

for (f in all_raw_files) {
    message("reading file ", basename(f))
    tmp <- readMSData(f, mode = "onDisk")
}

At least that way you could find out which file is problematic and we could proceed from there...
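A variant of this loop that keeps going after a failure can be sketched as follows. It is only an illustration: the helper name `check_files` and its `reader` argument are made up for this example, and the default reader assumes that MSnbase is installed. It records the error message for each file, so every unreadable file is listed in a single pass.

```r
# Try each file separately and record the error message instead of stopping.
# `check_files` and `reader` are hypothetical names used for illustration.
# By default the reader is MSnbase::readMSData in on-disk mode; any function
# that takes a file path and signals an error on failure can be passed instead.
check_files <- function(files,
                        reader = function(f) MSnbase::readMSData(f, mode = "onDisk")) {
    vapply(files, function(f) {
        tryCatch({
            reader(f)
            NA_character_   # NA means the file was read without error
        }, error = function(e) conditionMessage(e))
    }, character(1))
}

# Possible usage with the objects from this thread:
# status <- check_files(all_raw_files)
# basename(names(status)[!is.na(status)])   # files that could not be read
```

The result is a named character vector, one entry per file, so the failing files and their messages can be inspected together.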

jorainer avatar Apr 20 '22 13:04 jorainer

Hi jorainer,

I have run the code you provided. It reads all of my Ctrl- files but none of the JA- files, which produce exactly the same error message as above. However, the JA- samples were run, and their raw files converted to mzXML, at the same time and with exactly the same settings. Note also that these files cause no problems when I run the same code on a Windows system. What could the ProteoWizard problem be?

Thank you very much. Best, Honglin

IloveAphid avatar Apr 20 '22 15:04 IloveAphid

Can you provide one of the problematic files? If needed, send it to me by private mail. Yours, Steffen

sneumann avatar Apr 20 '22 16:04 sneumann

Attachments available until May 20, 2022

Hi Steffen,

Please find attached a couple of my files: a good Ctrl- file and a problematic JA- file. Thank you very much for your help.

Best regards, Honglin

Attachments:
Ctrl-Flower-P1.mzXML.bz2 (126.3 MB)
JA-Flower-P1.mzXML.bz2 (2.5 MB)

IloveAphid avatar Apr 20 '22 16:04 IloveAphid

Hi Steffen,

I think there are problems with the JA- files. I compressed a couple of files with bzip2 to send to you using Mail Drop (not sure if you can receive them). While the Ctrl- files shrank from 180 MB to 126 MB, the JA- files shrank dramatically from ~180 MB to 2.5 MB. There must have been a problem during the transfer of those files from the Windows system, where they were generated, to my Mac.

I will transfer them again to see whether that solves the problem and will keep you updated. Thank you very much.
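One way to check whether a transfer damaged the files, as a minimal sketch: compute the size and MD5 checksum of each file on both machines and compare the two tables. `file_fingerprint` is a hypothetical helper name introduced here; `tools::md5sum()` and `file.size()` ship with R.

```r
# Size and MD5 checksum for every mzXML file below a folder.
# `file_fingerprint` is a hypothetical helper, not part of xcms or MSnbase.
file_fingerprint <- function(dir, pattern = "\\.mzXML$") {
    files <- list.files(dir, pattern = pattern, recursive = TRUE,
                        full.names = TRUE)
    data.frame(file = basename(files),
               size_bytes = file.size(files),
               md5 = unname(tools::md5sum(files)),
               stringsAsFactors = FALSE)
}

# Possible usage: run on the Windows machine and on the Mac, then compare.
# fp_mac <- file_fingerprint(path_to_raw_data)
# write.csv(fp_mac, "fingerprint_mac.csv", row.names = FALSE)
```

Identical copies have identical checksums, so any row whose `md5` differs between the two machines points to a file that changed during the transfer.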

Best regards, Honglin

IloveAphid avatar Apr 20 '22 16:04 IloveAphid

Hi, you can already see from the file size (2 MB vs. 150 MB) that something is different. The problematic file ends abruptly after 88 scans, resulting in malformed XML:

    <scan num="88"
          scanType="Full"
          centroided="1"
          msLevel="1"
          peaksCount="9076"
          polarity="+"
          retentionTime="PT57.1315S"
          lowMz="148.508603201958"
          highMz="1212.151691709216"
          basePeakMz="258.1100842"
          basePeakIntensity="2.8519922e07"
          totIonCurrent="1.4747586e08">
      <peaks compressionType="zlib"
             compressedLen="71298"
             precision="64"
             byteOrder="network"
             contentType="m/z-int">eJxM3Xc8VfH/B/CMKE0VImSkVNqlgXyuUaLSomVvZUdGyTUaFCkNkWwqGsgoe2aUUZmJa4VIQovw+/46vt/z8[...]

The good Ctrl file has 5257 scans, so something went wrong during the conversion. Yours, Steffen

P.S.: Not necessarily related to the issue, but you might want to switch over to the newer (since 2010 :-) ) mzML format. It encodes the same spectral information, but has better metadata capabilities.
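A truncated file like this can be spotted before calling readMSData. The following is a rough sketch, assuming that a complete mzXML file ends with the closing `</mzXML>` root tag; `ends_with_closing_tag` is a hypothetical helper name, and the check is only a heuristic, not a full XML validation.

```r
# Return TRUE if the last `n` bytes of `path` contain the closing root tag.
# Hypothetical helper: a quick truncation check, not a full XML validation.
ends_with_closing_tag <- function(path, tag = "</mzXML>", n = 200L) {
    size <- file.size(path)
    if (is.na(size) || size == 0) return(FALSE)
    con <- file(path, open = "rb")
    on.exit(close(con))
    seek(con, where = max(0, size - n))        # jump close to the end of the file
    tail_bytes <- readBin(con, what = "raw", n = n)
    tail_bytes <- tail_bytes[tail_bytes != as.raw(0)]   # rawToChar() rejects nul bytes
    grepl(tag, rawToChar(tail_bytes), fixed = TRUE, useBytes = TRUE)
}

# Possible usage with the file list from this thread:
# complete <- vapply(all_raw_files, ends_with_closing_tag, logical(1))
# basename(all_raw_files[!complete])   # candidates for truncated files
```

Only the tail of each file is read, so the check stays fast even for files of several hundred megabytes.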

sneumann avatar Apr 20 '22 18:04 sneumann