sesame
Probe missing from Manifest of HM450 and EPIC
Hi Wanding, thank you for developing this package as an alternative to minfi.
I encountered an error while trying to annotate methylation data.
When analyzing an HM450 human methylation dataset, the following error occurs:
Error: subscript contains invalid names
After a manual binary search, one of the problematic probes is "cg01238044"
> sesameData::sesameData_annoProbes("cg01238044",platform = "HM450")
Error: subscript contains invalid names
As it turns out, this probe is missing from both the EPIC and HM450 manifests:
> sesameData_getManifestGRanges("EPIC")["cg01238044"]
Error: subscript contains invalid names
> sesameData_getManifestGRanges("HM450")["cg01238044"]
Error: subscript contains invalid names
The sessionInfo is as follows:
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sesame_1.14.2 sesameData_1.14.0 ExperimentHub_2.4.0
[4] AnnotationHub_3.4.0 BiocFileCache_2.4.0 dbplyr_2.2.1
[7] BiocGenerics_0.42.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.62.0
[3] bit64_4.0.5 filelock_1.0.2
[5] RColorBrewer_1.1-3 httr_1.4.3
[7] GenomeInfoDb_1.32.3 tools_4.2.0
[9] utf8_1.2.2 R6_2.5.1
[11] DBI_1.1.3 colorspace_2.0-3
[13] tidyselect_1.1.2 base64_2.0
[15] bit_4.0.4 curl_4.3.2
[17] compiler_4.2.0 preprocessCore_1.58.0
[19] cli_3.3.0 Biobase_2.56.0
[21] DelayedArray_0.22.0 scales_1.2.0
[23] readr_2.1.2 askpass_1.1
[25] rappdirs_0.3.3 stringr_1.4.0
[27] digest_0.6.29 illuminaio_0.38.0
[29] XVector_0.36.0 pkgconfig_2.0.3
[31] htmltools_0.5.3 MatrixGenerics_1.8.1
[33] fastmap_1.1.0 rlang_1.0.2
[35] RSQLite_2.2.15 shiny_1.7.1
[37] generics_0.1.3 wheatmap_0.2.0
[39] BiocParallel_1.30.3 dplyr_1.0.9
[41] RCurl_1.98-1.7 magrittr_2.0.3
[43] GenomeInfoDbData_1.2.8 Matrix_1.4-1
[45] Rcpp_1.0.9 munsell_0.5.0
[47] S4Vectors_0.34.0 fansi_1.0.3
[49] lifecycle_1.0.1 stringi_1.7.6
[51] yaml_2.3.5 MASS_7.3-56
[53] SummarizedExperiment_1.26.1 zlibbioc_1.42.0
[55] plyr_1.8.7 grid_4.2.0
[57] blob_1.2.3 parallel_4.2.0
[59] promises_1.2.0.1 crayon_1.5.1
[61] lattice_0.20-45 Biostrings_2.64.0
[63] hms_1.1.1 KEGGREST_1.36.3
[65] pillar_1.7.0 GenomicRanges_1.48.0
[67] reshape2_1.4.4 codetools_0.2-18
[69] stats4_4.2.0 glue_1.6.2
[71] BiocVersion_3.15.2 BiocManager_1.30.18
[73] png_0.1-7 vctrs_0.4.1
[75] tzdb_0.3.0 httpuv_1.6.5
[77] gtable_0.3.0 openssl_2.0.2
[79] purrr_0.3.4 assertthat_0.2.1
[81] cachem_1.0.6 ggplot2_3.3.6
[83] mime_0.12 xtable_1.8-4
[85] later_1.3.0 tibble_3.1.7
[87] AnnotationDbi_1.58.0 memoise_2.0.1
[89] IRanges_2.30.0 ellipsis_0.3.2
[91] interactiveDisplayBase_1.34.0
Thanks for reporting. This is due to the probe being mapped to a decoy contig in hg38. We will need to update the manifest so we don't lose any probes in the manifest GRanges. For now, I added a check (v1.15.1) to exclude such probes from the annotation instead of causing the code to abort.
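For anyone on a version before that check, a possible stopgap is to find the probes that are absent from the manifest up front, rather than by manual binary search, and pass only the remaining ones on for annotation. The following is a minimal sketch in base R, not the fix that went into v1.15.1. The helper name `split_by_manifest` and the toy probe IDs (other than cg01238044) are made up for illustration, and it assumes the manifest GRanges is named by probe ID, as the `["cg01238044"]` subsetting above suggests.

```r
# Hypothetical helper (not part of sesame or sesameData): split probe IDs
# into those present in a set of manifest IDs and those missing from it.
split_by_manifest <- function(probe_ids, manifest_ids) {
  present <- probe_ids %in% manifest_ids
  list(found = probe_ids[present], missing = probe_ids[!present])
}

# Toy stand-in for names(sesameData_getManifestGRanges("HM450")).
# These manifest IDs are invented for illustration only.
toy_manifest_ids <- c("cg00000001", "cg00000002", "cg00000003")
probes <- c("cg00000001", "cg01238044", "cg00000003")

res <- split_by_manifest(probes, toy_manifest_ids)
print(res$missing)  # probes absent from the (toy) manifest
print(res$found)    # probes present in the (toy) manifest

# With the real manifest (needs sesameData and its cached data), the same
# helper could be used like this:
# mft <- sesameData::sesameData_getManifestGRanges("HM450")
# res <- split_by_manifest(probes, names(mft))
# sesameData::sesameData_annoProbes(res$found, platform = "HM450")
```

Keeping the missing IDs in the result, instead of silently dropping them, makes it easy to report which probes were excluded from the annotation.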