peakHeatmap split.default error
When I run peakHeatmap, I am getting the following error:
peakHeatmap(MYB3sortGR, TxDb = txdb, upstream = 500, downstream = 500, ignore_strand = TRUE)
preparing promoter regions... 2024-01-13 02:13:09 AM
preparing tag matrix... 2024-01-13 02:13:09 AM
preparing start_site regions by gene... 2024-01-13 02:13:09 AM
preparing tag matrix... 2024-01-13 02:13:09 AM
Error in split.default(1:length(windows), as.factor(seqnames(windows))) :
group length is 0 but data length > 0
In addition: Warning message:
In .merge_two_Seqinfo_objects(x, y) :
The 2 combined objects have no sequence levels in common. (Use
suppressWarnings() to suppress this warning.)
The GRanges object, MYB3sortGR, looks like this:
object with 6213 ranges and 2 metadata columns:
seqnames ranges strand | V4 V5
<Rle> <IRanges> <Rle> | <character> <integer>
1 chr1 20174-21639 * | RepAll_peak_7 292
2 chr1 37751-38229 * | RepAll_peak_10 294
3 chr1 54918-55737 * | RepAll_peak_20 212
4 chr1 56447-61014 * | RepAll_peak_21 185
5 chr1 67017-68917 * | RepAll_peak_25 405
... ... ... ... . ... ...
6209 chr5 26938521-26939493 * | RepAll_peak_32796 475
6210 chr5 26949788-26950604 * | RepAll_peak_32801 316
6211 chr5 26957657-26958139 * | RepAll_peak_32803 165
6212 chr5 26959763-26960842 * | RepAll_peak_32804 196
6213 chr5 26969279-26969874 * | RepAll_peak_32809 203
-------
seqinfo: 5 sequences from an unspecified genome; no seqlengths
txdb looks like this:
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: BioMart
# Organism: Arabidopsis thaliana
# Taxonomy ID: 3702
# Resource URL: plants.ensembl.org:80
# BioMart database: plants_mart
# BioMart database version: Ensembl Plants Genes 51
# BioMart dataset: athaliana_eg_gene
# BioMart dataset description: Arabidopsis thaliana genes (TAIR10)
# BioMart dataset version: TAIR10
# Full dataset: yes
# miRBase build ID: TAIR10
# Nb of transcripts: 54013
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2021-05-23 17:17:33 +0200 (Sun, 23 May 2021)
# GenomicFeatures version at creation time: 1.45.0
# RSQLite version at creation time: 2.2.7
# DBSCHEMAVERSION: 1.2
The code works fine when I use tagMatrix. I get a heatmap.
promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)
tagMatrix <- getTagMatrix(MYB3GR, windows = promoter)
peakHeatmap(MYB3GR, TxDb = txdb, upstream = 1000, downstream = 1000 )
The promoter GRanges object looks like this:
GRanges object with 32823 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 2631-4631 +
[2] chr1 8130-10130 -
[3] chr1 12714-14714 -
[4] chr1 22121-24121 +
[5] chr1 27500-29500 +
... ... ... ...
[32819] chrM 180268-182268 +
[32820] chrM 191025-193025 -
[32821] chrM 332725-334725 -
[32822] chrM 346266-348266 +
[32823] chrM 358739-360739 -
-------
seqinfo: 7 sequences from an unspecified genome; no seqlengths
My MYB3sortGR was made from a bed file downloaded from GEO. It was created by:
MYB3sort <- read.delim(file = "GSE80564_AT1G22640_MYB3_ABA_optimal_narrowPeak_p16_ChipSeek_sort.bed", sep = "\t", header = FALSE)
MYB3sortGR <- toGRanges(MYB3sort)
The top of the bed file looks like this:
chr5 7014051 7015074 RepAll_peak_27510 883 chr5 19837638 19838886 RepAll_peak_30304 862 chr2 7258604 7262109 RepAll_peak_9773 854 chr1 5590500 5591353 RepAll_peak_1885 836
MYB3sortGR does not have the leftmost column numbers in brackets, so I though maybe there is something wrong with the GRanges object, so I then added column names to my .bed file. The first few lines of the .bed file would now look like this:
chr | start | end | peak | score | chr1 | 20174 | 21639 | RepAll_peak_7 | 292 | chr1 | 37751 | 38229 | RepAll_peak_10 | 294 |
and imported the bed file into R using read.delim() and header = TRUE, to make the dataframe, MYB3sortCN.
From the dataframe, I made a GRanges object:
makeGRangesFromDataFrame(MYB3sortCN)
MYB3sortCN
GRanges object with 6213 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 20174-21639 *
[2] chr1 37751-38229 *
[3] chr1 54918-55737 *
[4] chr1 56447-61014 *
[5] chr1 67017-68917 *
... ... ... ...
[6209] chr5 26938521-26939493 *
[6210] chr5 26949788-26950604 *
[6211] chr5 26957657-26958139 *
[6212] chr5 26959763-26960842 *
[6213] chr5 26969279-26969874 *
-------
seqinfo: 5 sequences from an unspecified genome; no seqlengths
Now, the leftmost column numbers are in brackets as is for the promoters GR object. However, when I run peakHeatmap using this new MYB3sortCN GRanges object, I still get an error, albeit I different error.
peakHeatmap(MYB3sortCN, TxDb = txdb, upstream = 1000, downstream = 1000 )
preparing promoter regions... 2024-01-13 05:04:37 PM preparing tag matrix... 2024-01-13 05:04:37 PM preparing start_site regions by gene... 2024-01-13 05:04:38 PM preparing tag matrix... 2024-01-13 05:04:38 PM Error in file.exists(peak) : invalid 'file' argument
My sessioninfo:
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS (fossa-ditto X84)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] clusterProfiler_4.8.3 regioneR_1.32.0
[3] BiocManager_1.30.22 ChIPseeker_1.39.0
[5] TxDb.Athaliana.BioMart.plantsmart51_0.99.0 GenomicFeatures_1.54.1
[7] AnnotationDbi_1.64.1 Biobase_2.62.0
[9] GenomicRanges_1.54.1 GenomeInfoDb_1.38.5
[11] IRanges_2.36.0 S4Vectors_0.40.2
[13] BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 jsonlite_1.8.8
[3] rstudioapi_0.15.0 magrittr_2.0.3
[5] farver_2.1.1 fs_1.6.3
[7] BiocIO_1.12.0 zlibbioc_1.48.0
[9] vctrs_0.6.5 memoise_2.0.1
[11] Rsamtools_2.18.0 RCurl_1.98-1.14
[13] ggtree_3.10.0 S4Arrays_1.2.0
[15] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 progress_1.2.3
[17] plotrix_3.8-4 curl_5.2.0
[19] SparseArray_1.2.3 gridGraphics_0.5-1
[21] KernSmooth_2.23-22 plyr_1.8.9
[23] cachem_1.0.8 GenomicAlignments_1.38.1
[25] igraph_1.6.0 lifecycle_1.0.4
[27] pkgconfig_2.0.3 gson_0.1.0
[29] Matrix_1.6-3 R6_2.5.1
[31] fastmap_1.1.1 GenomeInfoDbData_1.2.11
[33] MatrixGenerics_1.14.0 digest_0.6.34
[35] aplot_0.2.2 enrichplot_1.20.3
[37] colorspace_2.1-0 patchwork_1.2.0
[39] RSQLite_2.3.4 labeling_0.4.3
[41] filelock_1.0.3 fansi_1.0.6
[43] httr_1.4.7 polyclip_1.10-6
[45] abind_1.4-5 compiler_4.3.2
[47] downloader_0.4 bit64_4.0.5
[49] withr_2.5.2 BiocParallel_1.36.0
[51] viridis_0.6.4 DBI_1.2.0
[53] gplots_3.1.3 ggforce_0.4.1
[55] biomaRt_2.58.0 MASS_7.3-60
[57] rappdirs_0.3.3 DelayedArray_0.28.0
[59] rjson_0.2.21 HDO.db_0.99.1
[61] caTools_1.18.2 gtools_3.9.5
[63] tools_4.3.2 scatterpie_0.2.1
[65] ape_5.7-1 glue_1.7.0
[67] restfulr_0.0.15 nlme_3.1-163
[69] GOSemSim_2.28.0 shadowtext_0.1.2
[71] grid_4.3.2 reshape2_1.4.4
[73] fgsea_1.28.0 generics_0.1.3
[75] BSgenome_1.68.0 gtable_0.3.4
[77] tidyr_1.3.0 data.table_1.14.10
[79] hms_1.1.3 tidygraph_1.3.0
[81] xml2_1.3.6 utf8_1.2.4
[83] XVector_0.42.0 ggrepel_0.9.5
[85] pillar_1.9.0 stringr_1.5.1
[87] yulab.utils_0.1.3 splines_4.3.2
[89] dplyr_1.1.4 tweenr_2.0.2
[91] treeio_1.26.0 BiocFileCache_2.10.1
[93] lattice_0.22-5 rtracklayer_1.62.0
[95] bit_4.0.5 tidyselect_1.2.0
[97] GO.db_3.18.0 Biostrings_2.70.1
[99] gridExtra_2.3 SummarizedExperiment_1.32.0
[101] graphlayouts_1.0.2 matrixStats_1.2.0
[103] stringi_1.8.3 lazyeval_0.2.2
[105] ggfun_0.1.3 yaml_2.3.8
[107] boot_1.3-28.1 codetools_0.2-19
[109] ggraph_2.1.0 tibble_3.2.1
[111] qvalue_2.34.0 ggplotify_0.1.2
[113] cli_3.6.2 munsell_0.5.0
[115] Rcpp_1.0.12 dbplyr_2.4.0
[117] png_0.1-8 XML_3.99-0.16
[119] parallel_4.3.2 ggplot2_3.4.4
[121] blob_1.2.4 prettyunits_1.2.0
[123] DOSE_3.28.2 bitops_1.0-7
[125] tidytree_0.4.6 viridisLite_0.4.2
[127] scales_1.3.0 purrr_1.0.2
[129] crayon_1.5.2 rlang_1.1.3
[131] cowplot_1.1.2 fastmatch_1.1-4
[133] KEGGREST_1.42.0
Any idea of what my problem is ?