Something is wrong

Open dktanwar opened this issue 2 years ago • 0 comments

Hi @robertamezquita

Nice package. I have used it successfully in the past for mouse data; however, I am running into problems with the TAIR10 genome. Do you see any issues here?

Thank you.

Data

head(data)
       seqnames   start     end       name width
1        2 3241484 3242161 region_141   678
2        2 3264247 3266248 region_151  2002
3        2 3278221 3278761 region_156   541
4        2 3279239 3280592 region_157  1354
5        2 3292341 3293676 region_162  1336
6        2 3294158 3294957 region_163   800

Background

head(bg)
      seqnames   start     end     name width
1        1  640542  640966 region_1   425
2        1 7498331 7498558 region_2   228
3        1 8392125 8392403 region_3   279
4        1 8806819 8807899 region_4  1081
5        1 9135657 9136059 region_5   403
6        1 9137844 9137974 region_6   131
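
One check that might help narrow this down: the HOMER log further down reports 555 overlapping peaks between the target and background files and later reads 0 sequences, so it could be worth counting how many target regions overlap the background regions before calling `find_motifs_genome()`. The sketch below uses base R only. `count_overlapping()` is a hypothetical helper, not part of marge or HOMER, and the toy rows are copied from the `head()` output above because the full tables are not shown. Whether overlap is actually the cause here is an assumption, not a confirmed diagnosis.

```r
# Hypothetical helper (not part of marge or HOMER): count target regions
# that overlap at least one background region on the same chromosome.
count_overlapping <- function(target, background) {
  hits <- vapply(seq_len(nrow(target)), function(i) {
    same_chr <- background$seqnames == target$seqnames[i]
    any(same_chr &
          background$start <= target$end[i] &
          background$end >= target$start[i])
  }, logical(1))
  sum(hits)
}

# Toy rows copied from the head() output above; the full tables are not shown.
data_head <- data.frame(
  seqnames = c("2", "2"),
  start = c(3241484, 3264247),
  end = c(3242161, 3266248)
)
bg_head <- data.frame(
  seqnames = c("1", "1"),
  start = c(640542, 7498331),
  end = c(640966, 7498558)
)

# Number of target rows overlapping any background row
n_overlap <- count_overlapping(data_head, bg_head)
n_overlap
```

Run on the full `data` and `bg` tables, a count close to `nrow(data)` would suggest that the background is not independent of the target set.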

Running analysis

find_motifs_genome(x = data, path = "output/", genome = "tair10",
                   motif_length = c(6, 8, 10, 12), scan_size = 100, 
                   optimize_count = 8, background = bg, 
                   local_background = FALSE, only_known = FALSE, 
                   only_denovo = FALSE, fdr_num = 5, cores = 10, 
                   overwrite = TRUE, keep_minimal = FALSE)

Message

	Position file = /tmp/RtmpSTvVaO/target_19236370535e
	Genome = tair10
	Output Directory = output/CpG/
	Motif length set at 6,8,10,12,
	Fragment size set to 100
	Will optimize 8 putative motifs
	Using 10 CPUs
	Using 1000 MB for statistics cache
	Will randomize and repeat motif finding 5 times to estimate FDR
	background position file: /tmp/RtmpSTvVaO/background_19231d5f8434
	Found mset for "arabidopsis", will check against plants motifs
	Peak/BED file conversion summary:
		BED/Header formatted lines: 555
		peakfile formatted lines: 0

	Peak File Statistics:
		Total Peaks: 555
		Redundant Peak IDs: 0
		Peaks lacking information: 0 (need at least 5 columns per peak)
		Peaks with misformatted coordinates: 0 (should be integer)
		Peaks with misformatted strand: 0 (should be either +/- or 0/1)

	Peak file looks good!

	Peak/BED file conversion summary:
		BED/Header formatted lines: 555
		peakfile formatted lines: 0
	Max distance to merge: 100 bp
	Calculating co-bound peaks relative to reference: output/CpG//bg.clean.pos

	Comparing peaks: (peakfile, overlapping peaks, logRatio(obs/expected), logP)
	**	output/CpG//target.clean..pos	555	6.32	-5989.42

	**	Pairwise stats are approx with fixed distance (-d 100) and -cobound #
		They get worse as the size increases and peaks from a single file start overlapping
		To get accurate ones, adjust peak sizes first with adjustPeakFile.pl and
		then rerun mergePeaks with the "-d given" option (only applies to -cobound #)

	Co-bound by 0 peaks: 0
	Co-bound by 1 peaks: 555 (max: 555 effective total)
	Custom genome sequence directory: /home/rstudio/r_lib/homer/.//data/genomes/tair10//

	Extracting sequences from file: /home/rstudio/r_lib/homer/.//data/genomes/tair10///genome.fa
	Looking for peak sequences in a single file (/home/rstudio/r_lib/homer/.//data/genomes/tair10///genome.fa)
	Extracting 116 sequences from 1
	Extracting 141 sequences from 2
	Extracting 90 sequences from 3
	Extracting 61 sequences from 4
	Extracting 120 sequences from 5

	Not removing redundant sequences


	Sequences processed:
		Auto detected maximum sequence length of 101 bp
		528 total

	Frequency Bins: 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.6 0.7 0.8
	Freq	Bin	Count
	0.2	0	23
	0.25	1	23
	0.3	2	60
	0.35	3	86
	0.4	4	87
	0.45	5	125
	0.5	6	73
	0.6	7	46
	0.7	8	5
	Bin	# Targets	# Background	Background Weight
	Normalizing lower order oligos using homer2

	Reading input files...
	0 total sequences read
	Autonormalization: 1-mers (4 total)
		A	inf%	inf%	-nan
		C	inf%	inf%	-nan
		G	inf%	inf%	-nan
		T	inf%	inf%	-nan
	Autonormalization: 2-mers (16 total)
		AA	inf%	inf%	-nan
		CA	inf%	inf%	-nan
		GA	inf%	inf%	-nan
		TA	inf%	inf%	-nan
		AC	inf%	inf%	-nan
		CC	inf%	inf%	-nan
		GC	inf%	inf%	-nan
		TC	inf%	inf%	-nan
		AG	inf%	inf%	-nan
		CG	inf%	inf%	-nan
		GG	inf%	inf%	-nan
		TG	inf%	inf%	-nan
		AT	inf%	inf%	-nan
		CT	inf%	inf%	-nan
		GT	inf%	inf%	-nan
		TT	inf%	inf%	-nan
	Autonormalization: 3-mers (64 total)
	Normalization weights can be found in file: output/CpG//seq.autonorm.tsv
	Converging on autonormalization solution:
	...............................................................................
	Final normalization:	Autonormalization: 1-mers (4 total)
		A	inf%	inf%	-nan
		C	inf%	inf%	-nan
		G	inf%	inf%	-nan
		T	inf%	inf%	-nan
	Autonormalization: 2-mers (16 total)
		AA	inf%	inf%	-nan
		CA	inf%	inf%	-nan
		GA	inf%	inf%	-nan
		TA	inf%	inf%	-nan
		AC	inf%	inf%	-nan
		CC	inf%	inf%	-nan
		GC	inf%	inf%	-nan
		TC	inf%	inf%	-nan
		AG	inf%	inf%	-nan
		CG	inf%	inf%	-nan
		GG	inf%	inf%	-nan
		TG	inf%	inf%	-nan
		AT	inf%	inf%	-nan
		CT	inf%	inf%	-nan
		GT	inf%	inf%	-nan
		TT	inf%	inf%	-nan
	Autonormalization: 3-mers (64 total)
	Finished preparing sequence/group files

	----------------------------------------------------------
	Known motif enrichment

	Reading input files...
	0 total sequences read
	506 motifs loaded
	Cache length = 15811
	Using binomial scoring
	Checking enrichment of 506 motif(s)
	|0%                                    50%                                  100%|
	=================================================================================
Illegal division by zero at /home/rstudio/r_lib/homer//bin/findKnownMotifs.pl line 152.
	----------------------------------------------------------
	De novo motif finding (HOMER)

	Scanning input files...
!!! Something is wrong... are you sure you chose the right length for motif finding?
!!! i.e. also check your sequence file!!!
	Performing empirical FDR calculation for length 6 (n=5)
		1 of 5
		2 of 5
		3 of 5
		4 of 5
		5 of 5

	Scanning input files...
!!! Something is wrong... are you sure you chose the right length for motif finding?
!!! i.e. also check your sequence file!!!
	Performing empirical FDR calculation for length 8 (n=5)
		1 of 5
		2 of 5
		3 of 5
		4 of 5
		5 of 5

	Scanning input files...
!!! Something is wrong... are you sure you chose the right length for motif finding?
!!! i.e. also check your sequence file!!!
	Performing empirical FDR calculation for length 10 (n=5)
		1 of 5
		2 of 5
		3 of 5
		4 of 5
		5 of 5

	-blen automatically set to 2
	Scanning input files...
!!! Something is wrong... are you sure you chose the right length for motif finding?
!!! i.e. also check your sequence file!!!
	Performing empirical FDR calculation for length 12 (n=5)
		1 of 5
		2 of 5
		3 of 5
		4 of 5
		5 of 5
Use of uninitialized value in numeric gt (>) at /home/rstudio/r_lib/homer//bin/compareMotifs.pl line 1394.
	!!! Filtered out all motifs!!!
	Job finished - if results look good, please send beer to ..

	Cleaning up tmp files...

Warning message:
In background != "automatic" && local_background != FALSE :
  'length(x) = 11100 > 1' in coercion to 'logical(1)'
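
On the warning itself: the expression in the message compares `background` with the string "automatic" using `!=`. When `background` is a data frame, as in the call above, that comparison returns one logical value per cell rather than a single `TRUE`/`FALSE`, and `&&` expects a single value. R 4.2.x (the version in the session info below) warns about this; to my knowledge R 4.3.0 and later raise an error instead. The sketch below reproduces the behaviour on a toy data frame and shows one possible scalar-safe check. `is_automatic()` is a hypothetical helper for illustration, not marge's actual code.

```r
# Toy background with rows copied from head(bg); the real table is larger.
bg <- data.frame(
  seqnames = c("1", "1"),
  start = c(640542, 7498331),
  end = c(640966, 7498558)
)

# Comparing a data frame with a string gives a logical matrix, one value
# per cell, so its length is nrow * ncol rather than 1.
cmp <- bg != "automatic"
length(cmp)

# Hypothetical scalar-safe test (an assumption, not the package's code):
# TRUE only when the argument is the single string "automatic".
is_automatic <- function(background) {
  is.character(background) &&
    length(background) == 1 &&
    background == "automatic"
}

is_automatic(bg)
is_automatic("automatic")
```

The reported length of 11100 is consistent with this reading, since it would be the number of rows times the number of columns of the background table passed in.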

SessionInfo

xfun::session_info()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS, RStudio 2022.7.1.554

Locale:
  LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
  LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

Package version:
  AnnotationDbi_1.58.0                       askpass_1.1                                assertthat_0.2.1                          
  base64enc_0.1.3                            bayestestR_0.13.0                          beachmat_2.12.0                           
  BH_1.78.0.0                                Biobase_2.56.0                             BiocFileCache_2.4.0                       
  BiocGenerics_0.42.0                        BiocIO_1.6.0                               BiocParallel_1.30.3                       
  biomaRt_2.52.0                             Biostrings_2.64.1                          bit_4.0.4                                 
  bit64_4.0.5                                bitops_1.0-7                               blob_1.2.3                                
  brew_1.0.7                                 brio_1.1.3                                 BSgenome_1.64.0                           
  bslib_0.4.0                                bsseq_1.32.0                               cachem_1.0.6                              
  callr_3.7.2                                cli_3.3.0                                  clipr_0.8.0                               
  codetools_0.2-18                           colorspace_2.0-3                           commonmark_1.8.0                          
  compiler_4.2.1                             cpp11_0.4.2                                crayon_1.5.1                              
  credentials_1.3.2                          crosstalk_1.2.0                            curl_4.3.2                                
  data.table_1.14.2                          datawizard_0.6.2                           DBI_1.1.3                                 
  dbplyr_2.2.1                               DelayedArray_0.22.0                        DelayedMatrixStats_1.18.1                 
  desc_1.4.1                                 details_0.3.0                              devtools_2.4.4                            
  diffobj_0.3.5                              digest_0.6.29                              downlit_0.4.2                             
  dplyr_1.0.10                               DT_0.25                                    effectsize_0.7.0.5                        
  ellipsis_0.3.2                             evaluate_0.16                              fansi_1.0.3                               
  farver_2.1.1                               fastmap_1.1.0                              filelock_1.0.2                            
  fontawesome_0.3.0                          formatR_1.12                               fs_1.5.2                                  
  futile.logger_1.4.3                        futile.options_1.0.1                       generics_0.1.3                            
  GenomeInfoDb_1.32.4                        GenomeInfoDbData_1.2.8                     GenomicAlignments_1.32.1                  
  GenomicFeatures_1.48.4                     GenomicRanges_1.48.0                       gert_1.8.0                                
  gh_1.3.0                                   gitcreds_0.1.1                             glue_1.6.2                                
  graphics_4.2.1                             grDevices_4.2.1                            grid_4.2.1                                
  gtools_3.9.3                               HDF5Array_1.24.2                           highr_0.9                                 
  hms_1.1.2                                  htmltools_0.5.3                            htmlwidgets_1.5.4                         
  httpuv_1.6.5                               httr_1.4.4                                 ini_0.3.1                                 
  insight_0.18.4                             IRanges_2.30.1                             jquerylib_0.1.4                           
  jsonlite_1.8.0                             KEGGREST_1.36.3                            knitr_1.40                                
  labeling_0.4.2                             lambda.r_1.2.4                             later_1.3.0                               
  lattice_0.20-45                            lazyeval_0.2.2                             lifecycle_1.0.1                           
  limma_3.52.4                               locfit_1.5-9.6                             magrittr_2.0.3                            
  marge_0.0.4.9999                           Matrix_1.5-1                               MatrixGenerics_1.8.1                      
  matrixStats_0.62.0                         memoise_2.0.1                              methods_4.2.1                             
  mime_0.12                                  miniUI_0.1.1.1                             munsell_0.5.0                             
  openssl_2.0.2                              parallel_4.2.1                             parameters_0.18.2                         
  performance_0.10.0                         permute_0.9-7                              pillar_1.8.1                              
  pkgbuild_1.3.1                             pkgconfig_2.0.3                            pkgdown_2.0.6                             
  pkgload_1.3.0                              plogr_0.2.0                                png_0.1-7                                 
  praise_1.0.0                               prettyunits_1.1.1                          processx_3.7.0                            
  profvis_0.3.7                              progress_1.2.2                             promises_1.2.0.1                          
  ps_1.7.1                                   purrr_0.3.4                                R.cache_0.16.0                            
  R.methodsS3_1.8.2                          R.oo_1.25.0                                R.utils_2.12.0                            
  R6_2.5.1                                   ragg_1.2.2                                 rappdirs_0.3.3                            
  rcmdcheck_1.4.0                            RColorBrewer_1.1.3                         Rcpp_1.0.9                                
  RCurl_1.98-1.9                             readr_2.1.2                                rematch2_2.1.2                            
  remotes_2.4.2                              report_0.5.5                               restfulr_0.0.15                           
  rhdf5_2.40.0                               rhdf5filters_1.8.0                         Rhdf5lib_1.18.2                           
  Rhtslib_1.28.0                             rjson_0.2.21                               rlang_1.0.5                               
  rmarkdown_2.16                             roxygen2_7.2.1                             rprojroot_2.0.3                           
  Rsamtools_2.12.0                           RSQLite_2.2.16                             rstudioapi_0.14                           
  rtracklayer_1.56.1                         rversions_2.1.2                            S4Vectors_0.34.0                          
  sass_0.4.2                                 scales_1.2.1                               sessioninfo_1.2.2                         
  shiny_1.7.2                                snow_0.4.4                                 sourcetools_0.1.7                         
  sparseMatrixStats_1.8.0                    stats_4.2.1                                stats4_4.2.1                              
  stringi_1.7.8                              stringr_1.4.1                              styler_1.7.0                              
  SummarizedExperiment_1.26.1                sys_3.4                                    systemfonts_1.0.4                         
  testthat_3.1.4                             textshaping_0.3.6                          tibble_3.1.8                              
  tidyr_1.2.0                                tidyselect_1.1.2                           tinytex_0.41                              
  tools_4.2.1                                TxDb.Athaliana.BioMart.plantsmart51_0.99.0 tzdb_0.3.0                                
  urlchecker_1.0.1                           usethis_2.1.6                              utf8_1.2.2                                
  utils_4.2.1                                vctrs_0.4.1                                viridisLite_0.4.1                         
  vroom_1.5.7                                waldo_0.4.0                                whisker_0.4                               
  withr_2.5.0                                writexl_1.4.0                              xfun_0.32                                 
  XML_3.99-0.11                              xml2_1.3.3                                 xopen_1.0.0                               
  xtable_1.8-4                               XVector_0.36.0                             yaml_2.3.5                                
  zip_2.2.0                                  zlibbioc_1.42.0                           

dktanwar, Nov 10 '22 15:11