PopSV
could not find function "autoGCcounts"
Hi PopSV developers,
I am receiving the error message:
could not find function "autoGCcounts"
although I have already loaded the package with:
library(PopSV)
Any help appreciated,
Waqas.
Hi,
The pre-built pipeline functions are in the automatedPipeline-batchtools.R script. You have to run source("automatedPipeline-batchtools.R") to read this file and load the autoGCcounts and autoNormTest functions. The automatedPipeline-batchtools.R script and the other files for configuring your HPC are in the scripts folder.
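For example, something along these lines should work from an R session started in the folder that contains the script (the RData file names at the end are just examples of the files you will create later):
library(PopSV)
source("automatedPipeline-batchtools.R") ## defines autoGCcounts() and autoNormTest()
## then, e.g.:
## res.GCcounts = autoGCcounts("files.RData", "bins.RData")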
Let me know how it goes,
Jean
Thanks for your response. I am using the automated pipeline right now, but I am still struggling!
source ("run-PopSV-batchjobs-automatedPipeline.R")
Error in cfReadBrewTemplate(template.file) :
could not find function "cfReadBrewTemplate"
Error: package or namespace load failed for ‘BatchJobs’:
.onLoad failed in loadNamespace() for 'BatchJobs', details:
call: sourceConfFile(cf)
error: There was an error in sourcing your configuration file '/home/wuk/software/PopSV/scripts/.BatchJobs.R': Error in cfReadBrewTemplate(template.file) :
could not find function "cfReadBrewTemplate"
When I run:
library(BatchJobs)
I also get the same error.
My .BatchJobs.R file looks like this:
source("~/makeClusterFunctionsAdaptive.R")
cluster.functions <- makeClusterFunctionsAdaptive("~/guillimin.tmpl")
mail.start <- "none"
mail.done <- "none"
mail.error <- "none"
mail.from <- "<[email protected]>"
mail.to <- "<[email protected]>"
location of ~/makeClusterFunctionsAdaptive.R
Am I missing something, some paths, etc.?
Waqas.
I forgot to mention that I am trying to run PopSV on a single server, not on an HPC.
I would recommend that you try using the batchtools version (the configuration is easier). If you want to run this on a single server, you can use the configuration file batchtools.conf.local.R (you can choose the number of cores to use by changing the ncpus= argument). To use it, rename it to batchtools.conf.R and place it in the working directory.
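As a rough sketch, a local batchtools.conf.R can be as simple as one line defining multicore cluster functions (the ncpus value below is only an example, adjust it to your machine):
## batchtools.conf.R: run all jobs locally on 4 cores
cluster.functions = makeClusterFunctionsMulticore(ncpus=4)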
To test if batchtools is configured properly, you can try to run the commands in the test script. If it works, you can move on to running the pipeline.
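If you prefer to check the configuration by hand instead of using the test script, a minimal batchtools run looks something like this (the registry directory name is arbitrary):
library(batchtools)
reg = makeRegistry(file.dir="testReg") ## picks up batchtools.conf.R from the working directory
batchMap(function(x) x^2, x=1:2, reg=reg) ## two toy jobs
submitJobs(reg=reg)
waitForJobs(reg=reg)
reduceResultsList(reg=reg) ## should return 1 and 4
removeRegistry(wait=0, reg=reg) ## clean up the test registry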
BTW, is there a reason why you want to run this on a single server rather than an HPC?
I did as you said and checked with the test-batchtools.R script. It worked fine.
> library (batchtools)
Loading required package: data.table
data.table 1.10.4.3
The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
Release notes, videos and slides: http://r-datatable.com
Breaking change introduced in batchtools v0.9.6: The format of the returned data.table of the functions `reduceResultsDataTable()`, `getJobTable()`, `getJobPars()`, and `getJobResources()` has changed. List columns are not unnested automatically anymore. To manually unnest tables, batchtools provides the helper function `unwrap()` now, e.g. `unwrap(getJobPars())`. The previously introduced helper function `flatten()` will be deprecated due to a name clash with `purrr::flatten()`.
> library(PopSV)
> source ("test-batchtools.R")
Sourcing configuration file '/home/wuk/software/PopSV/scripts/batchtools/batchtools.conf.R' ...
Created registry in '/home/wuk/software/PopSV/scripts/batchtools/test' using cluster functions 'Multicore'
Adding 2 jobs ...
Submitting 2 jobs in 2 chunks using cluster functions 'Multicore' ...
>
With the actual data, it mostly worked but ended with an error:
wuk@wuk-Precision-Tower-7810:~/software/PopSV/scripts/batchtools$ R
R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library (batchtools)
Loading required package: data.table
data.table 1.10.4.3
The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
Release notes, videos and slides: http://r-datatable.com
Breaking change introduced in batchtools v0.9.6: The format of the returned data.table of the functions `reduceResultsDataTable()`, `getJobTable()`, `getJobPars()`, and `getJobResources()` has changed. List columns are not unnested automatically anymore. To manually unnest tables, batchtools provides the helper function `unwrap()` now, e.g. `unwrap(getJobPars())`. The previously introduced helper function `flatten()` will be deprecated due to a name clash with `purrr::flatten()`.
> library(PopSV)
> source("automatedPipeline-batchtools.R")
Functions :
- 'autoGCcounts' to count BC in each sample.
- 'autoNormTest' to normalize and test all the samples.
- 'autoExtra' for some other functions.
> bam.files = read.table("bams.tsv", as.is=TRUE, header=TRUE)
> files.df = init.filenames(bam.files, code="example")
> save(files.df, file="files.RData")
> bin.size = 1e3
> bins.df = fragment.genome.hg19(bin.size)
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which, which.max, which.min
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:data.table’:
first, second
The following object is masked from ‘package:base’:
expand.grid
Attaching package: ‘IRanges’
The following object is masked from ‘package:data.table’:
shift
Attaching package: ‘Biostrings’
The following object is masked from ‘package:base’:
strsplit
> save(bins.df, file="bins.RData")
> res.GCcounts = autoGCcounts("files.RData", "bins.RData")
== 1) Get GC content in each bin.
Sourcing configuration file '/home/wuk/software/PopSV/scripts/batchtools/batchtools.conf.R' ...
Created registry in '/home/wuk/software/PopSV/scripts/batchtools/getGC' using cluster functions 'Multicore'
Adding 1 jobs ...
Submitting 1 jobs in 1 chunks using cluster functions 'Multicore' ...
== 2) Get bin counts in each sample and correct for GC bias.
Sourcing configuration file '/home/wuk/software/PopSV/scripts/batchtools/batchtools.conf.R' ...
Created registry in '/home/wuk/software/PopSV/scripts/batchtools/getBC' using cluster functions 'Multicore'
Adding 3 jobs ...
Submitting 3 jobs in 3 chunks using cluster functions 'Multicore' ...
Waiting (Q:0 R:1 D:0 E:2 ?:0) [=====================-----------] 67% eta: 1h
Waiting (Q:0 R:1 D:0 E:2 ?:0) [=====================-----------] 67% eta: 2h
Status for 3 jobs:
Submitted : 3 (100.0%)
-- Queued : 0 ( 0.0%)
-- Started : 3 (100.0%)
---- Running : 0 ( 0.0%)
---- Done : 0 ( 0.0%)
---- Error : 3 (100.0%)
---- Expired : 0 ( 0.0%)
Mean run time: 1.19 hours.
Error in autoGCcounts("files.RData", "bins.RData") :
Not done yet or failed, see for yourself
> cnvs.df = autoNormTest("files.RData", "bins.RData")
== 1) Sample QC and reference definition.
Sourcing configuration file '/home/wuk/software/PopSV/scripts/batchtools/batchtools.conf.R' ...
Created registry in '/home/wuk/software/PopSV/scripts/batchtools/sampQC' using cluster functions 'Multicore'
Adding 1 jobs ...
Submitting 1 jobs in 1 chunks using cluster functions 'Multicore' ...
Status for 1 jobs:
Submitted : 1 (100.0%)
-- Queued : 0 ( 0.0%)
-- Started : 1 (100.0%)
---- Running : 0 ( 0.0%)
---- Done : 0 ( 0.0%)
---- Error : 1 (100.0%)
---- Expired : 0 ( 0.0%)
Mean run time: 0.0011 hours.
Error in autoNormTest("files.RData", "bins.RData") :
Not done yet or failed, see for yourself
This requires your attention again.
I also want to mention that mine is a reference/samples study. Right now, I have only three samples. With three samples, how many reference samples do I need?
Thanks in advance,
Waqas.
In case you missed the thread, any help is appreciated!
Waqas.
Thanks for your patience. There seem to be errors in the second step of the autoGCcounts function. I updated the pipeline functions to show a log of the errors when using the argument status=TRUE. Can you rerun the following to get more information about the errors:
## Download the new version of automatedPipeline-batchtools.R
source("automatedPipeline-batchtools.R")
res.GCcounts = autoGCcounts("files.RData", "bins.RData", status=TRUE)
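If the status log is still not informative enough, you could also open the failed registry directly with batchtools, something like this (here for the getBC registry created in your working directory):
library(batchtools)
reg = loadRegistry("getBC", writeable=FALSE) ## registry of the bin-count step
getStatus(reg=reg)
getErrorMessages(findErrors(reg=reg), reg=reg) ## error message of each failed job
getLog(findErrors(reg=reg)[1], reg=reg) ## full log of the first failed job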
In terms of references, we recommend having at least 20 reference samples (40-50 would be better, though). PopSV is not suited to analyzing only 3 samples. Do you have controls that were sequenced similarly that you could use as references?
Thanks, the previous error was resolved; I am just facing an error on the last command:
> cnvs.df = autoNormTest("files.RData", "bins.RData")
== 1) Sample QC and reference definition.
Sourcing configuration file '/home/wuk/software/PopSV/scripts/batchtools/batchtools.conf.R' ...
Created registry in '/home/wuk/software/PopSV/scripts/batchtools/sampQC' using cluster functions 'Multicore'
Adding 1 jobs ...
Submitting 1 jobs in 1 chunks using cluster functions 'Multicore' ...
Status for 1 jobs:
Submitted : 1 (100.0%)
-- Queued : 0 ( 0.0%)
-- Started : 1 (100.0%)
---- Running : 0 ( 0.0%)
---- Done : 0 ( 0.0%)
---- Error : 1 (100.0%)
---- Expired : 0 ( 0.0%)
Mean run time: 0.000882 hours.
Error in autoNormTest("files.RData", "bins.RData") :
Not done yet or failed, see for yourself
Thanks for the support so far,
Waqas.