phables
No case3 is resolved
Hi, I'm new to this field. I recently read the article "Phables: from fragmented assemblies to high-quality bacteriophage genomes" and wanted to reproduce some of its results, but I ran into a problem. I simulated reads from the following four phages at the read coverages listed and combined them into a simulated phage dataset:
- Enterobacteria phage P22 (AB426868) - 100X
- Enterobacteria phage T7 (NC_001604) - 150X
- Staphylococcus phage SAP13 TA-2022 (ON911718) - 200X
- Staphylococcus phage SAP2 TA-2022 (ON911715) - 400X

Then I used metaSPAdes to obtain the assembly graph, and I'm sure the assembly graph is correct. Phables ran successfully, but no Case 3 component was resolved.
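For anyone comparing runs, here is a minimal sketch of how the per-case summary at the end of the Phables log (lines of the form `Case 3 (resolved/found): 0/1`) could be pulled out and checked. The function names `case_summary` and `unresolved_cases` are my own illustrative names, not part of Phables, and the regular expression assumes the summary lines keep exactly the wording shown in the log below.

```python
import re

# Matches summary lines such as "Case 3 (resolved/found): 0/1".
# Assumption: the wording is exactly as printed in the log of this run.
CASE_RE = re.compile(r"Case (\d+) \(resolved/found\): (\d+)/(\d+)")


def case_summary(log_text):
    """Return {case_number: (resolved, found)} for every case line in the log text."""
    return {
        int(case): (int(resolved), int(found))
        for case, resolved, found in CASE_RE.findall(log_text)
    }


def unresolved_cases(summary):
    """List the case numbers where fewer components were resolved than found."""
    return [
        case
        for case, (resolved, found) in sorted(summary.items())
        if resolved < found
    ]


# The three summary lines copied from the run reported in this issue.
example_log = """
2024-04-12 16:38:27,334 - INFO - Case 1 (resolved/found): 2/2
2024-04-12 16:38:27,334 - INFO - Case 2 (resolved/found): 0/0
2024-04-12 16:38:27,334 - INFO - Case 3 (resolved/found): 0/1
"""

summary = case_summary(example_log)
print(summary)
print(unresolved_cases(summary))
```

In practice the text would be read from the `phables_output.log` file named in the log rather than from an inline string.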
`(phables) [wangbo@manager ~]$ phables run --output /data/home/wangbo/XTC/result/test --threads 8 --input /data/home/wangbo/XTC/fga/test/assembly_graph_with_scaffolds.gfa --reads /data/home/wangbo/XTC/data/test
[2024:04:12 16:33:29] Copying system default config to /data/home/wangbo/XTC/result/test/config.yaml
[2024:04:12 16:33:29] Updating config file with new values
[2024:04:12 16:33:29] Writing config file to /data/home/wangbo/XTC/result/test/config.yaml
[2024:04:12 16:33:29] ------------------
[2024:04:12 16:33:29] | Runtime config |
[2024:04:12 16:33:29] ------------------
alpha: 1.2
compcount: 200
conda_prefix: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda
configfile: /data/home/wangbo/XTC/result/test/config.yaml
covtol: 100
databases: null
evalue: 1.0e-10
input: /data/home/wangbo/XTC/fga/test/assembly_graph_with_scaffolds.gfa
log: /data/home/wangbo/XTC/result/test/phables.log
longreads: false
maxpaths: 10
mgfrac: 0.2
mincov: 10
minlength: 2000
output: /data/home/wangbo/XTC/result/test
prefix: null
profile: null
reads: /data/home/wangbo/XTC/data/test
resources:
  jobCPU: 8
  jobMem: 16000
seqidentity: 0.3
snake_args: []
snake_default:
- --rerun-incomplete
- --printshellcmds
- --nolock
- --show-failed-logs
system_config: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/config/config.yaml
threads: 8
use_conda: true
[2024:04:12 16:33:29] ---------------------
[2024:04:12 16:33:29] | Snakemake command |
[2024:04:12 16:33:29] ---------------------
snakemake -s /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/phables.smk --configfile /data/home/wangbo/XTC/result/test/config.yaml --cores 8 --use-conda --conda-prefix /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs
Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../config/databases.yaml is extended by additional config specified via the command line.
Output files will be saved to directory, /data/home/wangbo/XTC/result/test
Assuming unrestricted shared filesystem usage.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Job stats:
job                                     count
all                                         1
combine_genomes_and_unresolved_edges        1
koverage                                    1
koverage_genomes                            1
koverage_postprocess                        1
koverage_tsv                                1
run_combine_cov                             1
run_gfa2fasta                               1
run_phables                                 1
scan_phrogs                                 1
scan_smg                                    1
total                                      11
Select jobs to execute... Execute 2 jobs...
[Fri Apr 12 16:34:01 2024] localrule koverage_tsv: output: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv jobid: 3 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv resources: tmpdir=/tmp
[Fri Apr 12 16:34:01 2024] localrule run_gfa2fasta: input: /data/home/wangbo/XTC/fga/test/assembly_graph_with_scaffolds.gfa output: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta log: /data/home/wangbo/XTC/result/test/logs/gfa2fasta.log jobid: 1 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta resources: tmpdir=/tmp
Output files will be saved to directory, /data/home/wangbo/XTC/result/test
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_ [Fri Apr 12 16:34:03 2024] Finished job 3. 1 of 11 steps (9%) done Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_ 2024-04-12 16:34:09,382 - INFO - Obtaining edge sequences 2024-04-12 16:34:09,407 - INFO - Writing edge sequences to FASTA file 2024-04-12 16:34:09,640 - INFO - The FASTA file with unitig sequences can be found at /data/home/wangbo/XTC/result/test/preprocess/edges.fasta 2024-04-12 16:34:09,641 - INFO - Thank you for using gfa2fasta! [Fri Apr 12 16:34:09 2024] Finished job 1. 2 of 11 steps (18%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:34:09 2024] localrule koverage: input: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta output: /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai, /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv jobid: 2 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv threads: 8 resources: tmpdir=/tmp, mem_mb=15259, mem_mib=15259, mem=16000MB
koverage run coverm --reads /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv --ref /data/home/wangbo/XTC/result/test/preprocess/edges.fasta --threads 8 --output /data/home/wangbo/XTC/result/test/preprocess
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_
[2024:04:12 16:34:12] Copying system default config to /data/home/wangbo/XTC/result/test/preprocess/koverage.config.yaml [2024:04:12 16:34:12] Updating config file with new values [2024:04:12 16:34:12] Writing config file to /data/home/wangbo/XTC/result/test/preprocess/koverage.config.yaml [2024:04:12 16:34:12] ------------------ [2024:04:12 16:34:12] | Runtime config | [2024:04:12 16:34:12] ------------------
koverage: args: bin_width: 100 conda_prefix: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda configfile: /data/home/wangbo/XTC/result/test/preprocess/koverage.config.yaml kmer_max: 5000 kmer_min: 50 kmer_sample: 100 kmer_size: 25 log: /data/home/wangbo/XTC/result/test/preprocess/koverage.log minimap: sr output: /data/home/wangbo/XTC/result/test/preprocess pafs: false reads: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv ref: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta report: true report_max_ctg: 1000 snake_args: - coverm system_config: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/config/config.yaml system_workflow_profile: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/config/profile/config.yaml threads: 8 use_conda: true workflow_profile: /data/home/wangbo/XTC/result/test/preprocess/koverage.profile params: coverm: -m count -m rpkm -m tpm -m mean -m covered_fraction -m variance jellyfish: -C -s 1G -c 2 --out-counter-len=2 -L 2 resources: med: cpu: 4 mem: 16000 time: 02:00:00 ram: cpu: 2 mem: 8000 time: 04:00:00
[2024:04:12 16:34:12] Copying system default config to /data/home/wangbo/XTC/result/test/preprocess/koverage.profile/config.yaml [2024:04:12 16:34:12] --------------------- [2024:04:12 16:34:12] | Snakemake command | [2024:04:12 16:34:12] ---------------------
snakemake -s /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/Snakefile --configfile /data/home/wangbo/XTC/result/test/preprocess/koverage.config.yaml --cores 8 --use-conda --conda-prefix /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda coverm --workflow-profile /data/home/wangbo/XTC/result/test/preprocess/koverage.profile Using profile /data/home/wangbo/XTC/result/test/preprocess/koverage.profile and workflow specific profile /data/home/wangbo/XTC/result/test/preprocess/koverage.profile for setting default command line arguments. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/config.yaml is extended by additional config specified via the command line. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/system_config.yaml is extended by additional config specified via the command line. Building DAG of jobs... Using shell: /usr/bin/bash Provided cores: 8 Rules claiming more threads will be scaled down. Job stats: job count
coverm 1 coverm_bam2counts 1 coverm_combine 1 coverm_map_pe 1 sample_tsv 1 total 5
Select jobs to execute...
[Fri Apr 12 16:34:32 2024] rule coverm_map_pe: input: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/data/test/test_1.fq output: /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai log: /data/home/wangbo/XTC/result/test/preprocess/logs/coverm_map_pe.test.err jobid: 3 benchmark: /data/home/wangbo/XTC/result/test/preprocess/benchmarks/coverm_map_pe.test.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam wildcards: sample=test threads: 4 resources: tmpdir=/tmp, mem_mb=16000, mem_mib=15259, mem=16000MB, time=02:00:00
{ minimap2 -t 4 -ax sr --secondary=no /data/home/wangbo/XTC/result/test/preprocess/edges.fasta /data/home/wangbo/XTC/data/test/test_1.fq /data/home/wangbo/XTC/data/test/test_2.fq | samtools sort -T test -@ 4 - > /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam; samtools index /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam; } 2> /data/home/wangbo/XTC/result/test/preprocess/logs/coverm_map_pe.test.err
[Fri Apr 12 16:34:32 2024] rule sample_tsv: output: /data/home/wangbo/XTC/result/test/preprocess/koverage.samples.tsv jobid: 4 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/koverage.samples.tsv resources: tmpdir=/tmp
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda/fce0923fcb6faed0d66c7fb8e7a9e927_ Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/config.yaml is extended by additional config specified via the command line. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/system_config.yaml is extended by additional config specified via the command line. Building DAG of jobs... Using shell: /usr/bin/bash Provided cores: 8 Rules claiming more threads will be scaled down. Select jobs to execute... [Fri Apr 12 16:34:45 2024] Finished job 3. 1 of 5 steps (20%) done Select jobs to execute...
[Fri Apr 12 16:34:45 2024] rule coverm_bam2counts: input: /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam output: /data/home/wangbo/XTC/result/test/preprocess/temp/test.cov log: /data/home/wangbo/XTC/result/test/preprocess/logs/coverm_bam2counts.test.err jobid: 2 benchmark: /data/home/wangbo/XTC/result/test/preprocess/benchmarks/coverm_bam2counts.test.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/temp/test.cov; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam wildcards: sample=test resources: tmpdir=/tmp
coverm contig -b /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam -m count -m rpkm -m tpm -m mean -m covered_fraction -m variance > /data/home/wangbo/XTC/result/test/preprocess/temp/test.cov 2> /data/home/wangbo/XTC/result/test/preprocess/logs/coverm_bam2counts.test.err Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda/2b6eada0b2671d4e5851244fb2f8d89b_ [Fri Apr 12 16:34:46 2024] Finished job 4. 2 of 5 steps (40%) done [Fri Apr 12 16:34:46 2024] Finished job 2. 3 of 5 steps (60%) done Select jobs to execute...
[Fri Apr 12 16:34:46 2024] rule coverm_combine: input: /data/home/wangbo/XTC/result/test/preprocess/temp/test.cov output: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv jobid: 1 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/temp/test.cov resources: tmpdir=/tmp
Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/config.yaml is extended by additional config specified via the command line. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/system_config.yaml is extended by additional config specified via the command line. Building DAG of jobs... Using shell: /usr/bin/bash Provided cores: 8 Rules claiming more threads will be scaled down. Select jobs to execute... [Fri Apr 12 16:34:48 2024] Finished job 1. 4 of 5 steps (80%) done Select jobs to execute...
[Fri Apr 12 16:34:48 2024] localrule coverm: input: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/koverage.samples.tsv jobid: 0 reason: Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/koverage.samples.tsv resources: tmpdir=/tmp
[Fri Apr 12 16:34:48 2024] Finished job 0. 5 of 5 steps (100%) done Complete log: .snakemake/log/2024-04-12T163416.591559.snakemake.log cat .snakemake/log/2024-04-12T163416.591559.snakemake.log >> /data/home/wangbo/XTC/result/test/preprocess/koverage.log [2024:04:12 16:34:49] Snakemake finished successfully [Fri Apr 12 16:34:49 2024] Finished job 2. 3 of 11 steps (27%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:34:49 2024] localrule scan_phrogs: input: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/phrogs_mmseqs_db/phrogs_profile_db output: /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv log: /data/home/wangbo/XTC/result/test/logs/phrogs_scan.log jobid: 6 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta threads: 8 resources: tmpdir=/tmp, mem_mb=15259, mem_mib=15259, mem=16000MB
mkdir -p /data/home/wangbo/XTC/result/test/preprocess/phrogs
mmseqs createdb /data/home/wangbo/XTC/result/test/preprocess/edges.fasta /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/phrogs_mmseqs_db/phrogs_profile_db /data/home/wangbo/XTC/result/test/preprocess/phrogs/target_seq > /data/home/wangbo/XTC/result/test/logs/phrogs_scan.log
mmseqs search /data/home/wangbo/XTC/result/test/preprocess/phrogs/target_seq /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/phrogs_mmseqs_db/phrogs_profile_db /data/home/wangbo/XTC/result/test/preprocess/phrogs/results_mmseqs /data/home/wangbo/XTC/result/test/preprocess/phrogs/tmp --threads 8 -s 7 > /data/home/wangbo/XTC/result/test/logs/phrogs_scan.log
mmseqs createtsv /data/home/wangbo/XTC/result/test/preprocess/phrogs/target_seq /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/phrogs_mmseqs_db/phrogs_profile_db /data/home/wangbo/XTC/result/test/preprocess/phrogs/results_mmseqs /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv --threads 8 --full-header > /data/home/wangbo/XTC/result/test/logs/phrogs_scan.log
rm -rf /data/home/wangbo/XTC/result/test/preprocess/phrogs
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/d873faecbf16136ae736110f2de1b018_ [Fri Apr 12 16:38:07 2024] Finished job 6. 4 of 11 steps (36%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:07 2024] localrule scan_smg: input: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/marker.hmm output: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout log: /data/home/wangbo/XTC/result/test/logs/smg_scan_frag_out.log, /data/home/wangbo/XTC/result/test/logs/smg_scan_frag_err.log, /data/home/wangbo/XTC/result/test/logs/smg_scan_hmm_out.log, /data/home/wangbo/XTC/result/test/logs/smg_scan_hmm_err.log jobid: 5 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta threads: 8 resources: tmpdir=/tmp, mem_mb=15259, mem_mib=15259, mem=16000MB
run_FragGeneScan.pl -genome=/data/home/wangbo/XTC/result/test/preprocess/edges.fasta -out=/data/home/wangbo/XTC/result/test/preprocess/edges.fasta.frag -complete=0 -train=complete -thread=8 1>/data/home/wangbo/XTC/result/test/logs/smg_scan_frag_out.log 2>/data/home/wangbo/XTC/result/test/logs/smg_scan_frag_err.log
hmmsearch --domtblout /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout --cut_tc --cpu 8 /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/../../databases/marker.hmm /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.frag.faa 1>/data/home/wangbo/XTC/result/test/logs/smg_scan_hmm_out.log 2> /data/home/wangbo/XTC/result/test/logs/smg_scan_hmm_err.log
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/7052f94b534705e142372ab214aafe67_ [Fri Apr 12 16:38:10 2024] Finished job 5. 5 of 11 steps (45%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:10 2024] localrule run_combine_cov: input: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv output: /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv jobid: 4 reason: Missing output files: /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv resources: tmpdir=/tmp
sed -i '1d' /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv
awk -F ' ' '{ sum[$2] += $6 } END { for (key in sum) print key, sum[key] }' /data/home/wangbo/XTC/result/test/preprocess/results/sample_coverm_coverage.tsv > /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv
[Fri Apr 12 16:38:10 2024] Finished job 4. 6 of 11 steps (55%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:10 2024]
localrule run_phables:
    input: /data/home/wangbo/XTC/fga/test/assembly_graph_with_scaffolds.gfa, /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai, /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout, /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv
    output: /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta, /data/home/wangbo/XTC/result/test/phables/resolved_phages, /data/home/wangbo/XTC/result/test/phables/resolved_genome_info.txt, /data/home/wangbo/XTC/result/test/phables/resolved_edges.fasta, /data/home/wangbo/XTC/result/test/phables/resolved_component_info.txt, /data/home/wangbo/XTC/result/test/phables/component_phrogs.txt, /data/home/wangbo/XTC/result/test/phables/unresolved_phage_like_edges.fasta
    log: /data/home/wangbo/XTC/result/test/logs/phables_output.log
    jobid: 7
    reason: Missing output files: /data/home/wangbo/XTC/result/test/phables/component_phrogs.txt, /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta, /data/home/wangbo/XTC/result/test/phables/unresolved_phage_like_edges.fasta, /data/home/wangbo/XTC/result/test/phables/resolved_genome_info.txt, /data/home/wangbo/XTC/result/test/phables/resolved_component_info.txt; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout, /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai
    threads: 8
    resources: tmpdir=/tmp
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_
2024-04-12 16:38:25,399 - INFO - Welcome to Phables: from fragmented assemblies to high-quality bacteriophage genomes.
2024-04-12 16:38:25,401 - INFO - Input arguments:
2024-04-12 16:38:25,401 - INFO - Assembly graph file: /data/home/wangbo/XTC/fga/test/assembly_graph_with_scaffolds.gfa
2024-04-12 16:38:25,401 - INFO - Unitig coverage file: /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv
2024-04-12 16:38:25,401 - INFO - BAM files path: /data/home/wangbo/XTC/result/test/preprocess/temp
2024-04-12 16:38:25,401 - INFO - Unitig .hmmout file: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout
2024-04-12 16:38:25,401 - INFO - Unitig phrog annotations file: /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv
2024-04-12 16:38:25,402 - INFO - Minimum length of unitigs to consider: 2000
2024-04-12 16:38:25,402 - INFO - Minimum coverage of paths to output: 10
2024-04-12 16:38:25,402 - INFO - Minimum unitig count to consider a component: 200
2024-04-12 16:38:25,402 - INFO - Maximum number of paths to resolve for a component: 10
2024-04-12 16:38:25,402 - INFO - Length threshold to consider single copy marker genes: 0.2
2024-04-12 16:38:25,402 - INFO - Maximum e-value for phrog annotations: 1e-10
2024-04-12 16:38:25,403 - INFO - Minimum sequence identity for phrog annotations: 0.3
2024-04-12 16:38:25,403 - INFO - Coverage tolerance for extending subpaths: 100.0
2024-04-12 16:38:25,403 - INFO - Coverage multipler for flow interval modelling: 1.2
2024-04-12 16:38:25,403 - INFO - Input long reads: False
2024-04-12 16:38:25,403 - INFO - Prefix for genome identifiers: None
2024-04-12 16:38:25,403 - INFO - Number of threads to use: 8
2024-04-12 16:38:25,403 - INFO - Output folder: /data/home/wangbo/XTC/result/test/phables
2024-04-12 16:38:25,423 - INFO - Total number of vertices in the assembly graph: 46
2024-04-12 16:38:25,424 - INFO - Total number of links in the assembly graph: 57
2024-04-12 16:38:25,659 - INFO - Total number of components found: 3
2024-04-12 16:38:25,659 - INFO - Short reads provided
Resolving components: 100%|██████████| 3/3 [00:00<00:00, 3.10it/s]
2024-04-12 16:38:27,334 - INFO - Total number of cyclic components found: 1
2024-04-12 16:38:27,334 - INFO - Total number of cyclic components resolved: 0
2024-04-12 16:38:27,334 - INFO - Single unitigs identified: 2
2024-04-12 16:38:27,334 - INFO - Total number of linear components found: 0
2024-04-12 16:38:27,334 - INFO - Total number of linear components resolved: 0
2024-04-12 16:38:27,334 - INFO - Total number of cyclic components found including single unitigs: 3
2024-04-12 16:38:27,334 - INFO - Total number of components resolved: 2
2024-04-12 16:38:27,334 - INFO - Case 1 (resolved/found): 2/2
2024-04-12 16:38:27,334 - INFO - Case 2 (resolved/found): 0/0
2024-04-12 16:38:27,334 - INFO - Case 3 (resolved/found): 0/1
2024-04-12 16:38:27,334 - INFO - Total number of genomes resolved: 2
2024-04-12 16:38:27,335 - INFO - Resolved genomes can be found in /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta
2024-04-12 16:38:27,346 - INFO - Resolved genome information can be found in /data/home/wangbo/XTC/result/test/phables/resolved_genome_info.txt
2024-04-12 16:38:27,349 - INFO - PHROGs found in resolved components can be found in /data/home/wangbo/XTC/result/test/phables/component_phrogs.txt
2024-04-12 16:38:27,349 - INFO - Elapsed time: 1.9460010528564453 seconds
2024-04-12 16:38:27,350 - INFO - Thank you for using Phables!
[Fri Apr 12 16:38:27 2024] Finished job 7.
7 of 11 steps (64%) done
Select jobs to execute...
Execute 1 jobs...
[Fri Apr 12 16:38:27 2024] localrule combine_genomes_and_unresolved_edges: input: /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta, /data/home/wangbo/XTC/result/test/phables/unresolved_phage_like_edges.fasta output: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta jobid: 10 reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta; Input files updated by another job: /data/home/wangbo/XTC/result/test/phables/unresolved_phage_like_edges.fasta, /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta resources: tmpdir=/tmp
cat /data/home/wangbo/XTC/result/test/phables/resolved_paths.fasta /data/home/wangbo/XTC/result/test/phables/unresolved_phage_like_edges.fasta > /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta
[Fri Apr 12 16:38:27 2024] Finished job 10. 8 of 11 steps (73%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:27 2024] localrule koverage_genomes: input: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta output: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv jobid: 9 reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta threads: 8 resources: tmpdir=/tmp, mem_mb=15259, mem_mib=15259, mem=16000MB
koverage run --no-report --reads /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv --ref /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta --threads 8 --output /data/home/wangbo/XTC/result/test/postprocess
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_
[2024:04:12 16:38:28] Copying system default config to /data/home/wangbo/XTC/result/test/postprocess/koverage.config.yaml [2024:04:12 16:38:28] Updating config file with new values [2024:04:12 16:38:28] Writing config file to /data/home/wangbo/XTC/result/test/postprocess/koverage.config.yaml [2024:04:12 16:38:28] ------------------ [2024:04:12 16:38:28] | Runtime config | [2024:04:12 16:38:28] ------------------
koverage: args: bin_width: 100 conda_prefix: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda configfile: /data/home/wangbo/XTC/result/test/postprocess/koverage.config.yaml kmer_max: 5000 kmer_min: 50 kmer_sample: 100 kmer_size: 25 log: /data/home/wangbo/XTC/result/test/postprocess/koverage.log minimap: sr output: /data/home/wangbo/XTC/result/test/postprocess pafs: false reads: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv ref: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta report: false report_max_ctg: 1000 snake_args: [] system_config: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/config/config.yaml system_workflow_profile: /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/config/profile/config.yaml threads: 8 use_conda: true workflow_profile: /data/home/wangbo/XTC/result/test/postprocess/koverage.profile params: coverm: -m count -m rpkm -m tpm -m mean -m covered_fraction -m variance jellyfish: -C -s 1G -c 2 --out-counter-len=2 -L 2 resources: med: cpu: 4 mem: 16000 time: 02:00:00 ram: cpu: 2 mem: 8000 time: 04:00:00
[2024:04:12 16:38:28] Copying system default config to /data/home/wangbo/XTC/result/test/postprocess/koverage.profile/config.yaml [2024:04:12 16:38:28] --------------------- [2024:04:12 16:38:28] | Snakemake command | [2024:04:12 16:38:28] ---------------------
snakemake -s /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/Snakefile --configfile /data/home/wangbo/XTC/result/test/postprocess/koverage.config.yaml --cores 8 --use-conda --conda-prefix /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda --workflow-profile /data/home/wangbo/XTC/result/test/postprocess/koverage.profile Using profile /data/home/wangbo/XTC/result/test/postprocess/koverage.profile and workflow specific profile /data/home/wangbo/XTC/result/test/postprocess/koverage.profile for setting default command line arguments. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/config.yaml is extended by additional config specified via the command line. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/system_config.yaml is extended by additional config specified via the command line. Building DAG of jobs... Using shell: /usr/bin/bash Provided cores: 8 Rules claiming more threads will be scaled down. Job stats: job count
all_sample_coverage 1 combine_coverage 1 faidx_ref 1 idx_ref 1 map 1 raw_coverage 1 sample_coverage 1 sample_tsv 1 total 8
Select jobs to execute...
[Fri Apr 12 16:38:32 2024] rule idx_ref: input: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta output: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.idx log: /data/home/wangbo/XTC/result/test/postprocess/logs/idx_ref.err jobid: 4 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/idx_ref.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.idx threads: 4 resources: tmpdir=/tmp, mem_mb=16000, mem_mib=15259, mem=16000MB, time=02:00:00
awk 'BEGIN {count=-1} /^>/ { $0 = ">" ++count } 1' /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta | minimap2 -t 4 -d /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.idx - 2> /data/home/wangbo/XTC/result/test/postprocess/logs/idx_ref.err
[Fri Apr 12 16:38:32 2024] rule faidx_ref: input: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta output: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai log: /data/home/wangbo/XTC/result/test/postprocess/logs/faidx_ref.err jobid: 5 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/faidx_ref.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai resources: tmpdir=/tmp
samtools faidx /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta 2> /data/home/wangbo/XTC/result/test/postprocess/logs/faidx_ref.err Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda/fce0923fcb6faed0d66c7fb8e7a9e927_ Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda/fce0923fcb6faed0d66c7fb8e7a9e927_
[Fri Apr 12 16:38:32 2024] rule sample_tsv: output: /data/home/wangbo/XTC/result/test/postprocess/koverage.samples.tsv jobid: 7 reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/koverage.samples.tsv resources: tmpdir=/tmp
[Fri Apr 12 16:38:33 2024] Finished job 4. 1 of 8 steps (12%) done [Fri Apr 12 16:38:33 2024] Finished job 5. 2 of 8 steps (25%) done Select jobs to execute...
[Fri Apr 12 16:38:33 2024] rule raw_coverage: input: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.idx, /data/home/wangbo/XTC/data/test/test_1.fq, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai output: /data/home/wangbo/XTC/result/test/postprocess/temp/test.counts.pkl log: /data/home/wangbo/XTC/result/test/postprocess/logs/raw_coverage.test.err jobid: 3 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/raw_coverage.test.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/temp/test.counts.pkl; Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.idx, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai wildcards: sample=test threads: 4 resources: tmpdir=/tmp, mem_mb=16000, mem_mib=15259, mem=16000MB, time=02:00:00
Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/config.yaml is extended by additional config specified via the command line. Config file /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/../config/system_config.yaml is extended by additional config specified via the command line. Building DAG of jobs... Using shell: /usr/bin/bash Provided cores: 8 Rules claiming more threads will be scaled down. Select jobs to execute... /data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/bin/python3.11 /data/home/wangbo/.snakemake/scripts/tmpugw_b71y.minimapWrapper.py Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/lib/python3.11/site-packages/koverage/workflow/conda/fce0923fcb6faed0d66c7fb8e7a9e927_ [Fri Apr 12 16:38:35 2024] Finished job 7. 3 of 8 steps (38%) done [Fri Apr 12 16:38:39 2024] Finished job 3. 4 of 8 steps (50%) done Select jobs to execute...
[Fri Apr 12 16:38:39 2024] rule sample_coverage: input: /data/home/wangbo/XTC/result/test/postprocess/temp/test.counts.pkl output: /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv log: /data/home/wangbo/XTC/result/test/postprocess/logs/sample_coverage.test.err jobid: 2 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/sample_coverage.test.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/temp/test.counts.pkl wildcards: sample=test resources: tmpdir=/tmp
/data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/bin/python3.11 /data/home/wangbo/.snakemake/scripts/tmp1gwbri6f.sampleCoverage.py [Fri Apr 12 16:38:39 2024] Finished job 2. 5 of 8 steps (62%) done Removing temporary output /data/home/wangbo/XTC/result/test/postprocess/temp/test.counts.pkl. Select jobs to execute...
[Fri Apr 12 16:38:39 2024] rule all_sample_coverage: input: /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv output: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv log: /data/home/wangbo/XTC/result/test/postprocess/logs/all_sample_coverage.err jobid: 1 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/all_sample_coverage.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv resources: tmpdir=/tmp
printf 'Sample Contig Count RPM RPKM RPK TPM Mean Median Hitrate Variance ' > /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv 2> /data/home/wangbo/XTC/result/test/postprocess/logs/all_sample_coverage.err; cat /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv >> /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv 2> /data/home/wangbo/XTC/result/test/postprocess/logs/all_sample_coverage.err Benchmark: unable to collect cpu and memory benchmark statistics [Fri Apr 12 16:38:39 2024] Finished job 1. 6 of 8 steps (75%) done Removing temporary output /data/home/wangbo/XTC/result/test/postprocess/temp/test.cov.tsv. Select jobs to execute...
[Fri Apr 12 16:38:39 2024] rule combine_coverage: input: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai output: /data/home/wangbo/XTC/result/test/postprocess/results/all_coverage.tsv log: /data/home/wangbo/XTC/result/test/postprocess/logs/combine_coverage.err jobid: 6 benchmark: /data/home/wangbo/XTC/result/test/postprocess/benchmarks/combine_coverage.txt reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/results/all_coverage.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta.fai resources: tmpdir=/tmp
/data/home/wangbo/miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/ca3b7425db9120b12d2fee9bdfbd3ddc_/bin/python3.11 /data/home/wangbo/.snakemake/scripts/tmpns1unuw_.combineCoverage.py [Fri Apr 12 16:38:40 2024] Finished job 6. 7 of 8 steps (88%) done Select jobs to execute...
[Fri Apr 12 16:38:40 2024] localrule map: input: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/results/all_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/koverage.samples.tsv jobid: 0 reason: Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/results/all_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/postprocess/koverage.samples.tsv resources: tmpdir=/tmp
[Fri Apr 12 16:38:40 2024] Finished job 0. 8 of 8 steps (100%) done Complete log: .snakemake/log/2024-04-12T163828.648781.snakemake.log cat .snakemake/log/2024-04-12T163828.648781.snakemake.log >> /data/home/wangbo/XTC/result/test/postprocess/koverage.log [2024:04:12 16:38:40] Snakemake finished successfully [Fri Apr 12 16:38:40 2024] Finished job 9. 9 of 11 steps (82%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:40 2024] localrule koverage_postprocess: input: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta output: /data/home/wangbo/XTC/result/test/postprocess/sample_genome_read_counts.tsv log: /data/home/wangbo/XTC/result/test/logs/format_koverage_results_output.log jobid: 8 reason: Missing output files: /data/home/wangbo/XTC/result/test/postprocess/sample_genome_read_counts.tsv; Input files updated by another job: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv, /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges.fasta resources: tmpdir=/tmp
Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_ Activating conda environment: miniconda3/envs/phables/lib/python3.11/site-packages/phables/workflow/conda/873ad7e71dfcd9bb5adf1eb8f8dc2377_ 2024-04-12 16:38:47,662 - INFO - Samples file: /data/home/wangbo/XTC/result/test/preprocess/phables.samples.tsv 2024-04-12 16:38:47,663 - INFO - Koverage results: /data/home/wangbo/XTC/result/test/postprocess/results/sample_coverage.tsv 2024-04-12 16:38:47,663 - INFO - Output path: /data/home/wangbo/XTC/result/test/postprocess/ 2024-04-12 16:38:47,707 - INFO - Raw read counts mapped to resolved genomes can be found in /data/home/wangbo/XTC/result/test/postprocess/sample_genome_read_counts.tsv 2024-04-12 16:38:47,707 - INFO - RPKM values of resolved genomes can be found in /data/home/wangbo/XTC/result/test/postprocess/sample_genome_rpkm.tsv 2024-04-12 16:38:47,707 - INFO - Estimated mean read depth of resolved genomes can be found in /data/home/wangbo/XTC/result/test/postprocess/sample_genome_mean_coverage.tsv 2024-04-12 16:38:47,732 - INFO - Sequence information file can be found in /data/home/wangbo/XTC/result/test/postprocess/genomes_and_unresolved_edges_info.tsv 2024-04-12 16:38:47,732 - INFO - Thank you for using format_koverage_results! [Fri Apr 12 16:38:47 2024] Finished job 8. 10 of 11 steps (91%) done Select jobs to execute... Execute 1 jobs...
[Fri Apr 12 16:38:47 2024] localrule all: input: /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai, /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout, /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv, /data/home/wangbo/XTC/result/test/phables/resolved_genome_info.txt, /data/home/wangbo/XTC/result/test/phables/resolved_component_info.txt, /data/home/wangbo/XTC/result/test/phables/component_phrogs.txt, /data/home/wangbo/XTC/result/test/postprocess/sample_genome_read_counts.tsv jobid: 0 reason: Input files updated by another job: /data/home/wangbo/XTC/result/test/phables/component_phrogs.txt, /data/home/wangbo/XTC/result/test/preprocess/phrogs_annotations.tsv, /data/home/wangbo/XTC/result/test/preprocess/coverage.tsv, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta.hmmout, /data/home/wangbo/XTC/result/test/phables/resolved_genome_info.txt, /data/home/wangbo/XTC/result/test/preprocess/edges.fasta, /data/home/wangbo/XTC/result/test/phables/resolved_component_info.txt, /data/home/wangbo/XTC/result/test/postprocess/sample_genome_read_counts.tsv, /data/home/wangbo/XTC/result/test/preprocess/temp/test.bam.bai resources: tmpdir=/tmp
[Fri Apr 12 16:38:47 2024] Finished job 0. 11 of 11 steps (100%) done Complete log: .snakemake/log/2024-04-12T163350.806415.snakemake.log
Phables ran successfully!
[2024:04:12 16:38:47] Snakemake finished successfully `
What's more, I tested real datasets (NCBI BioProject numbers PRJNA756429, PRJNA866269 and PRJNA434744), and still no Case 3 was resolved. I used Phables version 1.3.2 and my operating system is Linux. I don't think this is the expected behaviour, and I really want to solve this problem.
I downloaded simPhage.zip from https://zenodo.org/record/8137197 and it ran successfully. I used Bandage to compare the two assembly graphs and found that they are very similar. What caused my run to fail? Is it because I used metaSPAdes?
Hi @xtc2002,
Thanks for your interest in Phables.
Have you properly installed Gurobi and set up the academic license? Case 3 won't be resolved if Gurobi is not properly set up.
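One quick way to check the Gurobi setup is a short Python script like the sketch below. It assumes that the Python bindings (`gurobipy`) are what must be usable, and that the script is run inside the same conda environment that Phables uses for genome resolution. The function name `check_gurobi` is hypothetical and is not part of Phables or Gurobi. A successful run only shows that some license was accepted; a size-limited license may also pass this check, so it does not prove that the academic license is active.

```python
import importlib.util


def check_gurobi():
    """Report whether gurobipy is importable and whether a model can be created.

    Hypothetical helper for troubleshooting; not part of Phables or Gurobi.
    """
    status = {"installed": False, "version": None, "model_ok": False, "error": None}

    # Step 1: is the gurobipy package present in this environment at all?
    if importlib.util.find_spec("gurobipy") is None:
        return status

    try:
        import gurobipy as gp

        status["installed"] = True
        # gp.gurobi.version() returns a (major, minor, technical) tuple.
        status["version"] = ".".join(str(part) for part in gp.gurobi.version())

        # Step 2: creating a model starts a Gurobi environment,
        # which fails if no usable license can be found.
        model = gp.Model("license_check")
        model.dispose()
        status["model_ok"] = True
    except Exception as exc:  # usually gurobipy.GurobiError for license problems
        status["error"] = str(exc)

    return status


if __name__ == "__main__":
    print(check_gurobi())
```

If `installed` is false or `error` is set, fixing the Gurobi installation or license is probably the first thing to try.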
As for your simulated dataset, can you please share with me the assembly graph file, read data and the Phables log file (found in logs folder)? I can have a look.
Using metaSPAdes should not be the issue (you should use it, since it's a metagenome). Also, what values of k did you use for assembly?
Hi @Vini2 ,
I downloaded the 'simPhage' dataset from https://zenodo.org/record/8137197, and in that case Case 3 was resolved very well, so maybe it's not an issue with Gurobi.
Assembly graph and reads are here.
assembly_graph.zip
reads.zip
logs
logs.zip
It seems that metaSPAdes used k = 21, 33 and 55 and integrated their results.
What's more, I ran Phables with the assembly graph generated by hecatomb, and it didn't resolve Case 3 properly either.
Thanks!
Hi, I'm xtc2002. I'm sorry to bother you, but I'm really confused. I cannot upload files larger than 25 MB on GitHub, so I uploaded the files to the URL below. Let me describe the situation.
I downloaded the 'simPhage' dataset from https://zenodo.org/record/8137197 and ran Phables with assembly_graph_after_simplification.gfa and the reads provided with simPhage. Phables resolved Case 3 correctly (the results are saved in the phables.out folder). However, when I used hecatomb and metaSPAdes to produce assembly graphs and ran Phables on them, no Case 3 was resolved (the results are saved in the phables.outh and phables.outs folders, respectively). I also tried the 'Wastewater' dataset with its assembly graph (the reads are from NCBI), and again no Case 3 was resolved.
I tried creating a new conda environment and installing Phables and Gurobi again, and I also tried a fresh environment in a virtual machine, but the results were the same. What went wrong? I really want to fix it, and I would appreciate your help. Thanks!
http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=7c393765780029c83f40ff0e1362061f404d5354045a55001509525c514f50060c011a040153511d5a090206545157030a5d0707356d34435154670d5405511e4c58454b5218340d&code=897e5b40
Hi @xtc2002,
Thanks for sharing the data. I'll have a look and get back to you soon.
Can you please confirm the Gurobi version you are using?
Hi @Vini2, here is my Gurobi version information:
Gurobi Optimizer version 11.0.1 build v11.0.1rc0 (linux64 - "CentOS Linux 7 (Core)")
Copyright (c) 2024, Gurobi Optimization, LLC.
Thanks!
Hi @xtc2002,
Sorry for the delay in getting back to you.
Were you able to solve this issue?
I checked the data you shared with me, and the assembly is quite different from the original assembly I had. Both simulation and assembly can cause such differences: simulators can produce different sets of reads covering the genome, and assembly algorithms can be non-deterministic. The assembly you produced has a self-loop (a repeat) in one of the contigs, and there are no paired-end reads spanning across contigs (hence no subpaths). In such cases, Phables cannot resolve genomes.
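To look for self-loops like the one mentioned above in your own graph, a minimal sketch such as the following may help. It assumes a GFA1 file in which links are tab-separated `L` records; the function name `find_self_loops` and the tiny example graph are made up for illustration and are not part of Phables.

```python
def find_self_loops(gfa_lines):
    """Return the names of segments that have a link back to themselves.

    Hypothetical helper; expects GFA1 lines with tab-separated fields.
    """
    loops = set()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # GFA1 link record: L <from> <from_orient> <to> <to_orient> <overlap>
        # A link whose source and target segment are the same is flagged,
        # regardless of the orientations on either end.
        if len(fields) >= 5 and fields[0] == "L" and fields[1] == fields[3]:
            loops.add(fields[1])
    return sorted(loops)


# Made-up example graph: segment 2 links to itself.
example_gfa = [
    "S\t1\tACGT",
    "S\t2\tTTGA",
    "L\t1\t+\t2\t+\t3M",
    "L\t2\t+\t2\t+\t3M",
]
print(find_self_loops(example_gfa))
```

An open file handle can be passed in the same way, since the function only iterates over lines. This checks only for self-loops; it says nothing about whether paired-end reads span across contigs.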
Since you are simulating reads of length 300 bp, I would recommend using larger k-mer sizes for assembly, something like -k 21,33,55,77,99,127, so that some of those repeats can be resolved.
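As a rough sketch of that suggestion, the snippet below builds a metaSPAdes command line with the larger k-mer list. The read file names, output directory and the helper `metaspades_command` are placeholders, not taken from this thread. The checks reflect the SPAdes requirements that k values be odd, below 128 and listed in ascending order; the comparison against read length is an extra sanity check added here.

```python
import shlex


def metaspades_command(reads_1, reads_2, out_dir,
                       kmers=(21, 33, 55, 77, 99, 127),
                       read_length=300, threads=8):
    """Build (but do not run) a metaSPAdes command as a list of arguments.

    Hypothetical helper; file names and defaults are placeholders.
    """
    kmers = list(kmers)
    if kmers != sorted(set(kmers)):
        raise ValueError("k values must be unique and in ascending order")
    for k in kmers:
        if k % 2 == 0 or k >= 128:
            raise ValueError(f"k={k} must be odd and less than 128")
        if k >= read_length:
            raise ValueError(f"k={k} must be shorter than the read length")
    return [
        "spades.py", "--meta",
        "-1", reads_1,
        "-2", reads_2,
        "-k", ",".join(str(k) for k in kmers),
        "-t", str(threads),
        "-o", out_dir,
    ]


cmd = metaspades_command("reads_1.fq", "reads_2.fq", "metaspades_out")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` once the paths point at real files.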
Closing this issue for now. Please re-open if needed.