
Add taxonomic analysis and human read removal workflow

Open • PlushZ opened this issue 1 year ago • 1 comment

@bebatut @wm75 This is a workflow for Taxonomic Analysis of SARS-CoV-2 Wastewater Samples with Human Read Removal. It would be the first part of a variant analysis of metagenomic data.

PlushZ, Apr 18 '23 02:04

Test Results (powered by Planemo)

Test Summary

Test State    Count
Total         1
Passed        0
Error         1
Failure       0
Skipped       0
Errored Tests
  • ❌ Taxonomic-Analysis-of-SARS-CoV-2-Wastewater-Samples-with-Human-Read-Removal.ga_0

    Execution Problem:

    • Failed to run workflow, at least one job is in [error] state.
      

    Workflow invocation details

    • Invocation Messages

    • Steps
      • Step 1: SARS-CoV-2 reference genome:

        • step_state: scheduled
      • Step 2: Paired Collection:

        • step_state: scheduled
      • Step 3: toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.20.1+galaxy0:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • ln -s '/tmp/tmpvp1vnw0g/files/d/3/0/dataset_d30a72ce-4a42-4ef0-8b6c-48da70a66eec.dat' 'SRR12596170_fastq.fastq.gz' && ln -s '/tmp/tmpvp1vnw0g/files/4/5/7/dataset_4576cabe-e687-45d1-bb99-c0f24069cb39.dat' 'SRR12596170_fastq_R2.fastq.gz' &&    fastp  --thread ${GALAXY_SLOTS:-1} --report_title 'fastp report for SRR12596170_fastq.fastq.gz'   -i 'SRR12596170_fastq.fastq.gz' -o first.fastq.gz  -I 'SRR12596170_fastq_R2.fastq.gz' -O second.fastq.gz       --detect_adapter_for_pe                                          &&  mv first.fastq.gz '/tmp/tmpvp1vnw0g/job_working_directory/000/6/outputs/dataset_9b59d290-9dee-469a-8932-9966a0a4d674.dat' && mv second.fastq.gz '/tmp/tmpvp1vnw0g/job_working_directory/000/6/outputs/dataset_91e7a8e9-be16-4bb8-9811-c2fbacf66e9a.dat'
              

            Exit Code:

            • 0
              

            Standard Error:

            • Detecting adapter sequence for read1...
              No adapter detected for read1
              
              Detecting adapter sequence for read2...
              No adapter detected for read2
              
              Read1 before filtering:
              total reads: 167374
              total bases: 12425652
              Q20 bases: 11614277(93.4702%)
              Q30 bases: 11368920(91.4956%)
              
              Read2 before filtering:
              total reads: 167374
              total bases: 12265824
              Q20 bases: 11350710(92.5393%)
              Q30 bases: 11085935(90.3807%)
              
              Read1 after filtering:
              total reads: 155144
              total bases: 11496170
              Q20 bases: 10977759(95.4906%)
              Q30 bases: 10795701(93.9069%)
              
              Read2 aftering filtering:
              total reads: 155144
              total bases: 11348252
              Q20 bases: 10851896(95.6261%)
              Q30 bases: 10659366(93.9296%)
              
              Filtering result:
              reads passed filter: 310288
              reads failed due to low quality: 24088
              reads failed due to too many N: 372
              reads failed due to too short: 0
              reads with adapter trimmed: 1800
              bases trimmed due to adapters: 37303
              
              Duplication rate: 12.3851%
              
              Insert size peak (evaluated by paired-end reads): 116
              
              JSON report: fastp.json
              HTML report: fastp.html
              
              fastp --thread 1 --report_title fastp report for SRR12596170_fastq.fastq.gz -i SRR12596170_fastq.fastq.gz -o first.fastq.gz -I SRR12596170_fastq_R2.fastq.gz -O second.fastq.gz --detect_adapter_for_pe 
              fastp v0.20.1, time used: 7 seconds
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "fastqsanger.gz"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              filter_options {"length_filtering_options": {"disable_length_filtering": false, "length_limit": null, "length_required": null}, "low_complexity_filter": {"complexity_threshold": null, "enable_low_complexity_filter": false}, "quality_filtering_options": {"disable_quality_filtering": false, "n_base_limit": null, "qualified_quality_phred": null, "unqualified_percent_limit": null}}
              output_options {"report_html": true, "report_json": true}
              overrepresented_sequence_analysis {"overrepresentation_analysis": false, "overrepresentation_sampling": null}
              read_mod_options {"base_correction_options": {"correction": false}, "cutting_by_quality_options": {"cut_by_quality3": false, "cut_by_quality5": false, "cut_mean_quality": null, "cut_window_size": null}, "polyg_tail_trimming": {"__current_case__": 1, "poly_g_min_len": null, "trimming_select": ""}, "polyx_tail_trimming": {"__current_case__": 1, "polyx_trimming_select": ""}, "umi_processing": {"umi": false, "umi_len": null, "umi_loc": "", "umi_prefix": ""}}
              single_paired {"__current_case__": 2, "adapter_trimming_options": {"adapter_sequence1": "", "adapter_sequence2": "", "disable_adapter_trimming": false}, "global_trimming_options": {"trim_front1": null, "trim_front2": null, "trim_tail1": null, "trim_tail2": null}, "paired_input": {"values": [{"id": 1, "src": "dce"}]}, "single_paired_selector": "paired_collection"}
          • Job 2:

            • Job state is ok

            Command Line:

            • ln -s '/tmp/tmpvp1vnw0g/files/5/c/4/dataset_5c41e464-70bf-4ca3-ab35-b684ecf945fc.dat' 'SRR12596172_fastq.fastq.gz' && ln -s '/tmp/tmpvp1vnw0g/files/f/3/c/dataset_f3c2d737-1ea5-4bbe-a418-86cfe41c8882.dat' 'SRR12596172_fastq_R2.fastq.gz' &&    fastp  --thread ${GALAXY_SLOTS:-1} --report_title 'fastp report for SRR12596172_fastq.fastq.gz'   -i 'SRR12596172_fastq.fastq.gz' -o first.fastq.gz  -I 'SRR12596172_fastq_R2.fastq.gz' -O second.fastq.gz       --detect_adapter_for_pe                                          &&  mv first.fastq.gz '/tmp/tmpvp1vnw0g/job_working_directory/000/7/outputs/dataset_00406541-0045-4332-bc59-40fba66ebfb7.dat' && mv second.fastq.gz '/tmp/tmpvp1vnw0g/job_working_directory/000/7/outputs/dataset_28b581d3-871a-40f8-93dc-7ccd60fe556e.dat'
              

            Exit Code:

            • 0
              

            Standard Error:

            • Detecting adapter sequence for read1...
              No adapter detected for read1
              
              Detecting adapter sequence for read2...
              No adapter detected for read2
              
              Read1 before filtering:
              total reads: 167374
              total bases: 12425652
              Q20 bases: 11614277(93.4702%)
              Q30 bases: 11368920(91.4956%)
              
              Read2 before filtering:
              total reads: 167374
              total bases: 12265824
              Q20 bases: 11350710(92.5393%)
              Q30 bases: 11085935(90.3807%)
              
              Read1 after filtering:
              total reads: 155144
              total bases: 11496170
              Q20 bases: 10977759(95.4906%)
              Q30 bases: 10795701(93.9069%)
              
              Read2 aftering filtering:
              total reads: 155144
              total bases: 11348252
              Q20 bases: 10851896(95.6261%)
              Q30 bases: 10659366(93.9296%)
              
              Filtering result:
              reads passed filter: 310288
              reads failed due to low quality: 24088
              reads failed due to too many N: 372
              reads failed due to too short: 0
              reads with adapter trimmed: 1800
              bases trimmed due to adapters: 37303
              
              Duplication rate: 12.3851%
              
              Insert size peak (evaluated by paired-end reads): 116
              
              JSON report: fastp.json
              HTML report: fastp.html
              
              fastp --thread 1 --report_title fastp report for SRR12596172_fastq.fastq.gz -i SRR12596172_fastq.fastq.gz -o first.fastq.gz -I SRR12596172_fastq_R2.fastq.gz -O second.fastq.gz --detect_adapter_for_pe 
              fastp v0.20.1, time used: 7 seconds
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "fastqsanger.gz"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              filter_options {"length_filtering_options": {"disable_length_filtering": false, "length_limit": null, "length_required": null}, "low_complexity_filter": {"complexity_threshold": null, "enable_low_complexity_filter": false}, "quality_filtering_options": {"disable_quality_filtering": false, "n_base_limit": null, "qualified_quality_phred": null, "unqualified_percent_limit": null}}
              output_options {"report_html": true, "report_json": true}
              overrepresented_sequence_analysis {"overrepresentation_analysis": false, "overrepresentation_sampling": null}
              read_mod_options {"base_correction_options": {"correction": false}, "cutting_by_quality_options": {"cut_by_quality3": false, "cut_by_quality5": false, "cut_mean_quality": null, "cut_window_size": null}, "polyg_tail_trimming": {"__current_case__": 1, "poly_g_min_len": null, "trimming_select": ""}, "polyx_tail_trimming": {"__current_case__": 1, "polyx_trimming_select": ""}, "umi_processing": {"umi": false, "umi_len": null, "umi_loc": "", "umi_prefix": ""}}
              single_paired {"__current_case__": 2, "adapter_trimming_options": {"adapter_sequence1": "", "adapter_sequence2": "", "disable_adapter_trimming": false}, "global_trimming_options": {"trim_front1": null, "trim_front2": null, "trim_tail1": null, "trim_tail2": null}, "paired_input": {"values": [{"id": 4, "src": "dce"}]}, "single_paired_selector": "paired_collection"}
      • Step 4: toolshed.g2.bx.psu.edu/repos/iuc/kraken2/kraken2/2.1.1+galaxy1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • kraken2 --threads ${GALAXY_SLOTS:-1} --db '/cvmfs/data.galaxyproject.org/managed/kraken2_databases/kraken2_viral_db'    --paired '/tmp/tmpvp1vnw0g/files/d/3/0/dataset_d30a72ce-4a42-4ef0-8b6c-48da70a66eec.dat' '/tmp/tmpvp1vnw0g/files/4/5/7/dataset_4576cabe-e687-45d1-bb99-c0f24069cb39.dat'   --confidence '0.0' --minimum-base-quality '0' --minimum-hit-groups '2'    --report '/tmp/tmpvp1vnw0g/job_working_directory/000/8/outputs/dataset_bd438a34-cc3d-44c8-9c3d-ac96c62dec9b.dat'     > '/tmp/tmpvp1vnw0g/job_working_directory/000/8/outputs/dataset_75337e88-8056-4f07-97ce-0342825854d6.dat'
              

            Exit Code:

            • 0
              

            Standard Error:

            • Loading database information... done.
              167374 sequences (24.69 Mbp) processed in 1.468s (6841.7 Kseq/m, 1009.31 Mbp/m).
                5662 sequences classified (3.38%)
                161712 sequences unclassified (96.62%)
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "fastqsanger.gz"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              confidence "0.0"
              dbkey "?"
              kraken2_database "viral2019-03"
              min_base_quality "0"
              minimum_hit_groups "2"
              quick false
              report {"create_report": true, "report_minimizer_data": false, "report_zero_counts": false, "use_mpa_style": false}
              single_paired {"__current_case__": 0, "input_pair": {"values": [{"id": 1, "src": "dce"}]}, "single_paired_selector": "collection"}
              split_reads false
              use_names false
          • Job 2:

            • Job state is ok

            Command Line:

            • kraken2 --threads ${GALAXY_SLOTS:-1} --db '/cvmfs/data.galaxyproject.org/managed/kraken2_databases/kraken2_viral_db'    --paired '/tmp/tmpvp1vnw0g/files/5/c/4/dataset_5c41e464-70bf-4ca3-ab35-b684ecf945fc.dat' '/tmp/tmpvp1vnw0g/files/f/3/c/dataset_f3c2d737-1ea5-4bbe-a418-86cfe41c8882.dat'   --confidence '0.0' --minimum-base-quality '0' --minimum-hit-groups '2'    --report '/tmp/tmpvp1vnw0g/job_working_directory/000/9/outputs/dataset_87d986a5-c189-45a3-a5a0-5cfd386b4bf5.dat'     > '/tmp/tmpvp1vnw0g/job_working_directory/000/9/outputs/dataset_26b4a608-3368-413e-a783-758b86df10cc.dat'
              

            Exit Code:

            • 0
              

            Standard Error:

            • Loading database information... done.
              167374 sequences (24.69 Mbp) processed in 1.475s (6806.6 Kseq/m, 1004.13 Mbp/m).
                5662 sequences classified (3.38%)
                161712 sequences unclassified (96.62%)
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "fastqsanger.gz"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              confidence "0.0"
              dbkey "?"
              kraken2_database "viral2019-03"
              min_base_quality "0"
              minimum_hit_groups "2"
              quick false
              report {"create_report": true, "report_minimizer_data": false, "report_zero_counts": false, "use_mpa_style": false}
              single_paired {"__current_case__": 0, "input_pair": {"values": [{"id": 4, "src": "dce"}]}, "single_paired_selector": "collection"}
              split_reads false
              use_names false
      • Step 5: toolshed.g2.bx.psu.edu/repos/iuc/read_it_and_keep/read_it_and_keep/0.2.2+galaxy0:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • ln -s '/tmp/tmpvp1vnw0g/files/7/e/7/dataset_7e74610b-a616-4717-9b1c-dba7a2ba6168.dat' ref_untrimmed.fasta && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/read_it_and_keep/1563b58905f4/read_it_and_keep/trim_reference.py' ref_untrimmed.fasta ref.fasta && ln -s '/tmp/tmpvp1vnw0g/files/9/b/5/dataset_9b59d290-9dee-469a-8932-9966a0a4d674.dat' read1 && ln -s '/tmp/tmpvp1vnw0g/files/9/1/e/dataset_91e7a8e9-be16-4bb8-9811-c2fbacf66e9a.dat' read2 && readItAndKeep --tech illumina --ref_fasta ref.fasta --min_map_length 50 --min_map_length_pc 50.0  --reads1 read1 --reads2 read2 -o output
              

            Exit Code:

            • 0
              

            Standard Error:

            • Processed 100000 reads (or read pairs)
              
              

            Standard Output:

            • Input reads file 1	155144
              Input reads file 2	155144
              Kept reads 1	24959
              Kept reads 2	24959
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              adv {"enumerate_names": false, "min_map_length": "50", "min_map_length_pc": "50.0"}
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              reads {"__current_case__": 1, "paired_reads": {"values": [{"id": 7, "src": "dce"}]}, "read_type": "paired_collection"}
              ref_source {"__current_case__": 0, "ref_fasta": {"values": [{"id": 1, "src": "hda"}]}, "source": "history"}
              sequencing_tech "illumina"
              trim_reference true
          • Job 2:

            • Job state is ok

            Command Line:

            • ln -s '/tmp/tmpvp1vnw0g/files/7/e/7/dataset_7e74610b-a616-4717-9b1c-dba7a2ba6168.dat' ref_untrimmed.fasta && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/read_it_and_keep/1563b58905f4/read_it_and_keep/trim_reference.py' ref_untrimmed.fasta ref.fasta && ln -s '/tmp/tmpvp1vnw0g/files/0/0/4/dataset_00406541-0045-4332-bc59-40fba66ebfb7.dat' read1 && ln -s '/tmp/tmpvp1vnw0g/files/2/8/b/dataset_28b581d3-871a-40f8-93dc-7ccd60fe556e.dat' read2 && readItAndKeep --tech illumina --ref_fasta ref.fasta --min_map_length 50 --min_map_length_pc 50.0  --reads1 read1 --reads2 read2 -o output
              

            Exit Code:

            • 0
              

            Standard Error:

            • Processed 100000 reads (or read pairs)
              
              

            Standard Output:

            • Input reads file 1	155144
              Input reads file 2	155144
              Kept reads 1	24959
              Kept reads 2	24959
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              adv {"enumerate_names": false, "min_map_length": "50", "min_map_length_pc": "50.0"}
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              reads {"__current_case__": 1, "paired_reads": {"values": [{"id": 8, "src": "dce"}]}, "read_type": "paired_collection"}
              ref_source {"__current_case__": 0, "ref_fasta": {"values": [{"id": 1, "src": "hda"}]}, "source": "history"}
              sequencing_tech "illumina"
              trim_reference true
      • Step 6: toolshed.g2.bx.psu.edu/repos/devteam/kraken2tax/Kraken2Tax/1.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is running

            Command Line:

            • awk '{ print $2, $3 }' OFS="\t" "/tmp/tmpvp1vnw0g/files/b/d/4/dataset_bd438a34-cc3d-44c8-9c3d-ac96c62dec9b.dat" | taxonomy-reader "/cvmfs/data.galaxyproject.org/managed/ncbi_taxonomy/ncbi-2015-10-05/names.dmp" "/cvmfs/data.galaxyproject.org/managed/ncbi_taxonomy/ncbi-2015-10-05/nodes.dmp" 1 > "/tmp/tmpvp1vnw0g/job_working_directory/000/12/outputs/dataset_61c4bc80-cc86-41a5-98f3-7577945e8bf7.dat"
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              ncbi_taxonomy "ncbi-2015-10-05"
              read_name "2"
              tax_id 5bc3b5997ae36e38
          • Job 2:

            • Job state is error

            Command Line:

            • awk '{ print $2, $3 }' OFS="\t" "/tmp/tmpvp1vnw0g/files/8/7/d/dataset_87d986a5-c189-45a3-a5a0-5cfd386b4bf5.dat" | taxonomy-reader "/cvmfs/data.galaxyproject.org/managed/ncbi_taxonomy/ncbi-2015-10-05/names.dmp" "/cvmfs/data.galaxyproject.org/managed/ncbi_taxonomy/ncbi-2015-10-05/nodes.dmp" 1 > "/tmp/tmpvp1vnw0g/job_working_directory/000/13/outputs/dataset_11ecc157-8193-468d-a152-8f217a33da1b.dat"
              

            Exit Code:

            • 127
              

            Standard Error:

            • /tmp/tmpvp1vnw0g/job_working_directory/000/13/tool_script.sh: line 9: taxonomy-reader: command not found
              
              

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              ncbi_taxonomy "ncbi-2015-10-05"
              read_name "2"
              tax_id 5bc3b5997ae36e38
      • Step 7: toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is paused

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "taxonomy"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              combine_inputs false
              dbkey "?"
              root_name "Root"
              type_of_data {"__current_case__": 0, "input": {"values": [{"id": 8, "src": "hdca"}]}, "max_rank": "8", "type_of_data_selector": "taxonomy"}
      • Step 8: toolshed.g2.bx.psu.edu/repos/crs4/taxonomy_krona_chart/taxonomy_krona_chart/2.7.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is paused

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "taxonomy"
              __workflow_invocation_uuid__ "4dc0b42fdb4311eeaad6fbf0711a9142"
              chromInfo "/tmp/tmpvp1vnw0g/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              combine_inputs true
              dbkey "?"
              root_name "Root"
              type_of_data {"__current_case__": 0, "input": {"values": [{"id": 8, "src": "hdca"}]}, "max_rank": "8", "type_of_data_selector": "taxonomy"}
    • Other invocation details
      • error_message

        • Failed to run workflow, at least one job is in [error] state.
      • history_id

        • b718e6261924da2c
      • history_state

        • error
      • invocation_id

        • b718e6261924da2c
      • invocation_state

        • scheduled
      • workflow_id

        • b718e6261924da2c
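
The report above is long, so it can help to tally the job states before reading individual jobs. The following is a minimal sketch, assuming the report is available as plain text containing lines of the form "Job state is <state>" as shown above; the function name `tally_job_states` and the excerpt variable are illustrative and not part of Planemo.

```python
import re
from collections import Counter

# Matches the per-job state lines as they appear in the report text.
JOB_STATE = re.compile(r"Job state is (\w+)")


def tally_job_states(report_text):
    """Count how often each job state appears in the report text."""
    return Counter(JOB_STATE.findall(report_text))


# State lines copied from the Kraken2Tax and Krona steps of the report.
excerpt = """
• Job state is running
• Job state is error
• Job state is paused
• Job state is paused
"""

if __name__ == "__main__":
    for state, count in sorted(tally_job_states(excerpt).items()):
        print(f"{state}: {count}")
```

Run over the full report, a tally like this points straight at the Kraken2Tax step, which holds the only job in the error state; the paused Krona jobs are downstream of it.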

github-actions[bot], Mar 05 '24 23:03
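
A note on the failing job: exit code 127 is the status a POSIX shell conventionally returns when a command cannot be found, which matches the "taxonomy-reader: command not found" line in the Kraken2Tax standard error. One plausible cause, not confirmed anywhere in this report, is that the dependency providing `taxonomy-reader` was not resolved in the test environment. A minimal sketch for checking this from inside such an environment, using only the Python standard library (the helper name `missing_executables` is hypothetical):

```python
import shutil


def missing_executables(names, path=None):
    """Return the names that cannot be resolved to an executable.

    `path` is an optional search path string; when None, the PATH
    environment variable of the current process is used.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # The failing command line pipes awk into taxonomy-reader.
    for name in missing_executables(["awk", "taxonomy-reader"]):
        print(f"not on PATH: {name}")
```

If `taxonomy-reader` is reported as missing, the problem lies in the tool's dependency setup rather than in the workflow's inputs or parameters.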