Sarek bcftools normalization
PR checklist
- [x] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [x] If you've added a new tool - have you followed the pipeline conventions in the contribution docs
- [ ] If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
- [ ] Make sure your code lints (`nf-core lint`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
- [ ] Output Documentation in `docs/output.md` is updated.
- [x] `CHANGELOG.md` is updated.
- [ ] `README.md` is updated (including new tool citations and authors/contributors).
Hi all,
I've modified the normalization step to include all VCFs, not just the germline ones. For this, I used the pull request from JC-Delmas as a base. I am aware that this still requires a lot of work, and I would greatly appreciate any advice or feedback you can provide.
Thank you!
Patricie
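For readers unfamiliar with what "normalization" does to a record: when given a reference, `bcftools norm` left-aligns and normalizes indels. The block below is a toy Python illustration of that left-align-and-trim idea for a single biallelic variant. It is a sketch only, not bcftools' implementation and not code from this PR; the function name `normalize_variant` and the example sequence are made up for illustration.

```python
def normalize_variant(ref_seq: str, pos: int, ref: str, alt: str) -> tuple[int, str, str]:
    """Left-align and trim one biallelic variant (toy illustration).

    ref_seq is the full sequence of the contig, pos is 1-based as in VCF,
    ref and alt are the allele strings of the record.
    """
    if ref == alt:
        raise ValueError("REF and ALT are identical; nothing to normalize")
    if ref_seq[pos - 1:pos - 1 + len(ref)] != ref:
        raise ValueError("REF does not match the reference sequence")

    # Step 1: drop shared trailing bases. If dropping would empty an allele,
    # first extend both alleles one base to the left using the reference.
    while ref[-1] == alt[-1]:
        if min(len(ref), len(alt)) == 1:
            if pos == 1:
                break  # cannot shift further left at the contig start
            pos -= 1
            left_base = ref_seq[pos - 1]
            ref, alt = left_base + ref, left_base + alt
        ref, alt = ref[:-1], alt[:-1]

    # Step 2: drop shared leading bases while both alleles keep at least one base.
    while min(len(ref), len(alt)) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt


if __name__ == "__main__":
    contig = "GGGCACACACAGGG"  # made-up sequence with a CA repeat
    # A "CA" deletion written in the middle of the repeat...
    print(normalize_variant(contig, 8, "CAC", "C"))
    # ...and an SNV padded with two unnecessary leading bases.
    print(normalize_variant(contig, 4, "CAC", "CAT"))
```

Different callers can report the same deletion at different positions inside a repeat, so bringing all VCFs to one representation makes their calls comparable downstream.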
nf-core pipelines lint overall result: Passed :white_check_mark: :warning:
Posted for pipeline commit 6c39a63
- ✅ 215 tests passed
- ❔ 11 tests were ignored
- ❗ 4 tests had warnings
:heavy_exclamation_mark: Test warnings:
- pipeline_todos - TODO string in `main.nf`: Optionally add in-text citation tools to this list.
- pipeline_todos - TODO string in `main.nf`: Optionally add bibliographic entries to this list.
- pipeline_todos - TODO string in `main.nf`: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
- pipeline_todos - TODO string in `base.config`: Check the defaults for all processes
:grey_question: Tests ignored:
- files_exist - File is ignored: `.github/workflows/awsfulltest.yml`
- files_exist - File is ignored: `.github/workflows/awstest.yml`
- files_exist - File is ignored: `conf/modules.config`
- files_unchanged - File ignored due to lint config: `assets/nf-core-sarek_logo_light.png`
- files_unchanged - File ignored due to lint config: `docs/images/nf-core-sarek_logo_light.png`
- files_unchanged - File ignored due to lint config: `docs/images/nf-core-sarek_logo_dark.png`
- files_unchanged - File ignored due to lint config: `.gitignore` or `.prettierignore`
- actions_ci - actions_ci
- actions_awstest - 'awstest.yml' workflow not found: `/home/runner/work/sarek/sarek/.github/workflows/awstest.yml`
- template_strings - template_strings
- modules_config - modules_config
:white_check_mark: Tests passed:
- files_exist - File found: `.gitattributes`
- files_exist - File found: `.gitignore`
- files_exist - File found: `.nf-core.yml`
- files_exist - File found: `.editorconfig`
- files_exist - File found: `.prettierignore`
- files_exist - File found: `.prettierrc.yml`
- files_exist - File found: `CHANGELOG.md`
- files_exist - File found: `CITATIONS.md`
- files_exist - File found: `CODE_OF_CONDUCT.md`
- files_exist - File found: `LICENSE` or `LICENSE.md` or `LICENCE` or `LICENCE.md`
- files_exist - File found: `nextflow_schema.json`
- files_exist - File found: `nextflow.config`
- files_exist - File found: `README.md`
- files_exist - File found: `.github/.dockstore.yml`
- files_exist - File found: `.github/CONTRIBUTING.md`
- files_exist - File found: `.github/ISSUE_TEMPLATE/bug_report.yml`
- files_exist - File found: `.github/ISSUE_TEMPLATE/config.yml`
- files_exist - File found: `.github/ISSUE_TEMPLATE/feature_request.yml`
- files_exist - File found: `.github/PULL_REQUEST_TEMPLATE.md`
- files_exist - File found: `.github/workflows/branch.yml`
- files_exist - File found: `.github/workflows/ci.yml`
- files_exist - File found: `.github/workflows/linting_comment.yml`
- files_exist - File found: `.github/workflows/linting.yml`
- files_exist - File found: `assets/email_template.html`
- files_exist - File found: `assets/email_template.txt`
- files_exist - File found: `assets/sendmail_template.txt`
- files_exist - File found: `assets/nf-core-sarek_logo_light.png`
- files_exist - File found: `conf/test.config`
- files_exist - File found: `conf/test_full.config`
- files_exist - File found: `docs/images/nf-core-sarek_logo_light.png`
- files_exist - File found: `docs/images/nf-core-sarek_logo_dark.png`
- files_exist - File found: `docs/output.md`
- files_exist - File found: `docs/README.md`
- files_exist - File found: `docs/README.md`
- files_exist - File found: `docs/usage.md`
- files_exist - File found: `main.nf`
- files_exist - File found: `assets/multiqc_config.yml`
- files_exist - File found: `conf/base.config`
- files_exist - File found: `conf/igenomes.config`
- files_exist - File found: `conf/igenomes_ignored.config`
- files_exist - File found: `modules.json`
- files_exist - File not found check: `.github/ISSUE_TEMPLATE/bug_report.md`
- files_exist - File not found check: `.github/ISSUE_TEMPLATE/feature_request.md`
- files_exist - File not found check: `.github/workflows/push_dockerhub.yml`
- files_exist - File not found check: `.markdownlint.yml`
- files_exist - File not found check: `.nf-core.yaml`
- files_exist - File not found check: `.yamllint.yml`
- files_exist - File not found check: `bin/markdown_to_html.r`
- files_exist - File not found check: `conf/aws.config`
- files_exist - File not found check: `docs/images/nf-core-sarek_logo.png`
- files_exist - File not found check: `lib/Checks.groovy`
- files_exist - File not found check: `lib/Completion.groovy`
- files_exist - File not found check: `lib/NfcoreTemplate.groovy`
- files_exist - File not found check: `lib/Utils.groovy`
- files_exist - File not found check: `lib/Workflow.groovy`
- files_exist - File not found check: `lib/WorkflowMain.groovy`
- files_exist - File not found check: `lib/WorkflowSarek.groovy`
- files_exist - File not found check: `parameters.settings.json`
- files_exist - File not found check: `pipeline_template.yml`
- files_exist - File not found check: `Singularity`
- files_exist - File not found check: `lib/nfcore_external_java_deps.jar`
- files_exist - File not found check: `.travis.yml`
- nextflow_config - Found nf-schema plugin
- nextflow_config - Config variable found: `manifest.name`
- nextflow_config - Config variable found: `manifest.nextflowVersion`
- nextflow_config - Config variable found: `manifest.description`
- nextflow_config - Config variable found: `manifest.version`
- nextflow_config - Config variable found: `manifest.homePage`
- nextflow_config - Config variable found: `timeline.enabled`
- nextflow_config - Config variable found: `trace.enabled`
- nextflow_config - Config variable found: `report.enabled`
- nextflow_config - Config variable found: `dag.enabled`
- nextflow_config - Config variable found: `process.cpus`
- nextflow_config - Config variable found: `process.memory`
- nextflow_config - Config variable found: `process.time`
- nextflow_config - Config variable found: `params.outdir`
- nextflow_config - Config variable found: `params.input`
- nextflow_config - Config variable found: `validation.help.enabled`
- nextflow_config - Config variable found: `manifest.mainScript`
- nextflow_config - Config variable found: `timeline.file`
- nextflow_config - Config variable found: `trace.file`
- nextflow_config - Config variable found: `report.file`
- nextflow_config - Config variable found: `dag.file`
- nextflow_config - Config variable found: `validation.help.beforeText`
- nextflow_config - Config variable found: `validation.help.afterText`
- nextflow_config - Config variable found: `validation.help.command`
- nextflow_config - Config variable found: `validation.summary.beforeText`
- nextflow_config - Config variable found: `validation.summary.afterText`
- nextflow_config - Config variable (correctly) not found: `params.nf_required_version`
- nextflow_config - Config variable (correctly) not found: `params.container`
- nextflow_config - Config variable (correctly) not found: `params.singleEnd`
- nextflow_config - Config variable (correctly) not found: `params.igenomesIgnore`
- nextflow_config - Config variable (correctly) not found: `params.name`
- nextflow_config - Config variable (correctly) not found: `params.enable_conda`
- nextflow_config - Config variable (correctly) not found: `params.max_cpus`
- nextflow_config - Config variable (correctly) not found: `params.max_memory`
- nextflow_config - Config variable (correctly) not found: `params.max_time`
- nextflow_config - Config variable (correctly) not found: `params.validationFailUnrecognisedParams`
- nextflow_config - Config variable (correctly) not found: `params.validationLenientMode`
- nextflow_config - Config variable (correctly) not found: `params.validationSchemaIgnoreParams`
- nextflow_config - Config variable (correctly) not found: `params.validationShowHiddenParams`
- nextflow_config - Config `timeline.enabled` had correct value: `true`
- nextflow_config - Config `report.enabled` had correct value: `true`
- nextflow_config - Config `trace.enabled` had correct value: `true`
- nextflow_config - Config `dag.enabled` had correct value: `true`
- nextflow_config - Config `manifest.name` began with `nf-core/`
- nextflow_config - Config variable `manifest.homePage` began with https://github.com/nf-core/
- nextflow_config - Config `dag.file` ended with `.html`
- nextflow_config - Config variable `manifest.nextflowVersion` started with >= or !>=
- nextflow_config - Config `manifest.version` ends in `dev`: `3.5.0dev`
- nextflow_config - Config `params.custom_config_version` is set to `master`
- nextflow_config - Config `params.custom_config_base` is set to `https://raw.githubusercontent.com/nf-core/configs/master`
- nextflow_config - Lines for loading custom profiles found
- nextflow_config - nextflow.config contains configuration profile `test`
- nextflow_config - Config default value correct: params.step= mapping
- nextflow_config - Config default value correct: params.split_fastq= 50000000
- nextflow_config - Config default value correct: params.nucleotides_per_second= 200000
- nextflow_config - Config default value correct: params.clip_r1= 0
- nextflow_config - Config default value correct: params.clip_r2= 0
- nextflow_config - Config default value correct: params.three_prime_clip_r1= 0
- nextflow_config - Config default value correct: params.three_prime_clip_r2= 0
- nextflow_config - Config default value correct: params.trim_nextseq= 0
- nextflow_config - Config default value correct: params.length_required= 15
- nextflow_config - Config default value correct: params.group_by_umi_strategy= Adjacency
- nextflow_config - Config default value correct: params.aligner= bwa-mem
- nextflow_config - Config default value correct: params.ascat_min_base_qual= 20
- nextflow_config - Config default value correct: params.ascat_min_counts= 10
- nextflow_config - Config default value correct: params.ascat_min_map_qual= 35
- nextflow_config - Config default value correct: params.cf_coeff= 0.05
- nextflow_config - Config default value correct: params.cf_contamination= 0
- nextflow_config - Config default value correct: params.cf_minqual= 0
- nextflow_config - Config default value correct: params.cf_mincov= 0
- nextflow_config - Config default value correct: params.cf_ploidy= 2
- nextflow_config - Config default value correct: params.sentieon_haplotyper_emit_mode= variant
- nextflow_config - Config default value correct: params.sentieon_dnascope_emit_mode= variant
- nextflow_config - Config default value correct: params.sentieon_dnascope_pcr_indel_model= CONSERVATIVE
- nextflow_config - Config default value correct: params.dbnsfp_fields= rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF
- nextflow_config - Config default value correct: params.vep_custom_args= --everything --filter_common --per_gene --total_length --offline --format vcf
- nextflow_config - Config default value correct: params.vep_version= 111.0-0
- nextflow_config - Config default value correct: params.vep_out_format= vcf
- nextflow_config - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/
- nextflow_config - Config default value correct: params.genome= GATK.GRCh38
- nextflow_config - Config default value correct: params.snpeff_cache= s3://annotation-cache/snpeff_cache/
- nextflow_config - Config default value correct: params.vep_cache= s3://annotation-cache/vep_cache/
- nextflow_config - Config default value correct: params.custom_config_version= master
- nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
- nextflow_config - Config default value correct: params.test_data_base= https://raw.githubusercontent.com/nf-core/test-datasets/sarek3
- nextflow_config - Config default value correct: params.seq_platform= ILLUMINA
- nextflow_config - Config default value correct: params.publish_dir_mode= copy
- nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
- nextflow_config - Config default value correct: params.validate_params= true
- nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
- files_unchanged - `.gitattributes` matches the template
- files_unchanged - `.prettierrc.yml` matches the template
- files_unchanged - `CODE_OF_CONDUCT.md` matches the template
- files_unchanged - `LICENSE` matches the template
- files_unchanged - `.github/.dockstore.yml` matches the template
- files_unchanged - `.github/CONTRIBUTING.md` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/bug_report.yml` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/config.yml` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/feature_request.yml` matches the template
- files_unchanged - `.github/PULL_REQUEST_TEMPLATE.md` matches the template
- files_unchanged - `.github/workflows/branch.yml` matches the template
- files_unchanged - `.github/workflows/linting_comment.yml` matches the template
- files_unchanged - `.github/workflows/linting.yml` matches the template
- files_unchanged - `assets/email_template.html` matches the template
- files_unchanged - `assets/email_template.txt` matches the template
- files_unchanged - `assets/sendmail_template.txt` matches the template
- files_unchanged - `docs/README.md` matches the template
- readme - README Nextflow minimum version badge matched config. Badge: `24.04.2`, Config: `24.04.2`
- readme - README Zenodo placeholder was replaced with DOI.
- plugin_includes - No wrong validation plugin imports have been found
- pipeline_name_conventions - Name adheres to nf-core convention
- schema_lint - Schema lint passed
- schema_lint - Schema title + description lint passed
- schema_lint - Input mimetype lint passed: 'text/csv'
- schema_params - Schema matched params returned from nextflow config
- system_exit - No `System.exit` calls found
- actions_schema_validation - Workflow validation passed: linting.yml
- actions_schema_validation - Workflow validation passed: branch.yml
- actions_schema_validation - Workflow validation passed: fix-linting.yml
- actions_schema_validation - Workflow validation passed: release-announcements.yml
- actions_schema_validation - Workflow validation passed: template_version_comment.yml
- actions_schema_validation - Workflow validation passed: download_pipeline.yml
- actions_schema_validation - Workflow validation passed: ncbench.yml
- actions_schema_validation - Workflow validation passed: ci.yml
- actions_schema_validation - Workflow validation passed: cloudtest.yml
- actions_schema_validation - Workflow validation passed: pytest.yml
- actions_schema_validation - Workflow validation passed: clean-up.yml
- actions_schema_validation - Workflow validation passed: linting_comment.yml
- merge_markers - No merge markers found in pipeline files
- modules_json - Only installed modules found in `modules.json`
- multiqc_config - `assets/multiqc_config.yml` found and not ignored.
- multiqc_config - `assets/multiqc_config.yml` contains `report_section_order`
- multiqc_config - `assets/multiqc_config.yml` contains `export_plots`
- multiqc_config - `assets/multiqc_config.yml` contains `report_comment`
- multiqc_config - `assets/multiqc_config.yml` follows the ordering scheme of the minimally required plugins.
- multiqc_config - `assets/multiqc_config.yml` contains a matching 'report_comment'.
- multiqc_config - `assets/multiqc_config.yml` contains 'export_plots: true'.
- modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
- base_config - `conf/base.config` found and not ignored.
- base_config - `UNZIP` found in `conf/base.config` and Nextflow scripts.
- base_config - `FASTQC` found in `conf/base.config` and Nextflow scripts.
- base_config - `FASTP` found in `conf/base.config` and Nextflow scripts.
- base_config - `BWAMEM1_MEM` found in `conf/base.config` and Nextflow scripts.
- base_config - `CNVKIT_BATCH` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_MARKDUPLICATES` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_APPLYBQSR` found in `conf/base.config` and Nextflow scripts.
- base_config - `MOSDEPTH` found in `conf/base.config` and Nextflow scripts.
- base_config - `STRELKA` found in `conf/base.config` and Nextflow scripts.
- base_config - `SAMTOOLS_CONVERT` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_MERGEVCFS` found in `conf/base.config` and Nextflow scripts.
- base_config - `MULTIQC` found in `conf/base.config` and Nextflow scripts.
- nfcore_yml - Repository type in `.nf-core.yml` is valid: `pipeline`
- nfcore_yml - nf-core version in `.nf-core.yml` is set to the latest version: `3.0.2`
Run details
- nf-core/tools version 3.0.2
- Run at `2024-12-04 06:57:12`
@nf-core-bot fix linting pretty please :pray:
We're missing CHANGELOG + tests + subway map
@nf-core-bot fix linting pretty please :pray:
Can you add this in sarek/tests/config/pytesttags.yml after the concatenate_vcfs trigger?
normalize_vcfs:
- conf/modules/post_variant_calling.config
- modules/nf-core/bcftools/concat/**
- modules/nf-core/bcftools/mpileup/**
- modules/nf-core/bcftools/norm/**
- modules/nf-core/bcftools/sort/**
- modules/nf-core/deepvariant/**
- modules/nf-core/freebayes/**
- modules/nf-core/gatk4/haplotypecaller/**
- modules/nf-core/gatk4/mergevcfs/**
- modules/nf-core/manta/germline/**
- modules/nf-core/samtools/mpileup/**
- modules/nf-core/strelka/germline/**
- modules/nf-core/tabix/bgziptabix/**
- modules/nf-core/tabix/tabix/**
- modules/nf-core/tiddit/sv/**
- subworkflows/local/bam_variant_calling_deepvariant/**
- subworkflows/local/bam_variant_calling_freebayes/**
- subworkflows/local/bam_variant_calling_germline_all/**
- subworkflows/local/bam_variant_calling_germline_manta/**
- subworkflows/local/bam_variant_calling_haplotypecaller/**
- subworkflows/local/bam_variant_calling_mpileup/**
- subworkflows/local/bam_variant_calling_single_strelka/**
- subworkflows/local/bam_variant_calling_single_tiddit/**
- subworkflows/local/bam_variant_calling_somatic_all/**
- subworkflows/local/bam_variant_calling_tumor_only_all/**
- subworkflows/local/post_variantcalling/**
- subworkflows/local/vcf_concatenate_germline/**
- tests/csv/3.0/mapped_joint_bam.csv
- tests/test_normalize_vcfs.yml
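Before committing a tag entry like the one above, it can help to confirm that each path pattern still points at something in the repository, since a renamed module or subworkflow would silently stop triggering the test. The following is a minimal Python sketch under stated assumptions: the helper names `parse_tag_entry` and `unmatched_patterns` are hypothetical, the parser only understands the simple `tag:` plus `- pattern` layout shown here (it is not a YAML parser), and Python's `pathlib` globbing may not match the CI path filter's semantics exactly.

```python
from pathlib import Path

# Shortened example entry; in practice, paste the full block from pytesttags.yml.
ENTRY = """
normalize_vcfs:
  - modules/nf-core/bcftools/norm/**
  - modules/nf-core/bcftools/sort/**
  - tests/test_normalize_vcfs.yml
"""


def parse_tag_entry(text):
    """Return (tag, patterns) for one 'tag:' block followed by '- pattern' lines."""
    tag, patterns = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("- "):
            patterns.append(line[2:].strip())
        elif line.endswith(":"):
            tag = line[:-1]
    return tag, patterns


def unmatched_patterns(repo_root, patterns):
    """Return the patterns that match no path under repo_root."""
    root = Path(repo_root)
    return [p for p in patterns if not any(root.glob(p))]


if __name__ == "__main__":
    tag, patterns = parse_tag_entry(ENTRY)
    # Assumes the script is run from the root of a pipeline checkout.
    for pattern in unmatched_patterns(".", patterns):
        print(f"{tag}: no path matches {pattern}")
```

An empty result only shows that every pattern matches at least one path; it says nothing about whether the list of triggers is complete.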
Issues we still need to assess:
- Why do we output vcfs_tbi from the concatenate subworkflow when we only need the vcf for vcftools, and we don't seem to remove the tbi anywhere? I think we probably need to output just the vcf from there, or keep the tbi somewhere and map it out for the downstream processes. I have little idea why it's not failing.
- We need a variant caller id from concatenate as well.
- I'm guessing we might need to output vcfs = VCFS_NORM_SORT.out.vcf from the normalization subworkflows and something similar from the concatenate one.
[!WARNING] Newer version of the nf-core template is available.
Your pipeline is using an old version of the nf-core template: 3.0.2. Please update your pipeline to the latest version.
For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
@Patricie34 sorry, I forgot to merge in https://github.com/nf-core/sarek/pull/1760 (which is now done).
Can you sync your branch once more, and move your PR up in the CHANGELOG?
We'll be merging into dev_normalization so we can easily test the logic and figure everything out before finally merging into dev.