Sarek bcftools normalization

Open Patricie34 opened this issue 1 year ago • 6 comments

PR checklist

  • [x] This comment contains a description of changes (with reason).
  • [ ] If you've fixed a bug or added code that should be tested, add tests!
  • [x] If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • [ ] If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • [ ] Make sure your code lints (nf-core lint).
  • [ ] Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • [ ] Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • [ ] Usage Documentation in docs/usage.md is updated.
  • [ ] Output Documentation in docs/output.md is updated.
  • [x] CHANGELOG.md is updated.
  • [ ] README.md is updated (including new tool citations and authors/contributors).

Patricie34 avatar Oct 09 '24 17:10 Patricie34

Hi all,

I've modified the normalization step to include all VCFs, not just the germline ones. For this, I used the pull request from JC-Delmas as a base. I am aware that this still requires a lot of work, and I would greatly appreciate any advice or feedback you can provide.

Thank you!

Patricie

Patricie34 avatar Oct 09 '24 17:10 Patricie34

nf-core pipelines lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit 6c39a63

+| ✅ 215 tests passed       |+
#| ❔  11 tests were ignored |#
!| ❗   4 tests had warnings |!

:heavy_exclamation_mark: Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-12-04 06:57:12

github-actions[bot] avatar Oct 10 '24 06:10 github-actions[bot]

@nf-core-bot fix linting pretty please :pray:

maxulysse avatar Oct 10 '24 06:10 maxulysse

We're missing CHANGELOG + tests + subway map

maxulysse avatar Oct 10 '24 06:10 maxulysse

@nf-core-bot fix linting pretty please :pray:

maxulysse avatar Oct 11 '24 10:10 maxulysse

Can you add this in sarek/tests/config/pytesttags.yml after the concatenate_vcfs trigger?

normalize_vcfs:
  - conf/modules/post_variant_calling.config
  - modules/nf-core/bcftools/concat/**
  - modules/nf-core/bcftools/mpileup/**
  - modules/nf-core/bcftools/norm/**
  - modules/nf-core/bcftools/sort/**
  - modules/nf-core/deepvariant/**
  - modules/nf-core/freebayes/**
  - modules/nf-core/gatk4/haplotypecaller/**
  - modules/nf-core/gatk4/mergevcfs/**
  - modules/nf-core/manta/germline/**
  - modules/nf-core/samtools/mpileup/**
  - modules/nf-core/strelka/germline/**
  - modules/nf-core/tabix/bgziptabix/**
  - modules/nf-core/tabix/tabix/**
  - modules/nf-core/tiddit/sv/**
  - subworkflows/local/bam_variant_calling_deepvariant/**
  - subworkflows/local/bam_variant_calling_freebayes/**
  - subworkflows/local/bam_variant_calling_germline_all/**
  - subworkflows/local/bam_variant_calling_germline_manta/**
  - subworkflows/local/bam_variant_calling_haplotypecaller/**
  - subworkflows/local/bam_variant_calling_mpileup/**
  - subworkflows/local/bam_variant_calling_single_strelka/**
  - subworkflows/local/bam_variant_calling_single_tiddit/**
  - subworkflows/local/bam_variant_calling_somatic_all/**
  - subworkflows/local/bam_variant_calling_tumor_only_all/**
  - subworkflows/local/post_variantcalling/**
  - subworkflows/local/vcf_concatenate_germline/**
  - tests/csv/3.0/mapped_joint_bam.csv
  - tests/test_normalize_vcfs.yml
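
For anyone unsure what this entry does: as far as I understand it, each key in that file is a test tag and the globs under it are the paths whose changes should trigger that tag in CI. Below is a rough Python sketch of that idea, not the actual CI logic. The `TAGS` excerpt and the `triggered_tags` helper are made up for illustration, and `fnmatchcase` only approximates whatever glob matcher the workflow really uses.

```python
from fnmatch import fnmatchcase

# Hypothetical excerpt of the tag-to-globs mapping; the real entry lists many more paths.
TAGS = {
    "normalize_vcfs": [
        "conf/modules/post_variant_calling.config",
        "modules/nf-core/bcftools/norm/**",
        "subworkflows/local/post_variantcalling/**",
        "tests/test_normalize_vcfs.yml",
    ],
}


def triggered_tags(changed_files, tags=TAGS):
    """Return the sorted tags that have at least one glob matching a changed file."""
    return sorted(
        tag
        for tag, patterns in tags.items()
        if any(
            fnmatchcase(path, pattern)
            for path in changed_files
            for pattern in patterns
        )
    )


# A change inside the bcftools/norm module should select the normalize_vcfs tests.
print(triggered_tags(["modules/nf-core/bcftools/norm/main.nf"]))
```

Under this reading, touching only unrelated files (for example docs) selects no tag, which is why the list has to name every module and subworkflow the normalization test depends on.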

maxulysse avatar Oct 17 '24 12:10 maxulysse

issues we still need to assess:

WHY do we output vcfs_tbi from the concatenate subworkflow when we only need the vcf for vcftools, and we don't seem to strip the index anywhere? I think we should probably either output just the vcf from there, or keep the index and map it out before the downstream processes. I have little idea why it isn't failing.

We need a variant caller id from concatenate as well.

I'm guessing we might need to output vcfs = VCFS_NORM_SORT.out.vcf from the normalization subworkflows and something similar from the concatenate one.
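
To make the shape change concrete, here is a small Python model of what I mean, not Nextflow code and not what the subworkflow currently does. It assumes the concatenate output is a list of (meta, vcf, tbi) items; `VcfItem`, `drop_index_and_tag`, the file names, and the `"concat"` caller label are all hypothetical.

```python
from typing import NamedTuple


class VcfItem(NamedTuple):
    """One channel item as assumed here: sample metadata, VCF path, index path."""
    meta: dict
    vcf: str
    tbi: str


def drop_index_and_tag(items, variantcaller):
    """Return (meta, vcf) pairs, dropping the index and recording the caller id in meta."""
    return [
        ({**item.meta, "variantcaller": variantcaller}, item.vcf)
        for item in items
    ]


# Hypothetical output of the concatenate step for a single sample.
concatenated = [
    VcfItem({"id": "sample1"}, "sample1.germline.vcf.gz", "sample1.germline.vcf.gz.tbi")
]

# Downstream steps would then receive only the VCF, with a caller id to tell sources apart.
vcfs = drop_index_and_tag(concatenated, "concat")
```

The point is only that every item leaving concatenate and normalization would have the same two-element shape, with the caller id carried in the meta map.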

maxulysse avatar Nov 25 '24 15:11 maxulysse

[!WARNING] Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2. Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

nf-core-bot avatar Dec 17 '24 06:12 nf-core-bot

@Patricie34 sorry, I forgot to merge in https://github.com/nf-core/sarek/pull/1760 (which is now done).

Can you sync your branch once more, and move your PR up in the CHANGELOG?

maxulysse avatar Dec 17 '24 17:12 maxulysse

We'll be merging into dev_normalization so we can easily test the logic and figure everything out before finally merging into dev.

maxulysse avatar Jan 13 '25 13:01 maxulysse