sarek icon indicating copy to clipboard operation
sarek copied to clipboard

Mutect2 tumor vs normal does not perform FILTERMUTECTCALLS step

Open msguixe opened this issue 1 year ago • 5 comments

Description of the bug

Hi,

I just realized that when running Mutect2, in tumor vs normal mode, it does not perform the FILTERMUTECTCALLS step. When I run it in single tumor mode, it does perform the filtering step. How can I add the filtering step with the sarek command? Thanks,

Command used and terminal output

nextflow run nf-core/sarek -r 3.0.1 -profile singularity -c /workspace/datasets/sjd_seq/code/sarek/sarek_3.0.1.conf --input input_clones_vs_blood.csv --step variant_calling --tools 'strelka,ascat,manta,mutect2' --fasta /workspace/datasets/genomes/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fa --fasta_fai /workspace/datasets/genomes/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fa.fai --dict /workspace/datasets/genomes/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.dict --germline_resource /workspace/datasets/genomes/iGenomes/Homo_sapiens/GATK/GRCh38/Annotation/GermlineResource/gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only.vcf.gz --germline_resource_tbi /workspace/datasets/genomes/iGenomes/Homo_sapiens/GATK/GRCh38/Annotation/GermlineResource/gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only.vcf.gz.tbi

Relevant files

patient,sex,status,sample,cram,crai pt1,XX,1,AX4958,/workspace/datasets/sjd_seq/sarek_results/pt1/results/preprocessing/recalibrated/AX4958/AX4958.recal.cram,/workspace/datasets/sjd_seq/sarek_results/pt1/results/preprocessing/recalibrated/AX4958/AX4958.recal.cram.crai pt1,XX,1,AX4961,/workspace/datasets/sjd_seq/sarek_results/pt1/results/preprocessing/recalibrated/AX4961/AX4961.recal.cram,/workspace/datasets/sjd_seq/sarek_results/pt1/results/preprocessing/recalibrated/AX4961/AX4961.recal.cram.crai pt1,XX,0,AQ5174,/workspace/datasets/sjd_seq/platinum_results/20220809/pt1-t1-allsamples-t1/sarek_results/results/preprocessing/recalibrated/AQ5174/AQ5174.recal.cram,/workspace/datasets/sjd_seq/platinum_results/20220809/pt1-t1-allsamples-t1/sarek_results/results/preprocessing/recalibrated/AQ5174/AQ5174.recal.cram.crai

System information

No response

msguixe avatar Oct 11 '22 07:10 msguixe

Thanks for reporting. Taking a look

FriederikeHanssen avatar Oct 11 '22 10:10 FriederikeHanssen

Can you send me the log and ideally the terminal output? Was GETPILEUPSUMMARIES run?

FriederikeHanssen avatar Oct 11 '22 10:10 FriederikeHanssen

execution_trace_2022-10-10_11-39-31.txt.gz This is the execution trace file. It looks like the GETPILEUPSUMMARIES has been executed.

msguixe avatar Oct 11 '22 12:10 msguixe

I would need the .nextflow.log file please

FriederikeHanssen avatar Oct 11 '22 13:10 FriederikeHanssen

nextflow.log.gz

msguixe avatar Oct 11 '22 13:10 msguixe

Hi,

Did you have the chance to look into that?

Thanks,

msguixe avatar Oct 24 '22 12:10 msguixe

Hi @msguixe ! Apologies for the delay, I was on vacation, but back now. I put it in my todo list for next week. I am afraid I am busy catching up with everything this week.

FriederikeHanssen avatar Nov 03 '22 15:11 FriederikeHanssen

Hey! So trying to track this down. On the previous release 3.0.2 it was run on the full size tests: https://tower.nf/orgs/nf-core/workspaces/AWSmegatests/watch/1VrKQdBhYDJB1G

https://nf-co.re/sarek/results#sarek/results-bcd7bf9cb98cddec27bb54fb47ee122c09388c02/somatic_test/variant_calling/mutect2/HCC1395T_vs_HCC1395N/

So I think a reference file might be missing from your command possibly

FriederikeHanssen avatar Nov 07 '22 22:11 FriederikeHanssen

😬 can't reproduce this. Can you try the same command with 3.0.2?

FriederikeHanssen avatar Nov 07 '22 22:11 FriederikeHanssen

Hi Friederike,

It is currently running. I let you know about the output tomorrow!

Thanks, Monica

msguixe avatar Nov 08 '22 15:11 msguixe

The pipeline finished correctly, but still it doesn't give the filtered vcf :/ I'm not sure which ref file is missing...

msguixe avatar Nov 09 '22 09:11 msguixe

Hi! Ok thanks this good to know. then i know where I can start digging further

FriederikeHanssen avatar Nov 09 '22 09:11 FriederikeHanssen

OK so making progress on the debugging side: if I am not mistaken and overlooked it in your log files: LEARNREADORIENTATION is not run, is that correct?

Could you also navigate to the work dir of one of the mutect2 processes and send me the .command.sh?

FriederikeHanssen avatar Nov 09 '22 21:11 FriederikeHanssen

also would you mind testing dev as well with your setup?

FriederikeHanssen avatar Nov 09 '22 21:11 FriederikeHanssen

It is actually the FILTERMUTECTCALLS that it is not performed. I tried the dev branch but it gives this error:

No TSV file
No step other than "mapping" supports a directory as an input

msguixe avatar Nov 10 '22 15:11 msguixe

Well I see that LEARNREADORIENTATION is also not performed... This is a .command.sh from a MUTECT2 step:

#!/bin/bash -euo pipefail
gatk --java-options "-Xmx36g" Mutect2 \
    --input AQ5174.recal.cram --input AX4961.recal.cram \
    --output AX4961_vs_AQ5174.mutect2.chr22_10510001-10784643.vcf.gz \
    --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fa \
    --panel-of-normals 1000g_pon.hg38.vcf.gz \
    --germline-resource gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only.vcf.gz \
    --intervals chr22_10510001-10784643.bed \
    --tmp-dir . \
    --normal-sample null

cat <<-END_VERSIONS > versions.yml
"NFCORE_SAREK:SAREK:PAIR_VARIANT_CALLING:GATK_TUMOR_NORMAL_SOMATIC_VARIANT_CALLING:MUTECT2":
    gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS

msguixe avatar Nov 10 '22 15:11 msguixe

Yes the that process is upstream of the filtering, and if that one isn't run then the filtering can't run because it is missing some inputfiles.

FriederikeHanssen avatar Nov 10 '22 19:11 FriederikeHanssen

ah ha, so somehow the part of the config that tells the tools to generate f1r2 files for the LEARNREADORIENTATION is not passed through and also the normal-sample is set to "null" 😱 so your pairing doesn't actually work

FriederikeHanssen avatar Nov 10 '22 19:11 FriederikeHanssen

It is actually the FILTERMUTECTCALLS that it is not performed. I tried the dev branch but it gives this error:

No TSV file
No step other than "mapping" supports a directory as an input

how did you run dev?

FriederikeHanssen avatar Nov 10 '22 19:11 FriederikeHanssen

When you add this in a cusotm config:

process {

   withName: 'NFCORE_SAREK:SAREK:PAIR_VARIANT_CALLING:GATK_TUMOR_NORMAL_SOMATIC_VARIANT_CALLING:MUTECT2' {
   ext.args         = { params.ignore_soft_clipped_bases ?
                                "--dont-use-soft-clipped-bases true --f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz --normal-sample ${meta.patient}_${meta.normal_id}" :
                                "--f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz --normal-sample ${meta.patient}_${meta.normal_id}" }
   
   }
   
}

and then add it to the command with -c custom.config , does it then work?

FriederikeHanssen avatar Nov 10 '22 19:11 FriederikeHanssen

It is actually the FILTERMUTECTCALLS that it is not performed. I tried the dev branch but it gives this error:

No TSV file
No step other than "mapping" supports a directory as an input

how did you run dev?

I am running it changing -r 3.0.2 to -r dev

msguixe avatar Nov 11 '22 11:11 msguixe

When you add this in a cusotm config:

process {

   withName: 'NFCORE_SAREK:SAREK:PAIR_VARIANT_CALLING:GATK_TUMOR_NORMAL_SOMATIC_VARIANT_CALLING:MUTECT2' {
   ext.args         = { params.ignore_soft_clipped_bases ?
                                "--dont-use-soft-clipped-bases true --f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz --normal-sample ${meta.patient}_${meta.normal_id}" :
                                "--f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz --normal-sample ${meta.patient}_${meta.normal_id}" }
   
   }
   
}

and then add it to the command with -c custom.config , does it then work?

So it is now running, with -r 3.0.2. I let you when it finishes.

msguixe avatar Nov 11 '22 11:11 msguixe

It has finished with an error:

***********************************************************************

A USER ERROR has occurred: Bad input: Sample pt1_AQ5174 is not in BAM header: [AQ5174, pt1_AX4958]

***********************************************************************

Could this be related to having added this to the conf file? That was a way of solving a previous problem that I had:

process{
   withName: 'NFCORE_SAREK:SAREK:PAIR_VARIANT_CALLING:GATK_TUMOR_NORMAL_SOMATIC_VARIANT_CALLING:MUTECT2' {
      ext.args = { "--normal-sample ${meta.sample}" }
   }
}

msguixe avatar Nov 11 '22 13:11 msguixe

It is actually the FILTERMUTECTCALLS that it is not performed. I tried the dev branch but it gives this error:

No TSV file
No step other than "mapping" supports a directory as an input

how did you run dev?

I am running it changing -r 3.0.2 to -r dev

this looks like it picked up a sarek2 version....

FriederikeHanssen avatar Nov 11 '22 13:11 FriederikeHanssen

Can you do a nextflow run -r dev -latest ...

maxulysse avatar Nov 11 '22 13:11 maxulysse

or a nextflow pull nf-core/sarek -r dev before the nextflow run ... command

maxulysse avatar Nov 11 '22 13:11 maxulysse

What infrastructure are you running on?

FriederikeHanssen avatar Nov 11 '22 13:11 FriederikeHanssen

Can you do a nextflow run -r dev -latest ...

It looks like this worked. Trying now without the config options about sample naming...

msguixe avatar Nov 11 '22 13:11 msguixe

What infrastructure are you running on?

HPC cluster with CentOS Linux, and slurm

msguixe avatar Nov 11 '22 13:11 msguixe

Hi,

Sorry it took me a while to answer, I was busy these days. Using the -r dev -latest I have the same error: A USER ERROR has occurred: Bad input: Sample pt1_AQ5174 is not in BAM header: [AQ5174, pt1_AX4958] The pipeline finished the 11/11/22, so if you have made improvements these days these may not have been applied in this run.

I saw you released the 3.1 version. Maybe should I try with this one?

msguixe avatar Nov 18 '22 10:11 msguixe