Same sample name across multiple patients does not fail input schema validation

Open maxulysse opened this issue 9 months ago • 1 comments

Description of the bug

This is the output seen on the terminal once the pipeline has failed after GATK4_MARKDUPLICATES. My guess is that one of the later join operators is causing the subsequent failure:

Detected join operation duplicate emission on left channel -- offending element: key=[patient:test2, sample:test, sex:XX, status:0, n_fastq:1, data_type:bam, id:test]; value=/home/max/workspace/sarek/work/fe/2e8890cae572ee686c7475edd6e895/test.md.cram

We should really fail early for that.
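
To illustrate what failing early could look like, here is a minimal standalone sketch in Python that scans a samplesheet for sample names reused across different patients. It is not sarek's actual validation, which lives in the pipeline's input schema; the function name `find_shared_sample_names`, the patient `test1`, and the FASTQ file names are hypothetical and only serve the example.

```python
import csv
import io
from collections import defaultdict


def find_shared_sample_names(samplesheet_text):
    """Return {sample: sorted patients} for sample names used by more than one patient."""
    patients_by_sample = defaultdict(set)
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        patients_by_sample[row["sample"]].add(row["patient"])
    return {
        sample: sorted(patients)
        for sample, patients in patients_by_sample.items()
        if len(patients) > 1
    }


# Hypothetical samplesheet: two patients reuse the sample name "test".
example = """patient,sample,lane,fastq_1,fastq_2
test1,test,1,a_R1.fastq.gz,a_R2.fastq.gz
test2,test,1,b_R1.fastq.gz,b_R2.fastq.gz
"""

duplicates = find_shared_sample_names(example)
if duplicates:
    # A real pre-flight check would abort here instead of printing.
    print(f"Sample names shared across patients: {duplicates}")
```

A check of this kind, run before any process starts, would report the clash immediately rather than after alignment and duplicate marking.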

Issue reported by Ist4lri

Command used and terminal output

No response

Relevant files

No response

System information

No response

maxulysse avatar May 06 '24 13:05 maxulysse

  1. Command used and terminal output:
nextflow run nf-core/sarek -r dev -profile singularity -c custom.config -params-file nf-params.json
Error : Detected join operation duplicate emission on left channel -- offending element: key=[patient:test2, sample:test, sex:XX, status:0, n_fastq:1, data_type:bam, id:test]; value=/home/max/workspace/sarek/work/fe/2e8890cae572ee686c7475edd6e895/test.md.cram
  2. Relevant files:

With this samplesheet:

patient,sample,lane,fastq_1,fastq_2,status
BR664F,liver,1,/path/to/the/file/BR664F_R1.fastq.gz,/path/to/the/file/BR664F_R2.fastq.gz,1
BR665F,liver,1,/path/to/the/file/BR665F_R1.fastq.gz,/path/to/the/file/BR665F_R2.fastq.gz,1
BR666F,liver,1,/path/to/the/file/BR666F_R1.fastq.gz,/path/to/the/file/BR666F_R2.fastq.gz,1
BR667F,liver,1,/path/to/the/file/BR667F_R1.fastq.gz,/path/to/the/file/BR667F_R2.fastq.gz,1
BR668F,liver,1,/path/to/the/file/BR668F_R1.fastq.gz,/path/to/the/file/BR668F_R2.fastq.gz,1
BR669F,liver,1,/path/to/the/file/BR669F_R1.fastq.gz,/path/to/the/file/BR669F_R2.fastq.gz,1
BR670F,liver,1,/path/to/the/file/BR670F_R1.fastq.gz,/path/to/the/file/BR670F_R2.fastq.gz,1
BR671F,liver,1,/path/to/the/file/BR671F_R1.fastq.gz,/path/to/the/file/BR671F_R2.fastq.gz,1

And this nf-params.json:

{
    "input": "sample.csv",
    "outdir": "results",
    "wes": "true",
    "fasta": "/path/to/this/file/GRCh38_latest_genomic.fna",
    "aligner": "bwa-mem2",
    "skip_tools": "baserecalibrator,markduplicates"
}
  3. System information:

HPC Curta at MCIA (Mésocentre calcul intensif aquitain). I downloaded sarek to local files on the cluster because there is no profile for this cluster (it is not the same as IFB).

Ist4lri avatar May 07 '24 06:05 Ist4lri