Check if flowcell id matches for paired samples
I noticed this comment about checking the flowcell ID for paired samples while constructing GATK read groups. I was adapting the read group code for a custom pipeline and attempted a quick fix, so I thought I'd contribute it back to sarek.
While constructing the read group from paired fastq samples, perform a check to ensure that the flowcell ID is the same for the first read in fastq_1 and the first read in fastq_2. Otherwise, exit with an error that reports the problematic sample and file paths.
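To illustrate the idea outside of Nextflow, here is a minimal Python sketch of such a check. It is not the code in this PR: the function names `flowcell_from_fastq` and `check_flowcell_match` are hypothetical, and it assumes Illumina-style headers of the form `@<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y>`, where the flowcell ID is the third colon-separated field.

```python
import gzip


def flowcell_from_fastq(path):
    """Return the flowcell ID found in the first read header of a FASTQ file.

    Assumption: Illumina-style header with at least 7 colon-separated fields,
    the third of which is the flowcell ID.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        header = handle.readline().strip()
    if not header.startswith("@"):
        raise ValueError(f"Not a FASTQ header in {path}: {header!r}")
    # Drop the leading '@' and anything after the first space, then split on ':'
    fields = header[1:].split(" ")[0].split(":")
    if len(fields) < 7:
        raise ValueError(f"Unrecognised FASTQ header in {path}: {header!r}")
    return fields[2]


def check_flowcell_match(sample, fastq_1, fastq_2):
    """Raise an error naming the sample and both files if the flowcell IDs differ."""
    flowcell_1 = flowcell_from_fastq(fastq_1)
    flowcell_2 = flowcell_from_fastq(fastq_2)
    if flowcell_1 != flowcell_2:
        raise ValueError(
            f"Flowcell ID mismatch for sample {sample}: "
            f"{fastq_1} has {flowcell_1}, {fastq_2} has {flowcell_2}"
        )
    return flowcell_1
```

Only the first read of each file is inspected, so this catches mismatched file pairs cheaply but would not detect files that mix reads from several flowcells.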
Incidentally, while researching read groups I came across the following recommendations: https://support.sentieon.com/appnotes/read_groups/. Would it be worth updating some of the fields to match these guidelines?
PR checklist
- [x] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
  - => Only tested this manually, but happy to add a proper test if you could give me a starting point. Is there already an existing test for samplesheet validation that I can add this to? I guess I will need to add "corrupt" fastq files to the nf-core test repo?
- [ ] If you've added a new tool - have you followed the pipeline conventions in the contribution docs
- [ ] If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
- [x] Make sure your code lints (`nf-core lint`).
- [x] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [x] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
- [ ] Output Documentation in `docs/output.md` is updated.
- [ ] `CHANGELOG.md` is updated.
  - => will do this after submitting the PR so that I can link to it.
- [ ] `README.md` is updated (including new tool citations and authors/contributors).
  - => should I do this even for such a minor contribution?
nf-core pipelines lint overall result: Passed :white_check_mark: :warning:
Posted for pipeline commit 82615ad
+| ✅ 215 tests passed |+
#| ❔ 11 tests were ignored |#
!| ❗ 4 tests had warnings |!
:heavy_exclamation_mark: Test warnings:
- pipeline_todos - TODO string in `main.nf`: Optionally add in-text citation tools to this list.
- pipeline_todos - TODO string in `main.nf`: Optionally add bibliographic entries to this list.
- pipeline_todos - TODO string in `main.nf`: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
- pipeline_todos - TODO string in `base.config`: Check the defaults for all processes
:grey_question: Tests ignored:
- files_exist - File is ignored: `.github/workflows/awsfulltest.yml`
- files_exist - File is ignored: `.github/workflows/awstest.yml`
- files_exist - File is ignored: `conf/modules.config`
- files_unchanged - File ignored due to lint config: `assets/nf-core-sarek_logo_light.png`
- files_unchanged - File ignored due to lint config: `docs/images/nf-core-sarek_logo_light.png`
- files_unchanged - File ignored due to lint config: `docs/images/nf-core-sarek_logo_dark.png`
- files_unchanged - File ignored due to lint config: `.gitignore` or `.prettierignore`
- actions_ci - actions_ci
- actions_awstest - 'awstest.yml' workflow not found: `/home/runner/work/sarek/sarek/.github/workflows/awstest.yml`
- template_strings - template_strings
- modules_config - modules_config
:white_check_mark: Tests passed:
- files_exist - File found: `.gitattributes`
- files_exist - File found: `.gitignore`
- files_exist - File found: `.nf-core.yml`
- files_exist - File found: `.editorconfig`
- files_exist - File found: `.prettierignore`
- files_exist - File found: `.prettierrc.yml`
- files_exist - File found: `CHANGELOG.md`
- files_exist - File found: `CITATIONS.md`
- files_exist - File found: `CODE_OF_CONDUCT.md`
- files_exist - File found: `LICENSE` or `LICENSE.md` or `LICENCE` or `LICENCE.md`
- files_exist - File found: `nextflow_schema.json`
- files_exist - File found: `nextflow.config`
- files_exist - File found: `README.md`
- files_exist - File found: `.github/.dockstore.yml`
- files_exist - File found: `.github/CONTRIBUTING.md`
- files_exist - File found: `.github/ISSUE_TEMPLATE/bug_report.yml`
- files_exist - File found: `.github/ISSUE_TEMPLATE/config.yml`
- files_exist - File found: `.github/ISSUE_TEMPLATE/feature_request.yml`
- files_exist - File found: `.github/PULL_REQUEST_TEMPLATE.md`
- files_exist - File found: `.github/workflows/branch.yml`
- files_exist - File found: `.github/workflows/ci.yml`
- files_exist - File found: `.github/workflows/linting_comment.yml`
- files_exist - File found: `.github/workflows/linting.yml`
- files_exist - File found: `assets/email_template.html`
- files_exist - File found: `assets/email_template.txt`
- files_exist - File found: `assets/sendmail_template.txt`
- files_exist - File found: `assets/nf-core-sarek_logo_light.png`
- files_exist - File found: `conf/test.config`
- files_exist - File found: `conf/test_full.config`
- files_exist - File found: `docs/images/nf-core-sarek_logo_light.png`
- files_exist - File found: `docs/images/nf-core-sarek_logo_dark.png`
- files_exist - File found: `docs/output.md`
- files_exist - File found: `docs/README.md`
- files_exist - File found: `docs/README.md`
- files_exist - File found: `docs/usage.md`
- files_exist - File found: `main.nf`
- files_exist - File found: `assets/multiqc_config.yml`
- files_exist - File found: `conf/base.config`
- files_exist - File found: `conf/igenomes.config`
- files_exist - File found: `conf/igenomes_ignored.config`
- files_exist - File found: `modules.json`
- files_exist - File not found check: `.github/ISSUE_TEMPLATE/bug_report.md`
- files_exist - File not found check: `.github/ISSUE_TEMPLATE/feature_request.md`
- files_exist - File not found check: `.github/workflows/push_dockerhub.yml`
- files_exist - File not found check: `.markdownlint.yml`
- files_exist - File not found check: `.nf-core.yaml`
- files_exist - File not found check: `.yamllint.yml`
- files_exist - File not found check: `bin/markdown_to_html.r`
- files_exist - File not found check: `conf/aws.config`
- files_exist - File not found check: `docs/images/nf-core-sarek_logo.png`
- files_exist - File not found check: `lib/Checks.groovy`
- files_exist - File not found check: `lib/Completion.groovy`
- files_exist - File not found check: `lib/NfcoreTemplate.groovy`
- files_exist - File not found check: `lib/Utils.groovy`
- files_exist - File not found check: `lib/Workflow.groovy`
- files_exist - File not found check: `lib/WorkflowMain.groovy`
- files_exist - File not found check: `lib/WorkflowSarek.groovy`
- files_exist - File not found check: `parameters.settings.json`
- files_exist - File not found check: `pipeline_template.yml`
- files_exist - File not found check: `Singularity`
- files_exist - File not found check: `lib/nfcore_external_java_deps.jar`
- files_exist - File not found check: `.travis.yml`
- nextflow_config - Found nf-schema plugin
- nextflow_config - Config variable found: `manifest.name`
- nextflow_config - Config variable found: `manifest.nextflowVersion`
- nextflow_config - Config variable found: `manifest.description`
- nextflow_config - Config variable found: `manifest.version`
- nextflow_config - Config variable found: `manifest.homePage`
- nextflow_config - Config variable found: `timeline.enabled`
- nextflow_config - Config variable found: `trace.enabled`
- nextflow_config - Config variable found: `report.enabled`
- nextflow_config - Config variable found: `dag.enabled`
- nextflow_config - Config variable found: `process.cpus`
- nextflow_config - Config variable found: `process.memory`
- nextflow_config - Config variable found: `process.time`
- nextflow_config - Config variable found: `params.outdir`
- nextflow_config - Config variable found: `params.input`
- nextflow_config - Config variable found: `validation.help.enabled`
- nextflow_config - Config variable found: `manifest.mainScript`
- nextflow_config - Config variable found: `timeline.file`
- nextflow_config - Config variable found: `trace.file`
- nextflow_config - Config variable found: `report.file`
- nextflow_config - Config variable found: `dag.file`
- nextflow_config - Config variable found: `validation.help.beforeText`
- nextflow_config - Config variable found: `validation.help.afterText`
- nextflow_config - Config variable found: `validation.help.command`
- nextflow_config - Config variable found: `validation.summary.beforeText`
- nextflow_config - Config variable found: `validation.summary.afterText`
- nextflow_config - Config variable (correctly) not found: `params.nf_required_version`
- nextflow_config - Config variable (correctly) not found: `params.container`
- nextflow_config - Config variable (correctly) not found: `params.singleEnd`
- nextflow_config - Config variable (correctly) not found: `params.igenomesIgnore`
- nextflow_config - Config variable (correctly) not found: `params.name`
- nextflow_config - Config variable (correctly) not found: `params.enable_conda`
- nextflow_config - Config variable (correctly) not found: `params.max_cpus`
- nextflow_config - Config variable (correctly) not found: `params.max_memory`
- nextflow_config - Config variable (correctly) not found: `params.max_time`
- nextflow_config - Config variable (correctly) not found: `params.validationFailUnrecognisedParams`
- nextflow_config - Config variable (correctly) not found: `params.validationLenientMode`
- nextflow_config - Config variable (correctly) not found: `params.validationSchemaIgnoreParams`
- nextflow_config - Config variable (correctly) not found: `params.validationShowHiddenParams`
- nextflow_config - Config `timeline.enabled` had correct value: `true`
- nextflow_config - Config `report.enabled` had correct value: `true`
- nextflow_config - Config `trace.enabled` had correct value: `true`
- nextflow_config - Config `dag.enabled` had correct value: `true`
- nextflow_config - Config `manifest.name` began with `nf-core/`
- nextflow_config - Config variable `manifest.homePage` began with https://github.com/nf-core/
- nextflow_config - Config `dag.file` ended with `.html`
- nextflow_config - Config variable `manifest.nextflowVersion` started with >= or !>=
- nextflow_config - Config `manifest.version` ends in `dev`: `3.5.0dev`
- nextflow_config - Config `params.custom_config_version` is set to `master`
- nextflow_config - Config `params.custom_config_base` is set to `https://raw.githubusercontent.com/nf-core/configs/master`
- nextflow_config - Lines for loading custom profiles found
- nextflow_config - nextflow.config contains configuration profile `test`
- nextflow_config - Config default value correct: params.step= mapping
- nextflow_config - Config default value correct: params.split_fastq= 50000000
- nextflow_config - Config default value correct: params.nucleotides_per_second= 200000
- nextflow_config - Config default value correct: params.clip_r1= 0
- nextflow_config - Config default value correct: params.clip_r2= 0
- nextflow_config - Config default value correct: params.three_prime_clip_r1= 0
- nextflow_config - Config default value correct: params.three_prime_clip_r2= 0
- nextflow_config - Config default value correct: params.trim_nextseq= 0
- nextflow_config - Config default value correct: params.length_required= 15
- nextflow_config - Config default value correct: params.group_by_umi_strategy= Adjacency
- nextflow_config - Config default value correct: params.aligner= bwa-mem
- nextflow_config - Config default value correct: params.ascat_min_base_qual= 20
- nextflow_config - Config default value correct: params.ascat_min_counts= 10
- nextflow_config - Config default value correct: params.ascat_min_map_qual= 35
- nextflow_config - Config default value correct: params.cf_coeff= 0.05
- nextflow_config - Config default value correct: params.cf_contamination= 0
- nextflow_config - Config default value correct: params.cf_minqual= 0
- nextflow_config - Config default value correct: params.cf_mincov= 0
- nextflow_config - Config default value correct: params.cf_ploidy= 2
- nextflow_config - Config default value correct: params.sentieon_haplotyper_emit_mode= variant
- nextflow_config - Config default value correct: params.sentieon_dnascope_emit_mode= variant
- nextflow_config - Config default value correct: params.sentieon_dnascope_pcr_indel_model= CONSERVATIVE
- nextflow_config - Config default value correct: params.dbnsfp_fields= rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF
- nextflow_config - Config default value correct: params.vep_custom_args= --everything --filter_common --per_gene --total_length --offline --format vcf
- nextflow_config - Config default value correct: params.vep_version= 111.0-0
- nextflow_config - Config default value correct: params.vep_out_format= vcf
- nextflow_config - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/
- nextflow_config - Config default value correct: params.genome= GATK.GRCh38
- nextflow_config - Config default value correct: params.snpeff_cache= s3://annotation-cache/snpeff_cache/
- nextflow_config - Config default value correct: params.vep_cache= s3://annotation-cache/vep_cache/
- nextflow_config - Config default value correct: params.custom_config_version= master
- nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
- nextflow_config - Config default value correct: params.test_data_base= https://raw.githubusercontent.com/nf-core/test-datasets/sarek3
- nextflow_config - Config default value correct: params.seq_platform= ILLUMINA
- nextflow_config - Config default value correct: params.publish_dir_mode= copy
- nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
- nextflow_config - Config default value correct: params.validate_params= true
- nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
- files_unchanged - `.gitattributes` matches the template
- files_unchanged - `.prettierrc.yml` matches the template
- files_unchanged - `CODE_OF_CONDUCT.md` matches the template
- files_unchanged - `LICENSE` matches the template
- files_unchanged - `.github/.dockstore.yml` matches the template
- files_unchanged - `.github/CONTRIBUTING.md` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/bug_report.yml` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/config.yml` matches the template
- files_unchanged - `.github/ISSUE_TEMPLATE/feature_request.yml` matches the template
- files_unchanged - `.github/PULL_REQUEST_TEMPLATE.md` matches the template
- files_unchanged - `.github/workflows/branch.yml` matches the template
- files_unchanged - `.github/workflows/linting_comment.yml` matches the template
- files_unchanged - `.github/workflows/linting.yml` matches the template
- files_unchanged - `assets/email_template.html` matches the template
- files_unchanged - `assets/email_template.txt` matches the template
- files_unchanged - `assets/sendmail_template.txt` matches the template
- files_unchanged - `docs/README.md` matches the template
- readme - README Nextflow minimum version badge matched config. Badge: `24.04.2`, Config: `24.04.2`
- readme - README Zenodo placeholder was replaced with DOI.
- plugin_includes - No wrong validation plugin imports have been found
- pipeline_name_conventions - Name adheres to nf-core convention
- schema_lint - Schema lint passed
- schema_lint - Schema title + description lint passed
- schema_lint - Input mimetype lint passed: 'text/csv'
- schema_params - Schema matched params returned from nextflow config
- system_exit - No `System.exit` calls found
- actions_schema_validation - Workflow validation passed: linting.yml
- actions_schema_validation - Workflow validation passed: branch.yml
- actions_schema_validation - Workflow validation passed: fix-linting.yml
- actions_schema_validation - Workflow validation passed: release-announcements.yml
- actions_schema_validation - Workflow validation passed: template_version_comment.yml
- actions_schema_validation - Workflow validation passed: download_pipeline.yml
- actions_schema_validation - Workflow validation passed: ncbench.yml
- actions_schema_validation - Workflow validation passed: ci.yml
- actions_schema_validation - Workflow validation passed: cloudtest.yml
- actions_schema_validation - Workflow validation passed: pytest.yml
- actions_schema_validation - Workflow validation passed: clean-up.yml
- actions_schema_validation - Workflow validation passed: linting_comment.yml
- merge_markers - No merge markers found in pipeline files
- modules_json - Only installed modules found in `modules.json`
- multiqc_config - `assets/multiqc_config.yml` found and not ignored.
- multiqc_config - `assets/multiqc_config.yml` contains `report_section_order`
- multiqc_config - `assets/multiqc_config.yml` contains `export_plots`
- multiqc_config - `assets/multiqc_config.yml` contains `report_comment`
- multiqc_config - `assets/multiqc_config.yml` follows the ordering scheme of the minimally required plugins.
- multiqc_config - `assets/multiqc_config.yml` contains a matching 'report_comment'.
- multiqc_config - `assets/multiqc_config.yml` contains 'export_plots: true'.
- modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
- base_config - `conf/base.config` found and not ignored.
- base_config - `UNZIP` found in `conf/base.config` and Nextflow scripts.
- base_config - `FASTQC` found in `conf/base.config` and Nextflow scripts.
- base_config - `FASTP` found in `conf/base.config` and Nextflow scripts.
- base_config - `BWAMEM1_MEM` found in `conf/base.config` and Nextflow scripts.
- base_config - `CNVKIT_BATCH` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_MARKDUPLICATES` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_APPLYBQSR` found in `conf/base.config` and Nextflow scripts.
- base_config - `MOSDEPTH` found in `conf/base.config` and Nextflow scripts.
- base_config - `STRELKA` found in `conf/base.config` and Nextflow scripts.
- base_config - `SAMTOOLS_CONVERT` found in `conf/base.config` and Nextflow scripts.
- base_config - `GATK4_MERGEVCFS` found in `conf/base.config` and Nextflow scripts.
- base_config - `MULTIQC` found in `conf/base.config` and Nextflow scripts.
- nfcore_yml - Repository type in `.nf-core.yml` is valid: `pipeline`
- nfcore_yml - nf-core version in `.nf-core.yml` is set to the latest version: `3.0.2`
Run details
- nf-core/tools version 3.0.2
- Run at `2024-10-30 09:16:39`
@nf-core-bot fix linting :pray: pretty please :pray:
@pmoris I updated your PR with the latest update to this function. There is no need to check for paired samples, as sarek only handles paired samples.
Can you update the CHANGELOG?
Changelog is updated!
I also fixed the conditional: removing the meta.single_end check had accidentally moved the negation onto the flowcell variable, so the check did not trigger at the right time.
Lastly, what are your thoughts on updating the PU field to flowcell.lane rather than just lane (as recommended here: https://support.sentieon.com/appnotes/read_groups/)?
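For concreteness, the PU suggestion could look like the following Python sketch. This is only an illustration, not sarek's read group code: the function name `build_read_group`, the choice of ID, and the SM/LB values are assumptions made for the example, and only the `PU:<flowcell>.<lane>` layout reflects the proposal above.

```python
def build_read_group(sample, flowcell, lane, platform="ILLUMINA"):
    """Build an @RG header line with PU set to <flowcell>.<lane>.

    Hypothetical layout for illustration only. The fields are joined with a
    literal backslash-t, as is usual when the string is passed to an aligner
    on the command line.
    """
    fields = [
        "@RG",
        f"ID:{flowcell}.{sample}.{lane}",
        f"PU:{flowcell}.{lane}",  # proposed: flowcell.lane instead of lane alone
        f"SM:{sample}",
        f"LB:{sample}",
        f"PL:{platform}",
    ]
    return "\\t".join(fields)


if __name__ == "__main__":
    print(build_read_group("sample1", "FLOWCELLA", "1"))
```

With PU set this way, two lanes with the same number on different flowcells would no longer share a platform unit.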
Why is the linter complaining? There is no trailing whitespace or non-multiple-of-4 padding as far as I can tell...