sarek icon indicating copy to clipboard operation
sarek copied to clipboard

Update spring/decompress-module

Open asp8200 opened this issue 1 year ago • 10 comments

Updating the spring/decompress-module. This PR should change nothing with regards to published files, but the SPRING_DECOMPRESS was being called rather confusingly on this input (one sample with two spring-files - one with R1 and the other with R2).

In this test, dev-Sarek calls spring on the spring-file containing just the R1-reads (test_1.fastq.gz.spring):

spring \
        -d \
        -g \
        -t 2 \
         \
        -i test_1.fastq.gz.spring \
        -o test_1_R1.fastq.gz test_1_R2.fastq.gz

All reads go to test_1_R1.fastq.gz while test_1_R2.fastq.gz doesn't even get written.

That is not terrible, but then the same pipeline-run also contains the following call of SPRING_DECOMPRESS on test_2.fastq.gz.spring which contains all the R2-reads:

spring \
        -d \
        -g \
        -t 2 \
         \
        -i test_2.fastq.gz.spring \
        -o test_2_R1.fastq.gz test_2_R2.fastq.gz

Here, again, all reads go into test_2_R1.fastq.gz and still test_2_R2.fastq.gz isn’t written. That seems suboptimal to me.

With the updated module, we get

spring \
    -d \
    -g \
    -t 2 \
     \
    -i test_1.fastq.gz.spring \
    -o test_1.fastq.gz

and

    spring \
    -d \
    -g \
    -t 2 \
     \
    -i test_2.fastq.gz.spring \
    -o test_2.fastq.gz

which I find much more sensible.

TO-DO:

  • [x] Double-check that the pipeline handles "type mixed" samplesheets well, and - if so - update the docs accordingly. (Moved to issue https://github.com/nf-core/sarek/issues/1576.)
  • [x] Update changelog.

asp8200 avatar Jun 19 '24 16:06 asp8200

nf-core lint overall result: Passed :white_check_mark: :warning:

Posted for pipeline commit 455b0a2

+| ✅ 200 tests passed       |+
#| ❔  12 tests were ignored |#
!| ❗   3 tests had warnings |!

:heavy_exclamation_mark: Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

:grey_question: Tests ignored:

  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/sarek/sarek/.github/workflows/awstest.yml
  • template_strings - template_strings
  • modules_config - modules_config

:white_check_mark: Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-07-01 14:02:29

github-actions[bot] avatar Jun 19 '24 16:06 github-actions[bot]

Can we instead update the modules upstream and add something like?

def output = meta.single_end || meta.one_strand ? "-o ${prefix}.fastq.gz" : "-o ${prefix}_R1.fastq.gz ${prefix}_R2.fastq.gz"

maxulysse avatar Jun 19 '24 16:06 maxulysse

Can we instead update the modules upstream and add something like?

def output = meta.single_end || meta.one_strand ? "-o ${prefix}.fastq.gz" : "-o ${prefix}_R1.fastq.gz ${prefix}_R2.fastq.gz"

You mean use your definition of output in the version of spring/decompress over in nf-core/modules? (That is fine with me.)

asp8200 avatar Jun 19 '24 16:06 asp8200

Can we instead update the modules upstream and add something like?

def output = meta.single_end || meta.one_strand ? "-o ${prefix}.fastq.gz" : "-o ${prefix}_R1.fastq.gz ${prefix}_R2.fastq.gz"

You mean use your definition of output in the version of spring/decompress over in nf-core/modules? (That is fine with me.)

No, but add meta.one_strand in for each spring file when there is a pair.

maxulysse avatar Jun 19 '24 17:06 maxulysse

Can we instead update the modules upstream and add something like?

def output = meta.single_end || meta.one_strand ? "-o ${prefix}.fastq.gz" : "-o ${prefix}_R1.fastq.gz ${prefix}_R2.fastq.gz"

You mean use your definition of output in the version of spring/decompress over in nf-core/modules? (That is fine with me.)

No, but add meta.one_strand in for each spring file when there is a pair.

Ok, I'll try if I can get that working. It should be possible.

asp8200 avatar Jun 19 '24 17:06 asp8200

Because technically using meta.single_end would make sense, but we don't have single_end data, just the one strand with this file, so something like meta.one_strand would be better in our case. But if you like one_direction so much we can argue in the PR on the modules repo.

maxulysse avatar Jun 19 '24 17:06 maxulysse

Because technically using meta.single_end would make sense, but we don't have single_end data, just the one strand with this file, so something like meta.one_strand would be better in our case. But if you like one_direction so much we can argue in the PR on the modules repo.

@maxulysse and @FriederikeHanssen : So I replaced ext.one_direction with meta.one_strand. Currently, I just did a local module patch, but - if you prefer - I can also update the module over in the modules-repo.

asp8200 avatar Jun 20 '24 06:06 asp8200

Because technically using meta.single_end would make sense, but we don't have single_end data, just the one strand with this file, so something like meta.one_strand would be better in our case. But if you like one_direction so much we can argue in the PR on the modules repo.

@maxulysse and @FriederikeHanssen : So I replaced ext.one_direction with meta.one_strand. Currently, I just did a local module patch, but - if you prefer - I can also update the module over in the modules-repo.

I'd like an updated module in the modules repo.

I'd rather avoid patching modules as much as possible for future plans and reasons

maxulysse avatar Jun 20 '24 07:06 maxulysse

Because technically using meta.single_end would make sense, but we don't have single_end data, just the one strand with this file, so something like meta.one_strand would be better in our case. But if you like one_direction so much we can argue in the PR on the modules repo.

@maxulysse and @FriederikeHanssen : So I replaced ext.one_direction with meta.one_strand. Currently, I just did a local module patch, but - if you prefer - I can also update the module over in the modules-repo.

I'd like an updated module in the modules repo.

I'd rather avoid patching modules as much as possible for future plans and reasons

https://github.com/nf-core/modules/pull/5850

So like that?

asp8200 avatar Jun 20 '24 08:06 asp8200

Can we separate the spring modules stuff and whether sarek is able to handle a mixed bag of inputs? Can just have an issue were w collect the testing that has been done on that

FriederikeHanssen avatar Jun 25 '24 08:06 FriederikeHanssen