
[FEATURE] STAR handle gzipped files

Open edmundmiller opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe

It'd be cool if we didn't have to include

    withName: STAR_ALIGN {
        ext.args = '--readFilesCommand zcat'
    }

Describe the solution you'd like

    def gzip = reads.name.endsWith(".gz") ? "--readFilesCommand zcat" : ""
    """
    STAR \\
        --genomeDir $index \\
        --readFilesIn $reads  \\
        --runThreadN $task.cpus \\
        --outFileNamePrefix $prefix. \\
        $out_sam_type \\
        $ignore_gtf \\
        $seq_center \\
        $gzip \\
        $args

I can't remember if : is necessary or not in the Elvis operator.
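For reference, here is a small plain-Groovy sketch of the two operators in question. As far as Groovy syntax goes, the conditional above is a ternary expression, which needs both branches, while the Elvis operator `?:` is the shorthand for "use this value, or a fallback if it is null or empty". The closure names and file names below are made up for illustration and are not part of the module.

```groovy
// Ternary: condition ? valueIfTrue : valueIfFalse (the ':' branch is required).
// gzipFlag is a hypothetical helper, not something defined in the STAR module.
def gzipFlag = { String fileName ->
    fileName.endsWith('.gz') ? '--readFilesCommand zcat' : ''
}

// Elvis: value ?: fallback (the fallback is used when value is null or empty).
// argsOrEmpty is likewise hypothetical.
def argsOrEmpty = { String extArgs -> extArgs ?: '' }

assert gzipFlag('sample_1.fastq.gz') == '--readFilesCommand zcat'
assert gzipFlag('sample_1.fastq') == ''
assert argsOrEmpty(null) == ''
```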

Describe alternatives you've considered

  • Not doing this and leaving it up to users to figure it out.
  • We might want to handle overriding that with args2 maybe or some other method that checks the ext.args for a --readFilesCommand

Additional context

No response

edmundmiller avatar May 23 '22 18:05 edmundmiller

How come we don't have this already? This must be handled in the rnaseq pipeline somehow?

--readFilesCommand gzip -cdf should work with both compressed and uncompressed files, making the Groovy conditional superfluous.

EDIT: in nf-core/rnaseq it's also passed through ext.args: https://github.com/nf-core/rnaseq/blob/89bf536ce4faa98b4d50a8ec0a0343780bc62e0a/conf/modules.config#L503

grst avatar Jun 14 '22 11:06 grst

Exactly. That works, but it's a footgun when we could just use the power of Nextflow to cover for STAR not handling this itself (or maybe STAR is just giving users too many options rather than telling them what they want). Either way, we can fix it on our end.

edmundmiller avatar Jun 14 '22 19:06 edmundmiller

We might want to handle overriding that with args2 maybe or some other method that checks the ext.args for a --readFilesCommand

How about:

  • check if --readFilesCommand is in args
  • if not, add --readFilesCommand gzip -cdf

This would also be backwards-compatible with current implementations (e.g. rnaseq) and provides sensible defaults to work with compressed files out-of-the-box.
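A minimal plain-Groovy sketch of the two steps above, assuming the user-supplied arguments arrive as a single string (as `ext.args` does). The helper name `withDefaultReadFilesCommand` is hypothetical, and how it would be wired into the module's script block is left open.

```groovy
// Hypothetical helper: append a default --readFilesCommand only when the
// user-supplied args do not already set one.
String withDefaultReadFilesCommand(String args) {
    // Treat a missing ext.args as an empty string.
    String userArgs = (args ?: '').trim()
    if (userArgs.contains('--readFilesCommand')) {
        // The user made an explicit choice, so leave it untouched.
        return userArgs
    }
    return (userArgs + ' --readFilesCommand gzip -cdf').trim()
}

// An existing setting (e.g. the one in rnaseq's config) is kept as-is.
assert withDefaultReadFilesCommand('--readFilesCommand zcat') == '--readFilesCommand zcat'
// With no setting, the default is appended.
assert withDefaultReadFilesCommand(null) == '--readFilesCommand gzip -cdf'
```

A plain substring check keeps the sketch simple. It would also match the flag name appearing inside another argument's value, which seems unlikely for STAR options but is worth keeping in mind.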

grst avatar Jun 15 '22 11:06 grst

I am trying to fix this, but I am stuck on this line:

    def gzip = reads.getName().endsWith(".gz") ? "--readFilesCommand zcat" : ''

Nextflow stdout:

    ERROR ~ Module compilation error
    - file :
    ...
    - cause: Variable reads already defined in the process scope @ line 51, column 16.
    def gzip = reads.getName().endsWith(".gz") ? "--readFilesCommand zcat" :

Any suggestions?

ramirobarrantes avatar May 22 '24 21:05 ramirobarrantes

@ramirobarrantes could you link me to your updates to the module? I'm struggling without the rest of the module to know why that line is throwing an error 😅

edmundmiller avatar May 23 '24 15:05 edmundmiller

@ramirobarrantes could you link me to your updates to the module? I'm struggling without the rest of the module to know why that line is throwing an error 😅

Do you mean that you want me to push the broken changes? Or just the location of where the changes should be? I haven't updated anything as they don't work yet.

ramirobarrantes avatar May 23 '24 18:05 ramirobarrantes