
`get_data` target assumes input data is gzipped.

Open ctb opened this issue 6 years ago • 3 comments

which is fine, but it means that if you start with un-gzipped data, the output of get_data is incorrect: it links the ungzipped files to filenames that end with .gz, and then commands like trimmomatic barf because the data is named .gz but isn't actually gzipped :)

ctb avatar Jan 23 '19 19:01 ctb

I think a solution would be to gzip the data if it's not already gzipped, but that makes an unnecessary copy.

or, perhaps, just barf on ungzipped data in get_data!

ctb avatar Jan 23 '19 19:01 ctb
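A minimal sketch of the "barf on ungzipped data" option, assuming the check is done on file content rather than file name. It relies on the fact that gzip streams start with the two magic bytes `0x1f 0x8b`. The function names `is_gzipped` and `require_gzipped` are hypothetical and are not part of elvers; this is not how `get_data` is actually implemented.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def require_gzipped(path):
    """Hypothetical guard: fail early instead of linking to a misleading .gz name."""
    if not is_gzipped(path):
        raise ValueError(
            f"{path} does not look gzip-compressed; "
            "refusing to link it to a filename ending in .gz"
        )
    return path
```

Checking the magic bytes only reads two bytes per input, so it avoids the extra copy while still catching a mislabeled file before a downstream tool sees it.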

ah, good point. Actually, all/most rules assume gz

bluegenes avatar Jan 23 '19 19:01 bluegenes

On Wed, Jan 23, 2019 at 07:18:13PM +0000, Tessa Pierce wrote:

> ah, good point. Actually, all/most rules assume gz

well, trimmomatic could check the file type to see if it's gz. So in some sense it's trimmomatic's fault for "believing" the user. (khmer doesn't care what you name the file, for example.)

I think the key issue here is that get_data is the entry point to the rest of the workflow. So it could usefully do some extra checking.

ctb avatar Jan 23 '19 19:01 ctb
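For the other option raised in the thread (gzip the data when it is not already gzipped), an entry-point step could be sketched roughly as follows. This is an illustration only, not the elvers code: `link_or_compress` is a hypothetical name, and it assumes `dest` ends in `.gz` and does not already exist.

```python
import gzip
import os
import shutil


def link_or_compress(src, dest):
    """Make `dest` (a .gz path) hold gzip data, whatever `src` contains.

    Returns True if `src` was already gzipped and was symlinked,
    False if it had to be compressed into a new file.
    """
    with open(src, "rb") as handle:
        already_gzipped = handle.read(2) == b"\x1f\x8b"  # gzip magic bytes

    if already_gzipped:
        # No copy needed: the .gz name is truthful, so a link is enough.
        os.symlink(os.path.abspath(src), dest)
    else:
        # The "unnecessary copy" trade-off: compress so the name matches the content.
        with open(src, "rb") as raw, gzip.open(dest, "wb") as out:
            shutil.copyfileobj(raw, out)
    return already_gzipped
```

The cost is extra disk space and time for uncompressed inputs, but every later rule can then keep assuming gzipped data.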