Using compressed files as raw data
When using .gz or any other compressed fastq files as input, the script successfully finds the zipped files and decompresses the first one, but then fails to use it. The workaround is to put the DECOMPRESSED file name in the metadata file. I think this happens because on line 312 of banzai.sh, READ[j] still looks for the original input file ($CURRENT_FILE[j]) rather than the output of "${ZIPPER}" -d "${FILEPATH}". It could be solved in one of three ways: stressing in the documentation that columns file1 and file2 must contain the .fastq filename, deleting the last extension from the filename, or renaming the output from the zipper.
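To illustrate the "delete the last extension" option, here is a minimal sketch. The function name `decompressed_name` and the example file name are hypothetical and not taken from banzai.sh; the list of extensions is also an assumption about which compressors are in use.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of banzai.sh: map a possibly compressed
# file name to the name it should have after decompression.
decompressed_name() {
  local path="$1"
  case "$path" in
    # Assumed set of compression extensions; strip only the last one.
    *.gz|*.bz2|*.xz) printf '%s\n' "${path%.*}" ;;
    # Anything else is treated as already decompressed.
    *)               printf '%s\n' "$path" ;;
  esac
}

# Example: the metadata lists the compressed name, while later steps
# would use the decompressed name.
CURRENT_FILE="sample_R1.fastq.gz"
READ="$(decompressed_name "${CURRENT_FILE}")"
echo "${READ}"
```

With something like this, the metadata could keep listing the compressed names while the later steps refer to the decompressed file.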
Gotcha. Dealing with zipped versus unzipped files became a problem because compressed input was incompatible with one of the versions of PEAR. I will revisit this, and I think I will deal with it by requiring whichever version of PEAR supports compressed input. Cool?
Sounds good to me. It will only affect users with old versions of PEAR, right?
Yep, that's the assumption, but I will double check.
Note to self about what this entails:
- checking for consistency between metadata and actual filenames
- using zipped files directly as input to PEAR
- requiring PEAR be > v0.9.6
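A rough sketch of two of these checks, under stated assumptions: the function names `version_gt` and `files_exist_in` are hypothetical, the version comparison relies on `sort -V` (available in GNU coreutils), and the PEAR version string is hard-coded here because how banzai would obtain it is not covered in this thread.

```shell
#!/usr/bin/env bash
# Return success if version $1 is strictly greater than version $2.
# Relies on `sort -V` (version sort) being available.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Return success only if every file name given after $1 exists in
# directory $1; report each missing file on stderr.
files_exist_in() {
  local dir="$1"; shift
  local f missing=0
  for f in "$@"; do
    if [ ! -e "${dir}/${f}" ]; then
      echo "listed in metadata but not found: ${dir}/${f}" >&2
      missing=1
    fi
  done
  [ "${missing}" -eq 0 ]
}

# Hypothetical value; obtaining the installed PEAR version is left out.
PEAR_VERSION="0.9.10"
if version_gt "${PEAR_VERSION}" "0.9.6"; then
  echo "PEAR ${PEAR_VERSION} is new enough"
fi
```

A plain string comparison would rank 0.9.10 below 0.9.6, which is why the sketch uses a version-aware sort.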