Picard FilterSamReads writes its exit code to stdout
Summary
A user reported on the forum that running FilterSamReads through GATK with the output sent to /dev/stdout produces a malformed BAM file. After they submitted a bug report, I found that the tool's exit code ("Tool returned: 0") is also written to stdout, so it ends up inside the BAM file and corrupts it.
This request was created from a contribution made by rcorbett on February 03, 2021 23:47 UTC.
GATK Info
FilterSamReads 4.1.9.0 and 4.0.10.0
Command to stdout:
gatk FilterSamReads -I subsampled.bam -O /dev/stdout --READ_LIST_FILE read_names.txt --FILTER excludeReadList --VALIDATION_STRINGENCY SILENT --QUIET > test_stdout.bam
Log:
Using GATK jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar FilterSamReads -I subsampled.bam -O /dev/stdout --READ_LIST_FILE read_names.txt --FILTER excludeReadList --VALIDATION_STRINGENCY SILENT --QUIET
20:54:45.405 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_compression.so
INFO 2021-02-12 20:54:45 FilterSamReads Filtering [presorted=true] subsampled.bam -> OUTPUT=stdout [sortorder=coordinate]
INFO 2021-02-12 20:54:45 SAMFileWriterFactory Unknown file extension, assuming BAM format when writing file: file:///dev/stdout
INFO 2021-02-12 20:54:45 FilterSamReads 6 SAMRecords written to stdout
Check file:
gunzip -c -d -f test_stdout.bam | head -n 5
Tool returned:
0
?[[lW?m?$?^?q???k????zg?x}?s???mE?ޖ?r#U???ԑ/Qm'܄dkUM???????zCBB?!*?V*?#
<Q!QU?
BAM file written using -O:
gunzip -c -d -f test_outbam.bam | head -n 5
BAM?2@HD VN:1.6 SO:coordinate
@SQ SN:1 LN:249250621 AS:NCBI-Build-37 SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ SN:2 LN:243199373 AS:NCBI-Build-37 SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ SN:3 LN:198022430 AS:NCBI-Build-37 SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ SN:4 LN:191154276 AS:NCBI-Build-37 SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
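One way to confirm that the stray text sits ahead of the BAM data is to compare the first two bytes of each file. A BAM file is BGZF-compressed, so it should begin with the gzip magic bytes 1f 8b. Note also that `gunzip -c -f` passes unrecognized input through unchanged, which would explain why the rest of test_stdout.bam prints as raw binary. The following is only a minimal sketch: the helper name `starts_with_gzip_magic` is made up for illustration and is not part of GATK or Picard.

```shell
# Hypothetical helper (not part of GATK/Picard): succeeds if the file
# starts with the gzip/BGZF magic bytes 1f 8b, fails otherwise.
starts_with_gzip_magic() {
    # Read the first two bytes and print them as hex with no offsets or spaces.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

With the files from this report, I would expect `starts_with_gzip_magic test_outbam.bam` to succeed and `starts_with_gzip_magic test_stdout.bam` to fail, since the latter starts with the "Tool returned:" text instead of compressed data.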
This appears to have been discussed in https://github.com/broadinstitute/gatk/issues/4433 and https://github.com/broadinstitute/gatk/issues/4329, but the problem still exists in GATK 4.1.9.0. Please let me know if you want any of these test files.