
Picard FilterSamReads exit codes writing to stdout

Open GATKSupportTeam opened this issue 4 years ago • 2 comments

Summary

A user wrote in to the forum about running FilterSamReads through GATK with the output sent to stdout: the resulting BAM file has formatting issues. After they sent in a bug report, I found that the exit code is being written to stdout along with the BAM data, which corrupts the file.

This request was created from a contribution made by rcorbett on February 03, 2021 23:47 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360076905711-filterSamReads-stdout-format-error

GATK Info

FilterSamReads 4.1.9.0 and 4.0.10.0

Command to stdout:

gatk FilterSamReads -I subsampled.bam -O /dev/stdout --READ_LIST_FILE read_names.txt --FILTER excludeReadList --VALIDATION_STRINGENCY SILENT --QUIET > test_stdout.bam

Log:

Using GATK jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar FilterSamReads -I subsampled.bam -O /dev/stdout --READ_LIST_FILE read_names.txt --FILTER excludeReadList --VALIDATION_STRINGENCY SILENT --QUIET
20:54:45.405 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.1.9.0-SNAPSHOT-local.jar!/com/intel/gkl/native/libgkl_compression.so
INFO	2021-02-12 20:54:45	FilterSamReads	Filtering [presorted=true] subsampled.bam -> OUTPUT=stdout [sortorder=coordinate]
INFO	2021-02-12 20:54:45	SAMFileWriterFactory	Unknown file extension, assuming BAM format when writing file: file:///dev/stdout
INFO	2021-02-12 20:54:45	FilterSamReads	6 SAMRecords written to stdout

Check file: gunzip -c -d -f test_stdout.bam | head -n 5

Tool returned:
0

?[[lW?m?$?^?q???k????zg?x}?s???mE?ޖ?r#U???ԑ/Qm'܄dkUM???????zCBB?!*?V*?#
<Q!QU?

BAM file written using -O: gunzip -c -d -f test_outbam.bam | head -n 5

BAM?2@HD	VN:1.6	SO:coordinate
@SQ	SN:1	LN:249250621	AS:NCBI-Build-37	SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ	SN:2	LN:243199373	AS:NCBI-Build-37	SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ	SN:3	LN:198022430	AS:NCBI-Build-37	SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
@SQ	SN:4	LN:191154276	AS:NCBI-Build-37	SP:Homo sapienUR:http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome/GRCh37-lite.fa
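
The difference between the two outputs above can also be checked programmatically. The following is a minimal sketch, not part of GATK or Picard: it assumes the stray "Tool returned:" text is a plain-text prefix sitting in front of an otherwise intact gzip/BGZF stream, as the `head` output suggests. The function names and the exact prefix bytes used in the demo are hypothetical.

```python
import gzip

# Every gzip member (and therefore every BGZF block in a BAM file)
# starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def leading_junk(data: bytes) -> bytes:
    """Return any bytes that precede the first gzip magic number.

    An empty result means the data starts with a gzip stream, as a
    well-formed BAM file should. If no magic number is found at all,
    the whole input is returned.
    """
    idx = data.find(GZIP_MAGIC)
    return data if idx == -1 else data[:idx]


def strip_leading_junk(data: bytes) -> bytes:
    """Drop everything before the first gzip magic number (hypothetical workaround)."""
    idx = data.find(GZIP_MAGIC)
    return b"" if idx == -1 else data[idx:]


# Demo with in-memory data: a small gzip payload stands in for a real BAM,
# and the prefix is an assumed rendering of the text seen in the report.
clean = gzip.compress(b"BAM\x01")
polluted = b"Tool returned:\n0\n" + clean

print(leading_junk(clean))     # empty: stream starts with gzip magic
print(leading_junk(polluted))  # the stray text written ahead of the BAM data
```

On a real file, passing the first few hundred bytes of test_stdout.bam to `leading_junk` would show whether anything was written to stdout ahead of the BAM stream. Stripping the prefix is only a stopgap; whether the remaining stream is a complete, valid BAM is not verified here.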

This appears to have been discussed at https://github.com/broadinstitute/gatk/issues/4433 and https://github.com/broadinstitute/gatk/issues/4329, but the problem still exists in GATK 4.1.9.0. Please let me know if you want any of these test files.

GATKSupportTeam, Feb 12 '21 21:02