
Htseq_count can fail with RNA STAR input

Open jennaj opened this issue 7 years ago • 6 comments

Tracking ticket: once the problem is resolved and Main is updated (as needed), we can close this out.

Workaround: use HISAT2 instead of RNA STAR.

Example error

Fatal error: Unknown error occured
[bam_sort_core] merging from 32 files...
100000 GFF lines processed.
200000 GFF lines processed.
300000 GFF lines processed.
400000 GFF lines processed.
500000 GFF lines processed.
600000 GFF lines processed.
700000 GFF lines processed.
741207 GFF lines processed.
Error occured when processing SAM input (record #894 in file name_sorted_alignment.bam):
  unsigned byte integer is less than minimum
  [Exception type: OverflowError, raised in csamtools.pyx:2308]

Potentially the root issue: https://www.biostars.org/p/147487/
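
To narrow this down, it may help to look at the optional fields of the failing record (#894 above). The message says an unsigned byte was handed a value below its minimum, so one guess is that a negative integer in an optional tag is what trips the older pysam. Below is a rough sketch, not a confirmed diagnosis: the function name `negative_int_tags` and the sample record are made up for illustration, and the input is assumed to be plain SAM text lines (for example from `samtools view name_sorted_alignment.bam`).

```python
def negative_int_tags(sam_line):
    """Return (tag, type, negative_values) for optional fields of one SAM record.

    Only integer-typed fields are checked: 'i' scalars and 'B' arrays with
    an integer subtype. This is a diagnostic sketch, not a SAM validator.
    """
    fields = sam_line.rstrip("\n").split("\t")
    hits = []
    for field in fields[11:]:  # optional fields start at column 12
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue
        tag, typ, value = parts
        if typ == "i":
            nums = [value]
        elif typ == "B":
            subtype, *nums = value.split(",")
            if subtype == "f":  # float arrays are not relevant here
                continue
        else:
            continue
        try:
            values = [int(n) for n in nums]
        except ValueError:
            continue  # malformed field, skip rather than guess
        negatives = [v for v in values if v < 0]
        if negatives:
            hits.append((tag, typ, negatives))
    return hits


# Hypothetical record, invented for illustration (not taken from the failing BAM).
example = "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\tjM:B:c,-1\tjI:B:i,-1"
print(negative_int_tags(example))
```

If the failing record turns out to carry such tags and HISAT2 output does not, that would be consistent with the workaround above, but it would still need confirming against the pysam version bundled with htseq.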

Comments from @natefoo:

I believe the message is coming from the version of pysam in use by htseq (not samtools as used by the tool or pysam in the Galaxy framework). But it looks like we are using the latest htseq dependency supported by the IUC tool, 0.6.1.post1 (even though we're still using the tool from Lance's repo):

https://github.com/galaxyproject/tools-iuc/blob/6f82cbc16053cecdf58d15a8d0fcdeac7991abaf/tools/htseq_count/htseq-count.xml#L4

I'd pass this on to the IUC to see if they have any ideas.

jennaj, Jan 25 '18 21:01