
Sequences not multiple of 3: poor stderr information

Open: MatteoSchiavinato opened this issue 7 years ago • 1 comment

With one of the file sets I'm using in the analysis, I get this warning on stderr:

/software/python/Python2.7/lib/python2.7/site-packages/Bio/Seq.py:2071: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.

I checked my files. Going by the exon features, many of the coding regions in the GTF file have a length that is not a multiple of 3, because exons may contain UTR. Going by the CDS features, the lengths are multiples of 3. How does the program actually handle this? Does it read the exons, join them, and then estimate the coding region by searching for start and stop codons?
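For anyone who wants to reproduce the check described above, here is a minimal sketch of how the per-transcript lengths could be compared. It is not how OrthoFiller reads the GTF. The function names `feature_lengths` and `not_multiple_of_three` and the two example rows are made up for illustration, and it assumes a standard 9-column, tab-separated GTF with a `transcript_id "..."` attribute.

```python
import re
from collections import defaultdict


def feature_lengths(gtf_lines, feature):
    """Sum the lengths of all rows of one feature type per transcript_id."""
    totals = defaultdict(int)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 3 holds the feature type (exon, CDS, ...).
        if len(cols) < 9 or cols[2].lower() != feature.lower():
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match is None:
            continue
        start, end = int(cols[3]), int(cols[4])
        # GTF coordinates are 1-based and inclusive.
        totals[match.group(1)] += end - start + 1
    return dict(totals)


def not_multiple_of_three(lengths):
    """Return the transcript ids whose total length is not divisible by 3."""
    return sorted(t for t, n in lengths.items() if n % 3 != 0)


# Hypothetical example rows: one 100 bp exon containing a 90 bp CDS.
example = [
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t11\t100\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
]

print(not_multiple_of_three(feature_lengths(example, "exon")))
print(not_multiple_of_three(feature_lengths(example, "CDS")))
```

On a real file, passing an open file handle as `gtf_lines` should work the same way, since the function only iterates over lines.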

Also, I think it would be helpful for the end-user to see the gene name that generated the Biopython warning in the standard error, to do an immediate check and perhaps grep it out of the file.
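One possible way to get the gene name next to the warning, sketched here with the standard library `warnings` module only, is to record warnings around each translation call and re-emit them with the identifier attached. This is an illustration and not OrthoFiller's code. `translate_reporting_gene` and `toy_translate` are hypothetical names, and `toy_translate` is only a stand-in that mimics the partial-codon warning so the example runs without Biopython installed.

```python
import sys
import warnings


def translate_reporting_gene(gene_id, sequence, translate, stream=sys.stderr):
    """Call translate(sequence) and prefix any warning it raises with gene_id.

    Returns the translation result and the list of messages that were written.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once elsewhere.
        warnings.simplefilter("always")
        protein = translate(sequence)
    messages = [
        "[%s] %s: %s" % (gene_id, w.category.__name__, w.message)
        for w in caught
    ]
    for message in messages:
        stream.write(message + "\n")
    return protein, messages


def toy_translate(sequence):
    """Stand-in translator (assumption): warns on partial codons like the log above."""
    if len(sequence) % 3 != 0:
        warnings.warn("Partial codon, len(sequence) not a multiple of three.")
    return "X" * (len(sequence) // 3)


protein, messages = translate_reporting_gene("geneA", "ATGAA", toy_translate)
```

With Biopython, the `translate` argument could presumably be something like `lambda s: str(Seq(s).translate())`, with `Seq` imported from `Bio.Seq`. The wrapper does not depend on where the warning originates, so no change to the Biopython code would be needed.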

MatteoSchiavinato, May 04 '17 10:05

To be specific, this warning comes from BioPython itself. I'm not sure much can be done about it without altering the BioPython code.

xonq, Feb 01 '20 18:02