RuntimeError: non-ACGTN characters not supported
Got this error with MMETSP transcriptomes created with a different pipeline than ours. Not sure how you feel about fixing it, but it's a thing.
e.g. file:
/mnt/research/ged/lisa/mmetsp/imicrobe/cds/MMETSP0784.cds.fa.fixed.fa
--- Running annotate!
Transcriptome file: MMETSP0095.cds.fa.fixed.fa
Output directory: /mnt/imicrobe/MMETSP0095.cds.fa.fixed.fa.dammit
[ ] MMETSP0095.cds.fa.fixed.fa
[ ] transcriptome_stats:MMETSP0095.cds.fa.fixed.fa
Some tasks failed![dammit.annotate:ERROR]
TaskError - taskid:transcriptome_stats:MMETSP0095.cds.fa.fixed.fa[dammit.annotate:ERROR]
PythonAction Error
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/doit/action.py", line 383, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/dammit/tasks.py", line 747, in cmd
    lens, uniq_kmers, gc_perc, n_amb = parse(transcriptome)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/dammit/tasks.py", line 713, in parse
    .format(contig.name, contig.sequence))
RuntimeError: non-ACGTN characters not supported. Offending transcript:
>Transcript_0
XTCGAGGCTGATCTCGAAGCTCAGGTCTTCAACTGCCCGTTGCAAATTGCCAAGAAGGGGGCAGTGCAGACTTCAGTCGCCGTGGGGGCCGCAGTGCGCGTCGGCAACATCACCGTGACCAAGCTTGGCTTTGCCACRAAAGTCGACGGCGTTGYACAACYGAACGGGCTGCATGAGTTCTCCGGAGGCATCACYGTRGACATCAGCGAYGAGTTCACCCTTGTCAGGGGACCGTTGGTTGCGGAGGGCTTYGCTGAGATTCGCATGAGCGCGAGAGCTATGGTGAGCTCCCCGACCACGTGGGTCCACAGTGTCTACTCACAGTTGCCGAGCRCGCTCATCGGCAGX
bad
[dammit.annotate:ERROR]
Huh. Almost none of dammit's downstream programs support full IUPAC ambiguity codes, and X isn't even proper IUPAC. So I think dammit is behaving exactly as it should in this case. It should probably support U's though, by changing them into T's.
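
For illustration, here is a rough sketch of what the U-to-T handling could look like alongside the existing ACGTN check. This is not dammit's actual code: `normalize_sequence` and `find_unsupported` are hypothetical names, and uppercasing the input first is an assumption on my part.

```python
import re

# Hypothetical helpers for illustration only; not dammit's implementation.
# Matches any character outside the supported ACGTN alphabet.
NON_ACGTN = re.compile(r"[^ACGTN]")


def normalize_sequence(sequence):
    """Uppercase the sequence and convert RNA-style U to T."""
    return sequence.upper().replace("U", "T")


def find_unsupported(sequence):
    """Return the sorted unique characters still outside ACGTN after normalization."""
    return sorted(set(NON_ACGTN.findall(normalize_sequence(sequence))))


if __name__ == "__main__":
    # U is converted to T, so this sequence would pass the check.
    print(find_unsupported("ACGUN"))
    # IUPAC ambiguity codes (R, Y) and X would still be reported.
    print(find_unsupported("XTCGARY"))
```

With something like this, a transcript containing U would pass, while the offending transcript above would still be rejected because of its X, R, and Y characters.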