prokka
Strange characters in intermediate files cause .gbf output generation to fail
Hello!
I used prokka for annotation of a genome. Here is my command:
prokka 1.fna --addgenes --addmrna --cpus 16 --genus Wolbachia --usegenus --species pipientis --strain wMelCS
It yields the well-known (#179) problem:
[16:06:23] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka' -Z PROKKA_12072021\/PROKKA_12072021\.err -i PROKKA_12072021\/PROKKA_12072021\.fsa 2> /dev/null
[16:06:23] Deleting unwanted file: PROKKA_12072021/errorsummary.val
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.dr
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.fixedproducts
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.ecn
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.val
[16:06:23] Repairing broken .GBK output that tbl2asn produces...
[16:06:23] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < PROKKA_12072021\/PROKKA_12072021\.gbf > PROKKA_12072021\/PROKKA_12072021\.gbk
sh: 1: cannot open PROKKA_12072021/PROKKA_12072021.gbf: No such file
[16:06:23] Could not run command: sed 's/COORDINATES: profile/COORDINATES:profile/' < PROKKA_12072021\/PROKKA_12072021\.gbf > PROKKA_12072021\/PROKKA_12072021\.gbk
Digging into PROKKA_12072021/PROKKA_12072021.fsa reveals that it contains lines like this:
>JACSNK010000002.1 [gcode=11] [organism=Wolbachia pipientis] [strain=wMelCS^M]
Notice the '^M' character, which is apparently a carriage return. I must add that there are no '^M's in 1.fna! So these carriage returns were introduced while PROKKA was running. I played around with the options and found that the output for just
prokka 1.fna --addgenes --addmrna --cpus 16 --genus Wolbachia --usegenus
seems to be fine: the annotation finished successfully. So it seems to be an issue with specifying the strain and/or species, and it should be considered a bug. I hope it will be addressed in future versions of PROKKA. Thank you for a very useful piece of software, by the way!
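For anyone trying to reproduce or work around this, here is a minimal shell sketch for locating and stripping carriage returns. The helper names count_cr_lines and strip_cr, and the $strain variable in the usage comments, are hypothetical and not part of PROKKA. The sketch also rests on an unconfirmed assumption: since '^M' appears right after wMelCS, the last argument of the failing command, the carriage return may have arrived with the --strain value itself (for example from a script saved with Windows line endings) rather than being produced inside PROKKA.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of PROKKA.

# Print the number of lines in a file that contain a carriage return
# (the character that less and cat -A display as ^M).
count_cr_lines() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps
    # the function usable in scripts that run under "set -e".
    grep -c "$(printf '\r')" "$1" || true
}

# Print a value with every carriage return removed.
strip_cr() {
    printf '%s' "$1" | tr -d '\r'
}

# Example usage (assumes the files from this report exist):
#   count_cr_lines 1.fna
#   count_cr_lines PROKKA_12072021/PROKKA_12072021.fsa
#   strain=$(strip_cr "$strain")   # $strain is a hypothetical variable
#   prokka 1.fna --addgenes --addmrna --cpus 16 --genus Wolbachia \
#       --usegenus --species pipientis --strain "$strain"
```

If count_cr_lines reports 0 for 1.fna but a non-zero count for the .fsa file, the carriage return entered somewhere between the command line and the intermediate file, which narrows down where to look.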