Augustus
UTF-8 with BOM encoded training file
Augustus training jobs fail when the training FASTA file contains a single sequence and is encoded in UTF-8 with BOM. The subroutine formatDetector() in https://github.com/Gaius-Augustus/Augustus/blob/master/scripts/helpMod.pm#L153 requires '>' as the first character, so it fails when a BOM precedes the header and only one sequence is defined. I have not checked whether any other code is affected.
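To illustrate the problem described above, here is a minimal Python sketch. It is not the Augustus code (formatDetector() is written in Perl), and the helper names `strip_utf8_bom` and `looks_like_fasta` are hypothetical. It only shows why a check for '>' as the first character misses a file that starts with the three BOM bytes, and how dropping those bytes first could avoid that.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def looks_like_fasta(data: bytes) -> bool:
    """Hypothetical check: '>' is the first character once the BOM is removed."""
    return strip_utf8_bom(data).startswith(b">")


# A single-sequence FASTA file saved as UTF-8 with BOM.
with_bom = codecs.BOM_UTF8 + b">seq1\nACGTACGT\n"

# Naive check on the raw bytes: the BOM hides the '>' header marker.
print(with_bom.startswith(b">"))

# Check after stripping the BOM.
print(looks_like_fasta(with_bom))
```

As a workaround until the detection is fixed, re-saving the training file as UTF-8 without BOM should have the same effect as the stripping step in this sketch.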
By the way, the distinction between "fasta-prot" and "fasta-dna" doesn't work either: even DNA FASTA files are detected as "fasta-prot".