
UTF-8 with BOM encoded training file

Open hmehlan opened this issue 4 years ago • 0 comments

Augustus training jobs fail when the training FASTA file contains a single sequence and is encoded in UTF-8 with a BOM. The subroutine formatDetector() in https://github.com/Gaius-Augustus/Augustus/blob/master/scripts/helpMod.pm#L153 requires '>' as the first character, but fails when a BOM is found and only one sequence is defined. I haven't checked whether any other code is affected.
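To illustrate the failure mode, here is a minimal Python sketch (not the Perl code from helpMod.pm, and not a proposed patch). It assumes the check boils down to "the first character must be '>'" and shows how removing an optional UTF-8 BOM first makes that check pass. The function name `has_fasta_header` is hypothetical.

```python
import codecs


def has_fasta_header(raw: bytes) -> bool:
    """Return True if the data starts with '>' once an optional UTF-8 BOM is removed.

    Hypothetical helper for illustration; it does not reproduce formatDetector().
    """
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.startswith(b">")


# A single-sequence file saved as "UTF-8 with BOM": the first byte is not '>'.
with_bom = codecs.BOM_UTF8 + b">seq1\nACGTACGT\n"
print(with_bom.startswith(b">"))   # naive first-character check
print(has_fasta_header(with_bom))  # BOM-tolerant check
```

In Python, opening the file with `encoding="utf-8-sig"` has the same effect, because that codec drops a leading BOM on read.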

By the way, the distinction between "fasta-prot" and "fasta-dna" doesn't work: even DNA FASTA files are detected as "fasta-prot".
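For reference, one common way to tell the two apart is to look at the fraction of nucleotide letters in the sequence. The Python sketch below is only an assumption about how such a heuristic could look; it is not how Augustus decides, and the name `guess_fasta_type`, the letter set, and the 0.9 threshold are all hypothetical choices.

```python
def guess_fasta_type(sequence: str, threshold: float = 0.9) -> str:
    """Guess "fasta-dna" or "fasta-prot" from the share of nucleotide letters.

    Hypothetical heuristic for illustration only.
    """
    # Ignore gaps, digits, and whitespace; compare letters case-insensitively.
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        return "unknown"
    nucleotide_like = sum(c in "ACGTUN" for c in residues)
    if nucleotide_like / len(residues) >= threshold:
        return "fasta-dna"
    return "fasta-prot"


print(guess_fasta_type("ACGTACGTNNACGT"))
print(guess_fasta_type("MKVLAAGIVGLLLAQ"))
```

A ratio-based test like this can misclassify short or low-complexity protein sequences, so the threshold would need tuning against real data.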

hmehlan, Feb 28 '20 13:02