ggcat
WARNING: No newline at ending of file 'nonewline.fasta' behavior
Hi, I've been running into situations where the warning WARNING: No newline at ending of file 'nonewline.fasta' from ggcat causes problems in tools that consume ggcat output, because the generated output is incorrect or differs from what those tools expect.
Maybe it would be better to turn this warning into an error and abort, as a missing newline at the end of an input file typically indicates that something went wrong earlier while generating the inputs, and the user would likely want to fix that. As it stands, it seems to break most tools that use ggcat, since they handle broken inputs differently than ggcat does, so it eventually causes an error anyway.
Hi, this was initially the intended behavior: with possibly tens of thousands of .fasta files, it would have been frustrating to restart the whole analysis because of a single corrupted .fasta file.
In what way are the outputs different? It might be an easy fix to align ggcat with the existing behavior of those tools when a file is corrupted.
If it makes sense, I can provide an additional flag to treat all warnings as errors and abort the computation.
I've encountered this in Themisto and Fulgor, which both call ggcat via the C++ API to build the unitigs for further processing. In Themisto, the warning results in a broken index that crashes during query time, and in Fulgor it corrupts the index so that some targets will never have any hits against them. So at least in the C++ API this case should be an error, as the results are unusable.
Currently the warning is printed to stderr using eprintln!, so it cannot be detected from within the calling code during construction. Some kind of return code indicating that this occurred would be helpful if you think aborting construction is not a good idea.
Also, a file with this problem has typically been written by another tool that crashed mid-write, so it is something that should be taken care of before construction. An error would help here.
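Until the library reports this condition itself, a caller could check its inputs before starting construction. The following is a minimal sketch using only the Rust standard library; it is not part of ggcat, and the function names ends_with_newline and find_unterminated are hypothetical. It assumes uncompressed input files, since the last byte of a gzipped .fasta says nothing about the decompressed content.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::{Path, PathBuf};

/// Returns Ok(true) if the file is non-empty and its last byte is b'\n'.
/// An empty file is reported as false, since it has no terminating newline.
/// (Hypothetical helper, not a ggcat function.)
pub fn ends_with_newline(path: &Path) -> io::Result<bool> {
    let mut file = File::open(path)?;
    if file.metadata()?.len() == 0 {
        return Ok(false);
    }
    // Read only the final byte instead of loading the whole file.
    file.seek(SeekFrom::End(-1))?;
    let mut last = [0u8; 1];
    file.read_exact(&mut last)?;
    Ok(last[0] == b'\n')
}

/// Collects every input path that lacks a trailing newline, so the caller
/// can refuse to start construction if the returned list is non-empty.
/// (Hypothetical helper, not a ggcat function.)
pub fn find_unterminated(paths: &[PathBuf]) -> io::Result<Vec<PathBuf>> {
    let mut unterminated = Vec::new();
    for path in paths {
        if !ends_with_newline(path)? {
            unterminated.push(path.clone());
        }
    }
    Ok(unterminated)
}
```

A caller would run find_unterminated over the full input list and abort, or regenerate the listed files, before handing anything to the unitig construction step.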
Hi, I've improved the logging: all log data can now be processed with a callback, so it is possible to terminate the program if such a warning occurs.
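For readers unfamiliar with the pattern, here is a rough sketch of how a callback-based log hook can turn a warning into a hard failure. It does not reproduce ggcat's actual logging API: the Level enum, the Logger struct, and build_strict are all hypothetical names invented for illustration, and the inputs are in-memory strings rather than real files.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical severity levels; not ggcat's actual types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Info,
    Warning,
    Error,
}

/// Hypothetical logger that forwards every message to a user-supplied
/// callback instead of writing directly to stderr.
pub struct Logger {
    callback: Box<dyn Fn(Level, &str) + Send + Sync>,
}

impl Logger {
    pub fn new(callback: impl Fn(Level, &str) + Send + Sync + 'static) -> Self {
        Logger { callback: Box::new(callback) }
    }

    pub fn log(&self, level: Level, message: &str) {
        (self.callback)(level, message);
    }
}

/// Stand-in for a construction step over (file name, contents) pairs.
/// The callback raises a shared flag on any warning; the build returns Err
/// if the flag was raised, otherwise the number of inputs processed.
pub fn build_strict(inputs: &[(&str, &str)]) -> Result<usize, String> {
    let saw_warning = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&saw_warning);
    let logger = Logger::new(move |level, message| {
        if level == Level::Warning {
            flag.store(true, Ordering::SeqCst);
        }
        // Still print the message, as the original eprintln! did.
        eprintln!("[{:?}] {}", level, message);
    });

    for (name, contents) in inputs {
        if !contents.ends_with('\n') {
            logger.log(
                Level::Warning,
                &format!("No newline at ending of file '{}'", name),
            );
        }
    }

    if saw_warning.load(Ordering::SeqCst) {
        Err("aborting: at least one warning was reported".to_string())
    } else {
        Ok(inputs.len())
    }
}
```

The flag is checked after all inputs have been scanned, so every problematic file is reported before the build is rejected. A callback that should stop at the first warning could call std::process::exit instead of setting a flag.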
Thanks!!