emg-viral-pipeline
Add VirSorter2 process as an alternative to VirSorter
Added VirSorter2 as a new process that can be used instead of VirSorter via the new flag --use_virsort2, and adapted the downstream processes (the parse process and the GFF generation script) to work with the different results of VirSorter2. For example, VirSorter2 reports a confidence score (0-1) for every viral hit in the input data instead of a category.
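To illustrate the difference for the parse step, here is a minimal sketch of filtering on a score rather than a category. It is not the code in this PR: the function name `filter_by_score`, the column names `seqname` and `max_score`, the cutoff of 0.9, and the example rows are all assumptions made for illustration.

```python
import csv
import io


def filter_by_score(tsv_text, min_score=0.9, name_col="seqname", score_col="max_score"):
    """Return the names of sequences whose score is at least min_score.

    Column names and the default cutoff are hypothetical; adjust them to
    the actual report layout and the threshold the pipeline should use.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[name_col] for row in reader if float(row[score_col]) >= min_score]


# Made-up example report with two contigs.
example = "seqname\tmax_score\ncontig_1\t0.97\ncontig_2\t0.42\n"
high_confidence = filter_by_score(example)
```

With a category-based report the parse step would branch on a fixed set of labels; with a score it needs an explicit cutoff, which is why the downstream processes had to change.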
I also had to change the GFF generation process, because I kept running into file name collisions whenever more than one input sample reported significant viral sequences: the VirSorter and VirFinder reports have the same file name for every sample. This might also be solved by running the GFF script once per sample containing viral sequences instead of once for all samples, but I already asked about this in #127, so maybe I'm missing something here!
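For reference, the general idea of avoiding such collisions can be sketched as follows: give each report a sample-specific name before all samples are collected in one directory. This is only an illustration of the approach, not the change made in this PR; `stage_reports` and the `<sample_id>_<file name>` naming scheme are hypothetical.

```python
import shutil
from pathlib import Path


def stage_reports(reports, dest):
    """Copy per-sample report files into dest, prefixing each with its sample ID.

    reports: dict mapping a sample ID to a list of report file paths.
    Returns the list of staged paths. The naming scheme is an assumption.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    staged = []
    for sample_id, paths in reports.items():
        for path in paths:
            # Identical base names from different samples no longer clash.
            target = dest / f"{sample_id}_{Path(path).name}"
            shutil.copy(path, target)
            staged.append(target)
    return staged
```

Running the GFF script once per sample would avoid the problem without renaming, at the cost of one script invocation per sample.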