
Multiple query genes support for SHOOT (with EPA)

Open guignonv opened this issue 1 year ago • 0 comments

This version supports multiple query genes and provides outputs by matching OGs.

There are small changes in the output files:

  • added a ".sh.map" file giving the correspondence between original query names and processed names (when a ".sh.cleaned" file is created)
  • ".assign.txt" now has an additional column containing the comma-separated list of query genes matching the OG (the corresponding scores are all reported as well, also comma-separated)
  • ".fa.sh.msa.fa", ".fa.sh.msa.fa.query.fa", ".sh.msa.fa.ref.fa", ".fa.shoot.tree", ".sh.orthologs.tsv" and ".fa.sh.msa.fa_epa" are now prefixed with their corresponding OG names
  • ".sh.orthologs.tsv" has a new "Query" column added as first column to report the corresponding query gene
  • ".jplace" files include all the genes matching a given OG
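
As a rough illustration of the new ".assign.txt" content, here is a minimal sketch of how one line could be read. The tab-separated layout and the column positions (`og_col`, `score_col`, `genes_col`) are assumptions made for this example and should be checked against a real output file; `parse_assign_line` is a hypothetical helper, not part of SHOOT.

```python
def parse_assign_line(line, og_col=0, score_col=1, genes_col=2):
    """Split one ".assign.txt" line into (OG name, {query gene: score}).

    Assumption: fields are tab-separated and the column indices given as
    defaults match the file; adjust them to the actual layout.
    """
    fields = line.rstrip("\n").split("\t")
    # Both the gene list and the score list are comma-separated.
    genes = fields[genes_col].split(",")
    scores = [float(s) for s in fields[score_col].split(",")]
    if len(genes) != len(scores):
        raise ValueError("gene and score lists differ in length")
    return fields[og_col], dict(zip(genes, scores))
```

The column indices are kept as parameters so the helper can be adapted once the real column order is confirmed.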

Basically, all the query genes are grouped by matching OG, and each group is then placed into its OG as a whole (and not one gene at a time).
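
The grouping step described above can be sketched as follows. This is only an illustration of the idea, not the code from this change: `group_queries_by_og` is a hypothetical helper, and the input is assumed to be a mapping from each query gene to the name of its best-matching OG.

```python
from collections import defaultdict


def group_queries_by_og(best_og):
    """Group query genes by their matching OG.

    best_og: dict mapping query gene name -> OG name (assumed input shape).
    Returns a dict mapping OG name -> list of query genes, in input order.
    """
    groups = defaultdict(list)
    for gene, og in best_og.items():
        groups[og].append(gene)
    return dict(groups)


# Each OG then gets a single placement run for its whole group of genes,
# instead of one run per query gene.
for og, genes in group_queries_by_og(
    {"g1": "OG1", "g2": "OG2", "g3": "OG1"}
).items():
    print(og, ",".join(genes))
```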

It may need to be tested a bit more extensively by others who have more datasets than I do.

guignonv, May 26 '23 14:05