allow lineage csv to contain additional genomes

Open bluegenes opened this issue 4 years ago • 0 comments

Currently, if a genome is present in the lineages file but not in the genomes txt file, you get the following error:

charcoal run newproject.conf -n
** WARNING: lineage was provided for unknown genome TARA_RED_MAG_00125.fa
** in provided lineages file example-genomes/provided-lineages.csv
** (TARA_RED_MAG_00125.fa not in newproject.genome-list.txt)
SystemExit in line 94 of /home/ntpierce/charcoal/charcoal/Snakefile:
-1
  File "/home/ntpierce/charcoal/charcoal/Snakefile", line 94, in <module>
Error in snakemake invocation: Command '['snakemake', '-s', '/home/ntpierce/charcoal/charcoal/Snakefile', '--use-conda', '-j', '1', '-n', '--configfile', '/home/ntpierce/charcoal/charcoal/conf/defaults.conf', '/home/ntpierce/charcoal/charcoal/conf/system.conf', 'newproject.conf']' returned non-zero exit status 1.

(here I just added a random genome name to example-genomes/provided-lineages.csv)

Specifically, this was annoying to me because my test subset wasn't just a head of my files: I wanted to include some non-euks and some euks. It's not a big deal to do some additional grepping for test sets, but I think it would be worthwhile to let the lineages file act as a database of lineages rather than an exact match to the genome list. This was also my intuition when looking at the required files, so it might be intuitive for other folks too, depending on how weird you think I am :).
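To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not charcoal's actual code: the function name `filter_lineages` is made up, and I'm assuming each row of the lineages CSV is a genome name followed by its lineage columns. Extra genomes are collected and reported rather than causing an exit.

```python
import csv
import io


def filter_lineages(lineage_file, genome_names):
    """Keep lineage rows for known genomes; collect the rest instead of exiting.

    Hypothetical helper. Assumes column 0 of each CSV row is the genome name
    and the remaining columns are the lineage.
    """
    wanted = set(genome_names)
    kept, extra = {}, []
    for row in csv.reader(lineage_file):
        if not row:
            continue  # skip blank lines
        name = row[0]
        if name in wanted:
            kept[name] = row[1:]
        else:
            extra.append(name)
    return kept, extra


# Made-up example data: one genome in the project, one extra in the lineages file.
demo = io.StringIO(
    "a.fa,d__Bacteria,p__Proteobacteria\n"
    "TARA_RED_MAG_00125.fa,d__Eukaryota,\n"
)
kept, extra = filter_lineages(demo, ["a.fa", "b.fa"])
for name in extra:
    print(f"** NOTE: ignoring lineage for genome not in this project: {name}")
```

With something like this, the unknown-genome case becomes an informational note, and the same lineages file can be reused across test subsets.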

bluegenes, Jun 05 '20 15:06