FastANI

Multi-genome reference and query FASTAs for many-to-many queries

Open ryneches opened this issue 2 years ago • 0 comments

When comparing many small genomes, creating an individual file for each genome is often impractical. For example, IMG/VR v4.1 contains 5,576,197 viral genomes. Most file systems cannot comfortably hold that many files, particularly in HPC environments, where network file systems like NFS and Lustre are usually deployed.

For this situation, how about supplying a single FASTA file for all contigs, and a query.txt and reference.txt structured something like this?

{genome_name}\t{contig_1},{contig_2},{contig_3}....\n
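
To make the proposed layout concrete, here is a minimal Python sketch of how such a mapping file could be read and used to group contigs from one combined FASTA by genome. This is not an existing FastANI feature or API. The function names (`parse_genome_map`, `read_fasta`, `group_contigs_by_genome`) are hypothetical, and the sketch assumes contig IDs are the first whitespace-delimited token of each FASTA header and contain no commas or tabs.

```python
import io
from collections import defaultdict


def parse_genome_map(handle):
    """Parse lines of the form '{genome_name}<TAB>{contig_1},{contig_2},...'.

    Hypothetical helper for the proposed query.txt / reference.txt layout.
    """
    genome_to_contigs = {}
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        name, sep, contig_field = line.partition("\t")
        if not sep or not name or not contig_field:
            raise ValueError(
                f"line {line_no}: expected 'genome<TAB>contig,contig,...'"
            )
        genome_to_contigs[name] = contig_field.split(",")
    return genome_to_contigs


def read_fasta(handle):
    """Yield (contig_id, sequence) pairs from a FASTA handle."""
    contig_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if contig_id is not None:
                yield contig_id, "".join(chunks)
            # Assumption: the contig ID is the first token of the header.
            parts = line[1:].split()
            contig_id, chunks = (parts[0] if parts else ""), []
        elif line and contig_id is not None:
            chunks.append(line)
    if contig_id is not None:
        yield contig_id, "".join(chunks)


def group_contigs_by_genome(fasta_handle, genome_to_contigs):
    """Return {genome_name: {contig_id: sequence}} for the listed contigs."""
    contig_to_genome = {
        contig: genome
        for genome, contigs in genome_to_contigs.items()
        for contig in contigs
    }
    genomes = defaultdict(dict)
    for contig_id, seq in read_fasta(fasta_handle):
        genome = contig_to_genome.get(contig_id)
        if genome is not None:  # contigs not listed in the map are ignored
            genomes[genome][contig_id] = seq
    return dict(genomes)


if __name__ == "__main__":
    mapping = parse_genome_map(io.StringIO("virus_A\tc1,c2\nvirus_B\tc3\n"))
    fasta = io.StringIO(">c1\nACGT\n>c2\nGG\n>c3\nTTT\n")
    for genome, contigs in group_contigs_by_genome(fasta, mapping).items():
        print(genome, sorted(contigs))
```

The sketch holds every listed sequence in memory, which would not scale to millions of genomes as written; it is only meant to illustrate the file format, not how the grouping would be done inside FastANI.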

ryneches, Oct 03 '23 07:10