pyani
Add `pyani evolve` command.
Summary:
We need a way to generate benchmark test data to ensure consistency and accuracy of pyani output.
Description:
The initial plan is to take a single genome sequence as input (this may be random...) and an accompanying network that represents the evolution of that sequence. Each edge describes a process applied to the genome at the edge's source node, and can be any of several optional processes (with appropriate parameterisation):
- random substitution
- inversion
- gain/loss of sequence from outside the network
- HGT within the network
Starting from the input genome, these processes are applied in the order and at the positions specified by the graph.
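To make the idea concrete, here is a minimal sketch of how edge processes could be applied along a network. It is not the code on the evolve branch and not part of the pyani API: the function names (`substitute`, `invert`, `gain`, `lose`, `hgt`, `evolve`), their parameters, and the edge-list format are all hypothetical. It also assumes a simplified network in which each node has one vertical parent, edges are listed so that every parent and HGT donor is generated before it is used, and each event affects a single contiguous segment.

```python
import random

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def substitute(seq, rng, rate=0.01):
    """Replace each base with a different base with probability `rate`."""
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
        for base in seq
    )


def invert(seq, rng, length=50):
    """Reverse-complement one randomly placed segment of `length` bases."""
    length = min(length, len(seq))
    start = rng.randrange(len(seq) - length + 1)
    segment = seq[start:start + length].translate(COMPLEMENT)[::-1]
    return seq[:start] + segment + seq[start + length:]


def gain(seq, rng, length=50):
    """Insert random sequence (from outside the network) at a random position."""
    pos = rng.randrange(len(seq) + 1)
    new = "".join(rng.choice(BASES) for _ in range(length))
    return seq[:pos] + new + seq[pos:]


def lose(seq, rng, length=50):
    """Delete one randomly placed segment of `length` bases."""
    length = min(length, len(seq))
    start = rng.randrange(len(seq) - length + 1)
    return seq[:start] + seq[start + length:]


def hgt(seq, rng, donor, length=50):
    """Copy a random segment of `donor` into a random position in `seq`."""
    length = min(length, len(donor))
    start = rng.randrange(len(donor) - length + 1)
    pos = rng.randrange(len(seq) + 1)
    return seq[:pos] + donor[start:start + length] + seq[pos:]


# Hypothetical mapping from edge labels to single-genome processes.
PROCESSES = {"substitution": substitute, "inversion": invert,
             "gain": gain, "loss": lose}


def evolve(root_name, root_seq, edges, seed=0):
    """Apply each edge's process to its parent genome to produce the child.

    `edges` is a list of (parent, child, process, params) tuples. For an
    "hgt" edge, params must include "donor", the name of a node whose
    genome has already been generated.
    """
    rng = random.Random(seed)  # seeded so the benchmark set is reproducible
    genomes = {root_name: root_seq}
    for parent, child, process, params in edges:
        params = dict(params)  # copy, so the caller's dict is not modified
        if process == "hgt":
            donor = genomes[params.pop("donor")]
            genomes[child] = hgt(genomes[parent], rng, donor, **params)
        else:
            genomes[child] = PROCESSES[process](genomes[parent], rng, **params)
    return genomes


if __name__ == "__main__":
    seed_rng = random.Random(42)
    root = "".join(seed_rng.choice(BASES) for _ in range(1000))
    edges = [
        ("root", "A", "substitution", {"rate": 0.05}),
        ("root", "B", "inversion", {"length": 100}),
        ("A", "C", "gain", {"length": 30}),
        ("B", "D", "hgt", {"donor": "A", "length": 40}),
        ("A", "E", "loss", {"length": 20}),
    ]
    for name, genome in evolve("root", root, edges, seed=1).items():
        print(name, len(genome))
```

Because the random number generator is seeded, the same network and seed always give the same set of genomes, which is what a benchmark dataset needs. A real implementation would probably read the network from a graph file and write each genome out as FASTA; both are left out here.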
This will generate a set of test genomes for pyani in which the evolutionary history of every "leaf node" sequence is known, so that output can be interpreted accordingly. The data can then be used to benchmark ANI, k-mer and other genome analyses.
pyani Version:
Planned for v0.3+
Potentially related tools:
https://github.com/soumyakundu/SaGePhy
https://github.com/xavierdidelot/ClonalFrameML
https://pubmed.ncbi.nlm.nih.gov/27713837/
https://academic.oup.com/bioinformatics/article/34/13/2308/4883490
Work on this has started on the `evolve` branch.