kalign
A fast multiple sequence alignment program.
Kalign
Kalign is a fast multiple sequence alignment program for biological sequences.
Installation
Release Tarball
Download the tarball from the releases page, then run:
tar -zxvf kalign-<version>.tar.gz
cd kalign-<version>
./autogen.sh
./configure
make
make check
make install
Homebrew
brew install brewsci/bio/kalign
Developer version
git clone https://github.com/TimoLassmann/kalign.git
cd kalign
./autogen.sh
./configure
make
make check
make install
On macOS, install Homebrew, then run:
brew install libtool
brew install automake
git clone https://github.com/TimoLassmann/kalign.git
cd kalign
./autogen.sh
./configure
make
make check
make install
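Whichever installation route was used, a quick sanity check is to confirm that the executable is reachable. The snippet below is a minimal sketch: it assumes the binary is installed under the name `kalign` and relies only on the `--version` option listed under Usage. The variable `kalign_found` is a hypothetical name used for illustration.

```shell
# Report whether a kalign executable is reachable on PATH.
if command -v kalign >/dev/null 2>&1; then
    kalign_found=yes
    # Print the installed version (option documented under Usage).
    kalign --version
else
    kalign_found=no
    echo "kalign not found on PATH; check where 'make install' placed it" >&2
fi
```

If the check fails after a successful `make install`, the install directory is probably not on `PATH`.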
Usage
Usage: kalign -i <seq file> -o <out aln>
Options:
--format : Output format. [Fasta]
--reformat : Reformat existing alignment. [NA]
--version : Print version and exit
Kalign expects the input to be either a set of unaligned sequences in FASTA format or a set of aligned sequences in aligned FASTA, MSF, or Clustal format. Kalign automatically detects whether the input sequences are protein, RNA, or DNA.
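As an illustration of the unaligned FASTA input described above, the following sketch writes a tiny input file and counts its records. The file names `example.fa` and `example.afa` and the three sequences are made up for this example; the alignment step runs only if a `kalign` executable is found on `PATH`.

```shell
# Write a tiny unaligned FASTA file (names and sequences are invented).
cat > example.fa <<'EOF'
>seq1
MKTAYIAKQRQISFVKSHFSRQ
>seq2
MKTAYIAKQRISFVKSHFSR
>seq3
MKAYIAKQRQISFVKHFSRQ
EOF

# Every FASTA record starts with a '>' header line, so count those.
n_seqs=$(grep -c '^>' example.fa)
echo "example.fa contains ${n_seqs} sequences"

# Align only if kalign is available; flags as in the usage line above.
if command -v kalign >/dev/null 2>&1; then
    kalign -i example.fa -f fasta -o example.afa
fi
```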
Since version 3.2.0, Kalign supports reading sequences from stdin and aligning sequences from multiple input files.
Examples
Passing sequences via stdin:
cat input.fa | kalign -f fasta > out.afa
Combining multiple input files:
kalign seqsA.fa seqsB.fa seqsC.fa -f fasta > combined.afa
Align sequences and output the alignment in MSF format:
kalign -i BB11001.tfa -f msf -o out.msf
Align sequences and output the alignment in clustal format:
kalign -i BB11001.tfa -f clu -o out.clu
Re-align sequences in an existing alignment:
kalign -i BB11001.msf -o out.afa
Reformat existing alignment:
kalign -i BB11001.msf -r afa -o out.afa
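The examples above can be combined into a small loop that writes the same alignment in several output formats. This is only a sketch: `out_name` is a hypothetical helper introduced here, the format names (`fasta`, `msf`, `clu`) are the ones used in the examples above, and the loop prints the command instead of running it when `kalign` or the input file is missing.

```shell
# Hypothetical helper: swap the extension of a file name.
out_name() {
    # $1 = input path, $2 = new extension
    printf '%s.%s\n' "${1%.*}" "$2"
}

input="BB11001.tfa"   # file name taken from the examples above

# Produce one alignment per output format.
for fmt in fasta msf clu; do
    out=$(out_name "$input" "$fmt")
    if command -v kalign >/dev/null 2>&1 && [ -f "$input" ]; then
        kalign -i "$input" -f "$fmt" -o "$out"
    else
        # Dry run: show what would be executed.
        echo "would run: kalign -i $input -f $fmt -o $out"
    fi
done
```

Note that this re-runs the alignment once per format.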
Benchmark results
Here are some benchmark results. The code to reproduce these figures can be found here.
Balibase
Bralibase
Homfam
Quantest2
Please cite:
- Lassmann, Timo. Kalign 3: multiple sequence alignment of large data sets. Bioinformatics (2019).
Other papers:
- Lassmann, Timo, Oliver Frings, and Erik LL Sonnhammer. Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features. Nucleic Acids Research 37.3 (2008): 858-865.
- Lassmann, Timo, and Erik LL Sonnhammer. Kalign: an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics 6.1 (2005): 298.