
The alignment length is always smaller than 10 kb

Open xiekunwhy opened this issue 1 year ago • 0 comments

Hi,

I use MMseqs2 to align genome sequences, but I found that the alignment length is always smaller than 10 kb.

Here are the steps describing how I use MMseqs2:

  1. Cut the genome sequences into pieces of 200,000 bp with a 10,000 bp overlap, and concatenate all pieces into one file, zja.chunks.fa.
  2. Create and index a database from zja.chunks.fa:
       mmseqs createdb zja.chunks.fa zja.chunks.db -v 2
       mmseqs createindex zja.chunks.db zja.chunks.db.tmp --search-type 3
  3. Align each piece against the database:
       mmseqs easy-search parts_zja200/Chr01-1-200000.fa zja.chunks.db Chr01-1-200000.m8 Chr01-1-200000.tmp -e 1e-10 -s 7.5 --min-seq-id 0.9 --filter-hits 1
       mmseqs easy-search parts_zja200/Chr01-190001-390000.fa zja.chunks.db Chr01-190001-390000.m8 Chr01-190001-390000.tmp -e 1e-10 -s 7.5 --min-seq-id 0.9 --filter-hits 1
       ......
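For reference, the chunking in step 1 can be sketched roughly as follows. This is only an illustration, not the exact script I used: the function names `chunk_coords` and `chunk_record` are made up for this example, and the `name-start-end` header format is inferred from the file names in step 3 (1-based, inclusive coordinates).

```python
def chunk_coords(seq_len, size=200_000, overlap=10_000):
    """Yield 1-based inclusive (start, end) windows covering seq_len.

    Consecutive windows share `overlap` bases, so the step between
    window starts is size - overlap (190,000 with the defaults).
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    start = 1
    while start <= seq_len:
        end = min(start + size - 1, seq_len)
        yield start, end
        if end == seq_len:
            break  # last window reached the end of the sequence
        start += step


def chunk_record(name, seq, size=200_000, overlap=10_000):
    """Return (header, subsequence) pairs for one sequence.

    The header format name-start-end is an assumption taken from
    file names such as Chr01-190001-390000.fa.
    """
    return [
        (f"{name}-{start}-{end}", seq[start - 1:end])
        for start, end in chunk_coords(len(seq), size, overlap)
    ]


if __name__ == "__main__":
    # Window coordinates for a hypothetical 450,000 bp chromosome.
    for start, end in chunk_coords(450_000):
        print(f"Chr01-{start}-{end}")
```

With the default values, the first two windows are 1-200000 and 190001-390000, which matches the piece names used in step 3.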

I found that, in all *.m8 files, the alignment length is always at most 10 kb (<= 10,000 bp), and the number of hits is lower than with blastn/pblast/hs-blastn.
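The observation above can be checked with a small script like the one below. It is a minimal sketch that assumes the default tab-separated .m8 layout (no custom `--format-output`), where the fourth column is the alignment length; `summarize_m8` is a hypothetical helper name, and the two example rows are invented purely to show the input shape.

```python
def summarize_m8(lines):
    """Return (number_of_hits, longest_alignment_length) for .m8 lines.

    Assumes the default column order: query, target, identity,
    alignment length, mismatches, gap opens, qstart, qend,
    tstart, tend, e-value, bit score.
    """
    n_hits = 0
    max_len = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        n_hits += 1
        max_len = max(max_len, int(fields[3]))  # column 4 = alignment length
    return n_hits, max_len


if __name__ == "__main__":
    # Invented rows, only to illustrate the expected input format.
    example = [
        "Chr01-1-200000\tChr02-1-200000\t0.950\t8200\t410\t0\t1\t8200\t501\t8700\t0\t12000\n",
        "Chr01-1-200000\tChr03-1-200000\t0.930\t9990\t699\t0\t100\t10089\t1\t9990\t0\t14000\n",
    ]
    hits, longest = summarize_m8(example)
    print(f"hits={hits} longest_alignment={longest}")
```

To run it on a real result, pass an open file handle, for example `summarize_m8(open("Chr01-1-200000.m8"))`.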

My questions are: how can I remove the limit on the maximum alignment length, and how should I set the MMseqs2 parameters so that the results are as similar to blastn as possible?


Best, Kun

xiekunwhy, Aug 09 '22 11:08