
problems with really short exons (10-6 bp)

Open HegedusB opened this issue 5 years ago • 4 comments

Dear all, I am surprised how well the program is working. However, I have a question: is it possible to fine-tune the aligner to detect smaller exons (10-6 bp)? I believe the word size is the limiting factor here. My fungal strain contains a lot of short exons. So far this is the best aligner, and it handles most of them correctly, but the really short exons are not aligned well here either. Thanks for any help!

HegedusB avatar Nov 05 '19 15:11 HegedusB

Thank you for giving Magic-BLAST a try. Unfortunately, very short exons are still a weak point of Magic-BLAST. We continue to improve it and expect that Magic-BLAST will get better at this in future versions.

In the meantime, you are correct that the word size needs to be reduced to 10 or 6 bases. However, using a word size below 16 requires turning off repeat filtering, so you also need to add the -limit_lookup F option. This will significantly increase run time and memory footprint.
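As a rough illustration of the rule above, here is a minimal Python sketch that assembles a magicblast command line and appends -limit_lookup F whenever the word size is below 16. The helper name build_magicblast_cmd and the file and database names reads.fa and fungal_genome are hypothetical placeholders, not part of Magic-BLAST. The sketch only builds the argument list and does not run anything; whether your release accepts a word size this small is worth checking against magicblast -help.

```python
import shlex

# Threshold taken from this thread: word sizes below 16 need repeat filtering off.
REPEAT_FILTER_MIN_WORD_SIZE = 16


def build_magicblast_cmd(query, db, word_size):
    """Return a magicblast argument list for the given word size (hypothetical helper)."""
    cmd = ["magicblast", "-query", query, "-db", db, "-word_size", str(word_size)]
    if word_size < REPEAT_FILTER_MIN_WORD_SIZE:
        # Turn off repeat filtering, as required for short word sizes.
        cmd += ["-limit_lookup", "F"]
    return cmd


if __name__ == "__main__":
    # reads.fa and fungal_genome are placeholder names for illustration only.
    print(shlex.join(build_magicblast_cmd("reads.fa", "fungal_genome", 10)))
```

The resulting list can be passed to subprocess.run as-is. Expect a noticeably longer run time and higher memory use than with the default word size.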

boratyng avatar Nov 06 '19 18:11 boratyng

Thank you for the great program. I am looking forward to the new releases.

HegedusB avatar Nov 07 '19 19:11 HegedusB

Can the latest release (v 1.6.0) turn off repeat filtering now?

y9c avatar Jun 16 '21 06:06 y9c

@yech1990, this behavior did not change in version 1.6.0. You can turn off repeat filtering with the -limit_lookup F option. This works for aligning to transcripts, a single gene or a few genes, or small genomes (e.g. bacteria). I would not recommend it for aligning to the human genome.

boratyng avatar Jun 16 '21 14:06 boratyng