"species not known" for some ambiguous species names

Open jebrosen opened this issue 3 years ago • 2 comments

Hi, I have the same "species not known" issue: it works for 'human', but not for 'drosophila' or 'anopheles'.

Manual installation
Repbase is not installed
Dfam release 3.4

./RepeatMasker -engine wublast -s drosophila my.fa
./RepeatMasker -engine wublast -s Drosophila my.fa

RepeatMasker version 4.1.2-p1
Search Engine: ABBlast/WUBlast [ 3.0 ]

Using Master RepeatMasker Database: /mycomputer/RepeatMasker/Libraries/RepeatMaskerLib.h5
  Title    : Dfam
  Version  : 3.4
  Date     : 2021-07-21
  Families : 281,951



Species "drosophila" is not known to RepeatMasker.  There may
not be any TE families defined in the libraries for this
species/clade or there may be an error in the spelling.
Please check your entry against the NCBI Taxonomy database
and/or try using a broader clade or related species instead.
The full list of species/clades defined in the library may be
obtained using the famdb.py script.

Originally posted by @RadPa in https://github.com/rmhubley/RepeatMasker/issues/122#issuecomment-895757308

jebrosen, Aug 11 '21 17:08

It looks like this is because there are multiple taxa with the same name (Drosophila is both a genus and a subgenus; Anopheles is a genus, a subgenus, and a series). RepeatMasker used to handle this fine, but apparently no longer does. To work around the problem, you can pass one of these more precise names to -species: drosophila_flies_genus, anopheles_genus.

In past versions, RepeatMasker used a built-in list of special species names, including both "drosophila" and "anopheles", to make sure those were always interpreted correctly. For this particular step, however, it looks like that list may no longer be used. @rmhubley do you know why this is going wrong? I remember updating Taxonomy.py to make sure it handled the synonyms whenever it invoked famdb.py, but maybe something else has changed and there is now a code path that no longer consults them when it should?
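
To make the suspected failure mode concrete, here is a minimal Python sketch of how a special-name list can disambiguate a name shared by several taxa. It is an illustration only, not RepeatMasker's or famdb.py's actual code: the toy taxon table, `SPECIAL_NAMES`, `resolve_strict`, and `resolve_with_special_names` are all hypothetical names, and the assumption that an ambiguous name is reported the same way as an unknown one is my reading of the behaviour described above.

```python
# Hypothetical sketch; none of these names come from RepeatMasker or famdb.py.

# Toy taxonomy as (rank, name) pairs, based on the ranks mentioned above.
TAXA = [
    ("genus", "Drosophila"),
    ("subgenus", "Drosophila"),
    ("genus", "Anopheles"),
    ("subgenus", "Anopheles"),
    ("series", "Anopheles"),
    ("genus", "Homo"),
]

# Assumed special-name list: for an ambiguous name, the rank that should win.
SPECIAL_NAMES = {"drosophila": "genus", "anopheles": "genus"}


class UnknownSpecies(Exception):
    """Raised when a name does not resolve to exactly one taxon."""


def matches(name):
    """Return every taxon whose name equals `name`, ignoring case."""
    key = name.lower()
    return [taxon for taxon in TAXA if taxon[1].lower() == key]


def resolve_strict(name):
    """Resolve a name without the special-name list.

    Ambiguous names fail here just like names that match nothing.
    """
    found = matches(name)
    if len(found) != 1:
        raise UnknownSpecies(f'Species "{name}" is not known ({len(found)} matches)')
    return found[0]


def resolve_with_special_names(name):
    """Resolve a name, consulting the special-name list to break ties."""
    found = matches(name)
    preferred_rank = SPECIAL_NAMES.get(name.lower())
    if preferred_rank is not None:
        # Keep only the taxon at the rank the special-name list prefers.
        found = [taxon for taxon in found if taxon[0] == preferred_rank]
    if len(found) != 1:
        raise UnknownSpecies(f'Species "{name}" is not known ({len(found)} matches)')
    return found[0]
```

In this sketch, `resolve_strict("drosophila")` raises `UnknownSpecies` because two taxa share the name, while `resolve_with_special_names("drosophila")` returns the genus. If some code path skips the special-name step, the symptom would look like the error reported above.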

jebrosen, Aug 11 '21 17:08

Worked, thank you.

RadPa, Aug 13 '21 06:08