
archaea and bacteria 16S duplicate

Open chloelulu opened this issue 5 years ago • 1 comment

Hi, developer,

Thanks for creating such efficient software. I have used it to find the 16S rRNA hits in my de novo assembled genome bins. I want to search for both archaea and bacteria, so I ran it separately with -k bac and -k arc. However, the results are confusing. For example, in one of the bins it found two 16S hits with the archaea setting and two hits with the bacteria setting.

The headers of the hits in the bacteria output are:

>16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-)

The headers of the hits in the archaea output are:

>16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-)

I then ran both FASTA outputs through the RDP classifier. The archaea hits were classified as:

16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";96%;"Bacteroidales";96%;"Rikenellaceae";38%;Mucinivorans;33%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-);+;Bacteria;99%;Firmicutes;70%;Clostridia;61%;Clostridiales;61%;Ruminococcaceae;43%;Hydrogenoanaerobacterium;14%

The bacteria hits were classified as:

16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";94%;"Bacteroidales";94%;"Rikenellaceae";34%;Mucinivorans;24%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-);+;Bacteria;99%;Firmicutes;78%;Clostridia;53%;Clostridiales;53%;Ruminococcaceae;40%;Hydrogenoanaerobacterium;14%

So my questions are:

(1) The bacteria and archaea results are the same; both are classified as Bacteria. Why are the hits reported under both kingdoms, bacteria and archaea?
(2) The two hits came from one genome bin. Why does one bin have two predicted 16S genes with different taxonomic classifications?

Thanks so much for your patience! Best.

chloelulu, Feb 11 '19 21:02

Yes, both bacteria and archaea share a 16S model from RFAM.

NAME  16S_rRNA
ACC   RF00177

Barrnap is designed for bacterial isolates. It was not designed to predict the kingdom of MAGs.

I do not know why running both FASTA hits through the RDP classifier gives different answers for the same (identical?) sequences.

tseemann, Oct 03 '19 05:10
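
To check whether the hits from the two runs are really "the same" sequences, one option is to compare the coordinates embedded in the FASTA headers. The following is a minimal sketch, not part of barrnap. The names `Hit`, `parse_header`, and `overlap_fraction` are hypothetical, and it assumes the headers follow the `GENE::CONTIG:START-END(STRAND)` layout quoted in the question, with coordinates treated as half-open intervals (an assumption about how the headers were generated).

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    """One rRNA hit, as encoded in a FASTA header (hypothetical helper type)."""
    gene: str
    contig: str
    start: int
    end: int
    strand: str


# Assumed header layout, taken from the headers quoted in the thread:
#   >GENE::CONTIG:START-END(STRAND)
HEADER_RE = re.compile(
    r"^>?(?P<gene>[^:]+)::(?P<contig>.+):(?P<start>\d+)-(?P<end>\d+)"
    r"\((?P<strand>[+-])\)$"
)


def parse_header(header: str) -> Hit:
    """Split a header into gene, contig, coordinates, and strand."""
    m = HEADER_RE.match(header.strip())
    if m is None:
        raise ValueError(f"unrecognised header: {header!r}")
    return Hit(m["gene"], m["contig"], int(m["start"]), int(m["end"]), m["strand"])


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared length divided by the length of the shorter hit.

    Returns 0.0 when the hits lie on different contigs or strands.
    Coordinates are treated as half-open intervals (assumption).
    """
    if a.contig != b.contig or a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    shorter = min(a.end - a.start, b.end - b.start)
    if shorter <= 0:
        return 0.0
    return max(shared, 0) / shorter


if __name__ == "__main__":
    bac = [
        ">16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-)",
        ">16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-)",
    ]
    arc = [
        ">16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-)",
        ">16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-)",
    ]
    for b, a in zip(bac, arc):
        hb, ha = parse_header(b), parse_header(a)
        print(hb.contig, overlap_fraction(hb, ha))
```

For the four headers in this thread, each archaea hit falls entirely inside the matching bacteria hit (overlap fraction 1.0), so the two runs report the same loci with slightly different boundaries; the sequences are nearly, but not exactly, identical.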