genomad
Question on fundamentals
Hi. I am trying to identify viral entities in a set of bacterial and archaeal genomes deposited in GTDB.
Running geNomad on the GTDB genomes recovered many viral sequences (not proviruses) from bacterial MAGs. I have two questions about this. Why are viral sequences (not proviruses) found in MAGs? Is there any relationship between these viral sequences and the MAG (e.g. a host-virus interaction)?
I have an additional question about using BLAST matches between CRISPR spacers and viral genomes for host prediction. Because a CRISPR-encoding genome carries several different spacer sequences, different spacers may match different viral genomes. Does this mean that "multiple viruses matched by the CRISPR spacers share a common host", or that "one bacterium can interact with multiple viruses"? How should I interpret it?
I'd say those are most likely misbinned contigs. It is well known that mobile genetic elements (MGEs) bin badly.
As for the second question, you can interpret that as multiple infection events by different phages. You need to be very careful with the spacer matches, though: the alignment needs to be perfect or close to perfect.
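To make the "perfect or close to perfect" criterion concrete, here is a minimal Python sketch of one way to filter spacer-versus-virus BLAST hits and then list which viruses each host genome matches. It rests on several assumptions that are not from this thread: BLAST was run with a tabular format that appends `qlen` to the 12 standard columns, spacer IDs follow a hypothetical `genome|spacer` naming scheme, and a hit counts as "close to perfect" when it covers the full spacer with no gaps and at most one mismatch. Adjust the threshold to whatever your analysis requires.

```python
import csv
import io
from collections import defaultdict

# Assumed column order, matching a BLAST+ run with:
#   -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen"
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore", "qlen"]


def filter_spacer_hits(handle, max_mismatches=1):
    """Yield hits that span the whole spacer, without gaps, with few mismatches.

    max_mismatches=1 is an illustrative threshold, not a recommended standard.
    """
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        # With no gaps, alignment length == spacer length means full coverage.
        full_length = int(row["length"]) == int(row["qlen"])
        no_gaps = int(row["gapopen"]) == 0
        few_mismatches = int(row["mismatch"]) <= max_mismatches
        if full_length and no_gaps and few_mismatches:
            yield row


def viruses_per_host(hits):
    """Group retained hits by host genome (hypothetical 'genome|spacer' IDs)."""
    hosts = defaultdict(set)
    for hit in hits:
        host = hit["qseqid"].split("|", 1)[0]
        hosts[host].add(hit["sseqid"])
    return dict(hosts)


# Made-up rows for illustration only; they are not real results.
EXAMPLE_ROWS = [
    ["genomeA|sp1", "virus1", "100.000", "32", "0", "0", "1", "32", "100", "131", "1e-10", "60.0", "32"],
    ["genomeA|sp2", "virus2", "96.875", "32", "1", "0", "1", "32", "500", "531", "1e-8", "55.0", "32"],
    ["genomeA|sp3", "virus3", "90.000", "20", "2", "0", "5", "24", "10", "29", "0.5", "25.0", "32"],
    ["genomeB|sp1", "virus1", "100.000", "30", "0", "0", "1", "30", "100", "129", "1e-9", "58.0", "30"],
]
EXAMPLE_TSV = "\n".join("\t".join(row) for row in EXAMPLE_ROWS) + "\n"

if __name__ == "__main__":
    hits = filter_spacer_hits(io.StringIO(EXAMPLE_TSV))
    for host, viruses in sorted(viruses_per_host(hits).items()):
        print(host, sorted(viruses))
```

In the toy data, the partial hit to `virus3` is dropped, `genomeA` keeps matches to two different viruses (consistent with multiple infection events), and `virus1` is matched by spacers from both genomes. Both patterns can appear in the same dataset, so they are not mutually exclusive readings.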