How to create a local (custom) database with a multifasta DNA sequence file in Prokka?

Open Felipedb02 opened this issue 2 years ago • 0 comments

Hi everyone, I'm trying to create a local (custom) database for Prokka from a multifasta file that contains DNA sequences of different bacterial genes. It looks like this:

>mecI:1:D86934
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATTA
>mecI:2:AB037671
TTAAAAAATTTTATTGTCTTTTTTACGATCTATAAATCCCTTTTTATACAATCTCGTTATAAGTGTACGAATGGTTTTTGGACTCCAGTCCTTTTGCATTTGTATTTCTTCTATTATATTATTCGCACTTGCATATTTTTCATCCAAATGATATTCATAACTTCCCATTCTGCAGATGATATTTCATACGTTTTATTATCCAT
>mecI:3:FJ670542
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATC
>mecI:4:FJ390057
ATGGATAATAAAACGTATGAAATATCATCTGCAGAATGGGAAGTTATGAATATCATTTGGATGAAAAAATATGCAAGTGCGAATAATATAATAGAAGAAATACAAATGCAAAAGGACTGGAGTCCAAAAACCATTCGTACACTTATAACGAGATTGTATAAAAAGGGATTTATAGATCGTAAAAAAGACAAT........

I've been trying to do this by first converting the initial DNA multifasta file into a protein multifasta file with the EMBOSS transeq tool, then creating a BLAST database and indexing it into the Prokka database directory, and finally running the typing process with this custom database against a bacterial genome to get the gbk output. However, I'm losing some sequence information in the EMBOSS transeq step, so I'd like to ask whether there is an easier way to do this. Maybe there is a way to set up a DNA database instead of a protein database?
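One possible cause of the lost information, judging only from the example above: some entries seem to be stored on the reverse strand (mecI:2 ends in ATTATCCAT, which is the reverse complement of the ATGGATAAT that starts mecI:4). If transeq is run on forward frame 1 only, those entries would not translate to the real protein. Below is a minimal Python sketch, not a tested pipeline, that tries all six frames per record and keeps the one with the fewest internal stop codons, preferring a translation that starts with M. The function names and the frame-selection heuristic are my own assumptions and are not part of Prokka or EMBOSS; it uses the standard genetic code, whereas bacterial table 11 also allows alternative start codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def best_translation(dna):
    """Heuristic (an assumption, not Prokka's method): among the six frames,
    pick the one with the fewest internal stops, preferring a leading M.
    Ties are resolved in favour of the forward strand, frame 1."""
    candidates = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            candidates.append(translate(strand[offset:]))
    best = min(
        candidates,
        key=lambda p: (p.rstrip("*").count("*"), 0 if p.startswith("M") else 1),
    )
    return best.rstrip("*")  # drop the terminal stop, if any


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif line and name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def dna_fasta_to_protein_fasta(text):
    """Return protein FASTA text with the original record ids as headers."""
    out = []
    for name, dna in read_fasta(text).items():
        out.append(f">{name}\n{best_translation(dna)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    with open("genes_dna.fasta") as handle:
        protein_fasta = dna_fasta_to_protein_fasta(handle.read())
    with open("genes_protein.fasta", "w") as handle:
        handle.write(protein_fasta)
```

The resulting protein FASTA could then be handed to Prokka through its `--proteins` option instead of being indexed by hand into the database directory. As far as I know Prokka expects a particular header layout to carry gene and product names from such a file, which this sketch does not attempt, so check the Prokka README for that. Records that are truncated (like the elided mecI:4 above) may still pick a wrong frame, so the output is worth inspecting before use.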

Thank you guys.

Felipedb02, Jan 15 '22 01:01