Can prokka annotate only from the --proteins file?

Open Zazgah opened this issue 3 years ago • 6 comments

Dear tseemann

I've been using prokka and it's an amazing tool. Thank you for your contribution to the scientific community.

I am wondering if I can use prokka to annotate ONLY from the file provided by --proteins [X].

In my current scenario I only want to annotate my proteins of interest, not the ones provided in the default databases. If I clear the default database, prokka still asks to generate it.

Thank you very much for your time,

Zazgah avatar Sep 16 '20 15:09 Zazgah

Hi @Zazgah

Did you come up with any solution? I have a similar task to do as well.

vappiah avatar Oct 09 '20 15:10 vappiah

Not yet.

I have just added a text tag to the proteins of interest (for example -ZazgahProteins-). Then, after annotation, I filtered for all the features containing that string.

Not the most elegant but it works!
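For anyone wanting to script the filtering step, here is a minimal sketch in Python. It assumes the tag ends up in the attributes column (for example in product=) of prokka's GFF output, and that the file has the usual ##FASTA section at the end. The function name filter_features is made up for illustration and is not part of prokka.

```python
def filter_features(gff_lines, tag):
    """Return the GFF feature lines whose attributes column contains `tag`."""
    kept = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequences follow this marker; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF directives/comments
        cols = line.split("\t")
        # A GFF3 feature line has 9 tab-separated columns; attributes are last.
        if len(cols) == 9 and tag in cols[8]:
            kept.append(line)
    return kept
```

With a real file this could be called as filter_features(open("annotation.gff"), "-ZazgahProteins-"), where annotation.gff is a placeholder for your own output file.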

Best,

Zazgah avatar Oct 09 '20 15:10 Zazgah

Thanks. I will go with this strategy as well.

Regards, Vincent

vappiah avatar Oct 09 '20 15:10 vappiah

What does your .faa file look like? I don't know which format the header should be in for prokka to actually annotate the CDS according to my .faa file.

avonm avatar Oct 21 '20 16:10 avonm

@avonm did you figure this out? I can only get --proteins to work with an faa downloaded directly from NCBI but not one that I've curated myself.

mikeyweigand avatar Nov 08 '22 22:11 mikeyweigand

@mikeyweigand I did get it to work, but I have now switched to bakta. The headers in the proteins file look like this:

ABV20485.1 2.7.2.4~~~thrA~~~aspartokinase/homoserine dehydrogenase I

AY513487.1 ~~~cosD~~~PCFO71 CosD adhesin~~~

The last header line is from my own database and those headers were manually formatted.
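If you are curating your own file, a small Python sketch like the one below may help keep the headers consistent. It assumes the field order seen in the headers above (EC number, gene, product, and an optional trailing field that I am calling cog here), separated by ~~~, and it assumes the usual leading > of a FASTA header. The helper names format_header and parse_header are hypothetical, not something prokka provides.

```python
SEP = "~~~"

def format_header(seq_id, product, gene="", ec="", cog=""):
    """Build a header line in the ID EC~~~gene~~~product~~~COG layout."""
    return f">{seq_id} {ec}{SEP}{gene}{SEP}{product}{SEP}{cog}"

def parse_header(line):
    """Split a header line back into its fields; missing fields become ''."""
    seq_id, _, rest = line.lstrip(">").strip().partition(" ")
    rest = rest.strip()
    if SEP not in rest:
        # Plain description with no ~~~ fields: treat it all as the product.
        fields = ["", "", rest, ""]
    else:
        fields = rest.split(SEP)
        fields += [""] * (4 - len(fields))  # pad when trailing fields are absent
    ec, gene, product, cog = fields[:4]
    return {"id": seq_id, "ec": ec, "gene": gene, "product": product, "cog": cog}
```

For example, format_header("AY513487.1", "PCFO71 CosD adhesin", gene="cosD") gives the second header above with a > in front.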

Hope this helps!

avonm avatar Nov 09 '22 08:11 avonm