
Request: add CodingQuarry Pathogen Mode for post-fungal gene annotation, 2nd round RIP/effector-like gene predictions

Open JamesHane opened this issue 1 year ago • 1 comments

Hi,

It is very nice to see our tool CodingQuarry (https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1344-4) has been incorporated into the fungal pipeline.

I would like to request that the pathogen mode of this same tool (CodingQuarry Pathogen Mode, CQPM) also be incorporated. CQPM is described here (https://espace.curtin.edu.au/bitstream/handle/20.500.11937/1767/246572_Testa%202016.pdf?sequence=2&isAllowed=y). CQPM is intended to be run in a 2nd round of gene annotation, as it was designed to predict: a) effector-like genes, and/or b) genes with abnormal GC-content/codon usage, as would be expected to reside in repetitive, AT-rich, heavily RIP-mutated (repeat-induced point mutation) genome regions.

It does this by using two gene models: a primary model that represents the majority of core, conserved genes, and a secondary model that has different properties from the first but is also trained on transcriptome-supported alignments. The secondary model is applied to the genome regions in between the first round of gene predictions, and the resulting CQPM predictions can be merged into a final set of annotations.
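To illustrate the "regions in between" and "merge" steps described above, here is a minimal Python sketch. It is not CodingQuarry or funannotate code: the function names `intergenic_regions` and `merge_predictions` are hypothetical, gene models are reduced to simple `(start, end)` intervals on a single contig, and the rule of keeping only second-round predictions that do not overlap a first-round model is an assumption made for illustration.

```python
from typing import List, Tuple

# (start, end) on one contig, 1-based inclusive. A simplification: real gene
# models also carry strand, exon structure, and contig names.
Interval = Tuple[int, int]


def intergenic_regions(genes: List[Interval], contig_length: int) -> List[Interval]:
    """Return the gaps between first-round gene models on one contig."""
    gaps: List[Interval] = []
    cursor = 1  # first position not yet covered by a gene
    for start, end in sorted(genes):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= contig_length:
        gaps.append((cursor, contig_length))
    return gaps


def merge_predictions(first_round: List[Interval],
                      second_round: List[Interval]) -> List[Interval]:
    """Keep every first-round model and add second-round models that do not
    overlap any of them (an assumed merge rule, for illustration only)."""
    def overlaps(a: Interval, b: Interval) -> bool:
        return a[0] <= b[1] and b[0] <= a[1]

    rescued = [g for g in second_round
               if not any(overlaps(g, f) for f in first_round)]
    return sorted(first_round + rescued)


if __name__ == "__main__":
    first = [(100, 200), (150, 300), (500, 600)]
    second = [(250, 350), (350, 450), (700, 800)]
    print(intergenic_regions(first, contig_length=1000))
    print(merge_predictions(first, second))
```

In a real pipeline the same idea would operate on GFF3 records per contig and strand rather than on bare intervals.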

The thesis linked above demonstrated that CQPM can "rescue" abnormal effector genes that other gene prediction methods miss. For the many fungi that have RIP, and for the many pathogens in which we seek to predict effector/pathogenicity genes, CQPM is therefore an important tool.

We currently use CQPM in our internal pipelines, but given the popularity and ease of use of funannotate, we would be keen to see CQPM incorporated into an updated version.

Thanks for your time and efforts,
James Hane
Centre for Crop and Disease Management
Curtin University, Australia

JamesHane avatar May 03 '23 07:05 JamesHane

It would help if you could provide the command-line options you are asking to be generated. I think we explored this at one point, but I am not sure whether there was a problem. Is there a test example of a genome that this could be applied to, so we can confirm how the parameter changes produce improved results?

hyphaltip avatar Jun 22 '23 01:06 hyphaltip