bgcflow
Feat: Pangenome graph module
This is a request for a new rule that uses PPanGGOLiN to create pangenome graphs for the genome projects.
I think we should use the Roary-defined gene families as input to ppanggolin. This would make sure the gene family IDs are consistent with EggNOG and other annotations.
This can be achieved by providing your own gene families to ppanggolin instead of letting it cluster the genes itself.
Step 1: Use gff annotations just like the roary input
ppanggolin annotate --anno ORGANISM_ANNOTATION_LIST
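A minimal sketch of how ORGANISM_ANNOTATION_LIST could be built, assuming it is a tab-separated file with one line per genome (genome name, then the path to its GFF) and that the GFF files sit in a single directory. The function name `write_annotation_list` and the use of the file stem as genome name are illustrative assumptions, not part of bgcflow.

```python
from pathlib import Path


def write_annotation_list(gff_dir, out_path):
    """Write one 'genome_name<TAB>gff_path' line per *.gff file in gff_dir.

    Assumption: the genome name is taken from the file name without its
    extension. Returns the number of genomes written.
    """
    gff_files = sorted(Path(gff_dir).glob("*.gff"))
    with open(out_path, "w") as out:
        for gff in gff_files:
            out.write(f"{gff.stem}\t{gff}\n")
    return len(gff_files)
```

For example, `write_annotation_list("gff/", "organism_annotation_list.tsv")` would produce the file passed to `--anno` (both paths here are placeholders).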
Step 2: Provide gene families
ppanggolin cluster -p pangenome.h5 --clusters MY_CLUSTERS_FILE
MY_CLUSTERS_FILE should be created from the Roary output.
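One possible way to create the clusters file is sketched below. It assumes Roary's `clustered_proteins` output, where each line looks like `family_name: gene1<TAB>gene2...`, and assumes ppanggolin expects a tab-separated file with one gene per line (family ID first, gene ID second). The function name `roary_to_ppanggolin_clusters` is hypothetical, and the gene IDs must match the IDs in the GFF files given to `ppanggolin annotate`.

```python
def roary_to_ppanggolin_clusters(roary_path, out_path):
    """Convert Roary 'clustered_proteins' into a two-column clusters TSV.

    Input line (assumed):  'family_name: gene1<TAB>gene2<TAB>...'
    Output lines:          'family_name<TAB>gene' (one per gene)
    Returns the number of genes written.
    """
    n_genes = 0
    with open(roary_path) as src, open(out_path, "w") as out:
        for line in src:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            # Split the family name from its members at the first ': '.
            family, _, members = line.partition(": ")
            for gene in members.split():
                out.write(f"{family}\t{gene}\n")
                n_genes += 1
    return n_genes
```

The resulting file can then be passed as MY_CLUSTERS_FILE to `ppanggolin cluster --clusters`.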
An experimental version of this feature is now available in: https://github.com/NBChub/bgcflow/tree/ppanggolin3
Usage:
- check out the experimental branch:
git checkout ppanggolin3
- set ppanggolin to TRUE in the project config file
- run the workflow using:
bgcflow run --snakefile workflow/Ppanggolin -n
Thanks. I will try this and let you know.