
Feat: Pangenome graph module

Open OmkarSaMo opened this issue 1 year ago • 2 comments

This is a request for a new rule that uses PPanGGOLiN to create pangenome graphs for the genome projects.

I think we should use the Roary-defined gene families as input to PPanGGOLiN. This would make sure the gene family IDs are consistent with EggNOG and other annotations.

PPanGGOLiN supports this: you can provide your own gene families instead of letting it cluster the genes itself.

Step 1: Annotate from GFF files, the same ones used as Roary input

ppanggolin annotate --anno ORGANISM_ANNOTATION_LIST

Step 2: Provide gene families

ppanggolin cluster -p pangenome.h5 --clusters MY_CLUSTERS_FILE

MY_CLUSTERS_FILE should be created from the Roary output.
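A rough sketch of how that conversion might look, in case it helps the discussion. It assumes Roary's `clustered_proteins` output has one family per line in the form `family_name: gene1<TAB>gene2 ...`, and that PPanGGOLiN accepts a two-column TSV of family ID and gene ID for `--clusters`; both should be checked against the installed versions. The function names are made up for illustration and are not part of bgcflow.

```python
import csv
import io


def roary_to_cluster_pairs(lines):
    """Yield (family_id, gene_id) pairs from Roary clustered_proteins lines.

    Assumed input format (one family per line):
        family_name: gene_id_1<TAB>gene_id_2 ...
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before the first colon is taken as the family name.
        family, _, members = line.partition(":")
        for gene in members.split():
            yield family.strip(), gene


def write_clusters(pairs, handle):
    """Write pairs as a two-column TSV (family ID, gene ID)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)


if __name__ == "__main__":
    # Tiny in-memory example instead of a real Roary file.
    example = ["dnaA: g1_00001\tg2_00001", "group_7: g1_00042"]
    out = io.StringIO()
    write_clusters(roary_to_cluster_pairs(example), out)
    print(out.getvalue(), end="")
```

The gene IDs in the resulting file would have to match the IDs in the GFF files passed to `ppanggolin annotate`, otherwise the families cannot be mapped onto the annotated genes.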

OmkarSaMo, Jul 07 '23 15:07

An experimental version of this feature is now available in: https://github.com/NBChub/bgcflow/tree/ppanggolin3

Usage:

  • check out the experimental branch:
git checkout ppanggolin3
  • set ppanggolin to TRUE in the project config file
  • do a dry run of the workflow (-n); drop -n to actually run it:
bgcflow run --snakefile workflow/Ppanggolin -n

matinnuhamunada, Jul 12 '23 12:07

Thanks. I will try this and let you know.

OmkarSaMo, Jul 13 '23 12:07