
Regions of Genome Plasticity

Open antoniocamillo opened this issue 2 years ago • 1 comment

Hello everyone

First, I want to thank you for developing PPanGGOLiN; it is a great tool for exploring genomic data.

My question is about the prediction of regions of genome plasticity.

I'm looking for these regions in some contigs, but every predicted region (start and end) spans the entire contig.

For comparison, I ran the prediction with other tools and got several regions per contig rather than a single one, so I'd like to know whether I made a mistake.

Am I making a mistake in the parsing method? Is there a parameter that is missing? If possible I would like help to understand what I'm doing wrong.

I'm attaching the results I got, along with the file of contigs I used for testing ... for you to analyze, if possible.

Thank you for your attention, best regards.

test_plastic_regions.zip

antoniocamillo avatar Aug 11 '22 02:08 antoniocamillo

Hi,

Sorry for the delayed response; I lacked internet access for a few days.

I'm somewhat confused by the FASTA files you included in the .zip: some are extremely small (two have 13k nt, one has 62k nt), while others have sizes expected for bacterial genomes (4-5M nt). Among those with expected sizes, some genomes have an extremely high number of contigs (1700 or 1200 is a massive number for a bacterial genome assembly).

If that is all expected on your end, panRGP will likely fail on such datasets with so few genomes. It can cope with highly fragmented and/or contaminated datasets to an extent, but it needs more samples to work, I'm afraid.
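For anyone who wants to run the same sanity check on their own input, here is a minimal sketch that reports the contig count and total nucleotide count for each FASTA file. It is not part of PPanGGOLiN; the function name `fasta_stats` and the command-line usage are assumptions made for illustration, and it only assumes plain, uncompressed FASTA files.

```python
import sys


def fasta_stats(path):
    """Return (number_of_contigs, total_nucleotides) for one plain-text FASTA file.

    Hypothetical helper for illustration; not a PPanGGOLiN function.
    """
    n_contigs = 0
    total_nt = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_contigs += 1  # each header line starts a new contig
            else:
                total_nt += len(line)  # sequence line: count its characters
    return n_contigs, total_nt


if __name__ == "__main__":
    # Usage (assumed): python fasta_stats.py genome1.fasta genome2.fasta ...
    for fasta_path in sys.argv[1:]:
        contigs, nt = fasta_stats(fasta_path)
        print(f"{fasta_path}\t{contigs} contigs\t{nt} nt")
```

Files that are far smaller than a typical bacterial genome, or that are split into a very large number of contigs, stand out immediately in this output.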

Adelme

axbazin avatar Aug 16 '22 07:08 axbazin

Closing the issue, as I'm assuming this is resolved.

If it's not, or if you have other questions related to this, don't hesitate to reopen the issue :)

axbazin avatar Feb 15 '23 16:02 axbazin