PPanGGOLiN
Clarification about the contents of `gene_to_gene_family.tsv` from projection
I have been running `projection` on a reconstructed pangenome and a set of assembly FASTA files for input genomes, in order to assign each gene of each input genome to a gene family in the pangenome.
I tried consulting the documentation about the output of `projection`, but the link doesn't seem to go anywhere (https://github.com/labgem/PPanGGOLiN/blob/f3ba6a1f33256f19175b570c4b711bb8970d0365/docs/user/projection.md).
The documentation states that `gene_to_gene_family.tsv` "provides the mapping of genes to gene families of the pangenome." I was expecting one line per gene of an input genome, with each line indicating the gene family in the reconstructed pangenome that the gene is assigned to. But this isn't what I got. Instead, I got files with hundreds of thousands of lines, even though each input genome contains only 2.5k to 2.9k genes.
Any clarifications would be much appreciated. Thank you in advance.
Hi,
The "projection" documentation about its output files is here: https://ppanggolin.readthedocs.io/en/latest/user/projection.html#output-files
However, you are right that the current behavior is not the one that was intended, and I see where the bug is. Currently, the "gene_to_gene_family.tsv" file contains this information for ALL given input genomes, and not just the single input genome. The file is likely equal between the different "input genome" output directories. We'll get a fix for this in the upcoming version.
Thank you very much for the bug report.
Adelme
Thank you for the explanation. For a few input genomes, I checked whether "the file is likely equal between the different 'input genome' output directories", but that didn't seem to be the case. I look forward to the updated version. Thank you.
Also, the broken link I was referring to is https://github.com/labgem/PPanGGOLiN/blob/f3ba6a1f33256f19175b570c4b711bb8970d0365/docs/user/Outputs.md#gene-families-and-genes, which doesn't seem to exist anymore. It is linked from https://github.com/labgem/PPanGGOLiN/blob/f3ba6a1f33256f19175b570c4b711bb8970d0365/docs/user/projection.md
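For anyone who wants to repeat the comparison between output directories mentioned above, here is a minimal sketch of how it could be done. It is not part of PPanGGOLiN. It assumes only that each input genome's output directory sits somewhere under one projection output directory and contains a file named `gene_to_gene_family.tsv`. The function name `summarize_mapping_files` and the default directory name `projection_output` are made up for illustration. The script groups the files by content hash and reports their line counts, without assuming anything about the column layout.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path


def summarize_mapping_files(projection_dir, filename="gene_to_gene_family.tsv"):
    """Group every `filename` found under `projection_dir` by content hash.

    Returns a dict mapping a SHA-256 hex digest to a list of
    (path, line_count) tuples. Files that share a digest are byte-identical.
    The directory layout is an assumption; adjust it to your own run.
    """
    groups = defaultdict(list)
    for path in sorted(Path(projection_dir).rglob(filename)):
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append((str(path), len(data.splitlines())))
    return dict(groups)


if __name__ == "__main__":
    # "projection_output" is a hypothetical default; pass your own path.
    out_dir = sys.argv[1] if len(sys.argv) > 1 else "projection_output"
    for digest, files in summarize_mapping_files(out_dir).items():
        print(f"{digest[:12]}  {len(files)} file(s)")
        for path, n_lines in files:
            print(f"    {n_lines:>8} lines  {path}")
```

If every file ends up in the same group, the files are byte-identical across input genomes. Several groups mean the contents differ, which is what I observed for the genomes I checked.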
Alright, thank you for the additional input. Indeed, I misunderstood what you meant; I see the broken link now! Will fix this as well.
The fix for this issue has been released in v2.1.0.