
[FEATURE REQUEST] Report contig circularity

Open hdore opened this issue 9 months ago • 4 comments

The need

Hi Anvi'o! I was reading through the methods of this paper "Diverse plasmid systems and their ecology across human gut metagenomes revealed by PlasX and MobMess" (https://doi.org/10.1038/s41564-024-01610-3) and thought that the contig circularity test could be a function in Anvi'o. It would be useful for people wishing to check the circularity of any element, from viruses to plasmids to complete genomes.

The solution

I was thinking of something similar to anvi-report-inversions, maybe using anvi-profile --fetch-filter to get the REV/FWD read pairs and then a function to keep only the pairs with a sufficiently long insert size.
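To make the idea concrete, here is a minimal sketch of the pair-level test, written in plain Python rather than against anvi'o or a BAM library. The `ReadPair` record, the `supports_circularity` function, and the `edge_window` default are all hypothetical names and values chosen for illustration. The assumption is that on a linearized circular contig, a pair spanning the junction shows up as REV/FWD, with the reverse-strand mate near the contig start and the forward-strand mate near the contig end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Mapped coordinates (0-based, end-exclusive) of one read pair on one contig."""
    fwd_start: int  # leftmost position of the mate mapped on the forward strand
    fwd_end: int
    rev_start: int  # leftmost position of the mate mapped on the reverse strand
    rev_end: int


def supports_circularity(pair, contig_length, edge_window=500):
    """Return True for a REV/FWD pair whose mates sit at opposite contig ends.

    edge_window is an illustrative threshold (assumption), not an anvi'o default.
    """
    rev_before_fwd = pair.rev_start < pair.fwd_start
    rev_near_start = pair.rev_end <= edge_window
    fwd_near_end = pair.fwd_start >= contig_length - edge_window
    return rev_before_fwd and rev_near_start and fwd_near_end


pairs = [
    ReadPair(fwd_start=1000, fwd_end=1150, rev_start=1300, rev_end=1450),  # ordinary FWD/REV pair
    ReadPair(fwd_start=9800, fwd_end=9950, rev_start=20, rev_end=170),     # spans the junction
    ReadPair(fwd_start=5000, fwd_end=5150, rev_start=4000, rev_end=4150),  # REV/FWD but mid-contig
]
supporting = [p for p in pairs if supports_circularity(p, contig_length=10_000)]
```

Only the second pair passes here. A real implementation would read the orientation and positions from the BAM records and would likely need to account for the library's actual insert size distribution.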

On Discord, here is what @meren suggested:

It was always a plan to include something in the anvi'o profiler so it also tests for those REV/FWD read pairs or paired-end reads with exceptionally long inserts to see if there is enrichment for them in contig extremities. Everything is in place, it just requires some additional work that we never had the time to do.

I imagine a new class that takes in the key data (contigs-db, BAM file(s)) and returns a dictionary that reports circular contigs (i.e., {'contig_id_01': {'num_distant_pairs': INT, 'coverage': FLOAT, 'cov_ratio': FLOAT}, 'contig_id_02': {...}, ...}, where cov_ratio compares the average coverage of the circularity-supporting reads to the q2q3 coverage of the entire contig from regularly mapped reads).
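A rough sketch of how one entry of such a dictionary could be assembled is given below. It is not anvi'o code: `q2q3_mean` and `summarize_contig` are hypothetical helpers, the q2q3 coverage is approximated as the mean of the values left after dropping the lowest and highest quarters, and 'coverage' is assumed to mean the average coverage contributed by the circularity-supporting reads.

```python
def q2q3_mean(values):
    """Mean of the middle two quartiles (assumed approximation of q2q3 coverage)."""
    if not values:
        return 0.0
    ordered = sorted(values)
    quarter = len(ordered) // 4
    middle = ordered[quarter:len(ordered) - quarter]
    return sum(middle) / len(middle)


def summarize_contig(num_distant_pairs, supporting_coverage, per_base_coverage):
    """Build one report entry using the keys proposed above."""
    background = q2q3_mean(per_base_coverage)
    return {
        "num_distant_pairs": num_distant_pairs,
        "coverage": supporting_coverage,
        # Ratio of supporting-read coverage to the contig's q2q3 coverage.
        "cov_ratio": supporting_coverage / background if background > 0 else 0.0,
    }


# Toy per-base coverage with one dropout and one spike that q2q3 ignores.
report = {
    "contig_id_01": summarize_contig(12, 8.0, [0, 38, 40, 40, 40, 42, 40, 200]),
}
```

Using the q2q3 value as the denominator keeps a single coverage spike or dropout from skewing the ratio, which a plain mean would not.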

A question is how to report these data or where to store it. Here are a few alternatives I can think of:

  • A standalone program: Something like anvi-report-circular-contigs that takes in a contigs-db and one or more BAM files, and reports a TAB-delimited file with relevant columns that can be easily imported into any anvi'o database as a collection for downstream analyses.

  • A function in the anvi'o profiler that is called when anvi-profile is run with a flag (i.e., --test-circularity), or by default unless it is run with a flag (i.e., --skip-test-circularity). Its results would update a table in the contigs-db being profiled, so that as additional samples are profiled using the same contigs-db, any newly observed circularity signal updates the table of 'circular contigs'. In this case the circular contigs would be a part of the anvi-summarize output, could be exported by anvi-export-table as a TAB-delimited file, and/or could be displayed in the interactive interface as a new layer whenever applicable (i.e., 'Circularity').

  • Both 🙂 Once there is a class in anvi'o, it can be used in multiple ways of course to do both of these.
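For the standalone option, turning the dictionary into a TAB-delimited report needs little more than the standard library. The sketch below is only an illustration: `write_circularity_report` and the column names are hypothetical and are not meant to match the format of any existing anvi'o artifact.

```python
import csv
import io


def write_circularity_report(report, handle):
    """Write one TAB-delimited row per contig, sorted by contig name."""
    columns = ["contig", "num_distant_pairs", "coverage", "cov_ratio"]  # assumed column names
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for contig_id in sorted(report):
        entry = report[contig_id]
        writer.writerow([contig_id] + [entry[c] for c in columns[1:]])


example_report = {
    "contig_id_01": {"num_distant_pairs": 12, "coverage": 8.0, "cov_ratio": 0.2},
}

# An in-memory buffer stands in for an output file here.
buffer = io.StringIO()
write_circularity_report(example_report, buffer)
tsv_text = buffer.getvalue()
```

Because the writer only needs a file-like handle, the same function could serve a command-line program or a call from within the profiler.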

Beneficiaries

Anyone wanting to check the circularity of their contigs. This feature request was triggered by the paper using that information to organize plasmids with MobMess, but it would also be useful for checking the circularity of other genetic elements such as viruses or even whole bacterial genomes, since long reads should allow us to get more and more circular, complete genomes.

hdore avatar Mar 10 '25 15:03 hdore

I think this was addressed by @meren in https://github.com/merenlab/anvio/pull/2497. If that is correct, then @meren or @hdore should feel free to close this issue :)

ivagljiva avatar Nov 28 '25 09:11 ivagljiva

Thanks, Iva! I am sorry for not tagging this request.

The program does not fully address the need described here since it is not a part of anvi-profile. The class is written in a way that it could easily integrate into anvi-profile and all the downstream tables, but nothing has been done about it yet.

But perhaps we can still close this particular issue, saying that https://anvio.org/help/main/programs/anvi-report-circularity/ and https://anvio.org/help/main/artifacts/contig-circularity-report-txt/ are good starting points, and work out the best integration path in another issue.

meren avatar Nov 28 '25 09:11 meren

Ah, I see. :) In my opinion, for partially-addressed cases like this we should keep the issue open so we don't forget to come back and do the integration later. Perhaps we could update the title with "(partially-solved)" or something so that we know it has been looked at and something done about it, but there is more to come?

ivagljiva avatar Nov 28 '25 09:11 ivagljiva

Hi @ivagljiva and @meren !

Thanks @meren for adding this feature to anvi'o. As far as I'm concerned, I'm fine with the stand-alone program, but I'll let you decide if you want to keep the issue open :)

I'll try this new program soon(ish)!

All the best

hdore

hdore avatar Nov 28 '25 10:11 hdore