Clarification on --disentangle-df: How is the depth factor used in filtering?

Open microbpro-cat opened this issue 6 months ago • 0 comments

First check

  • [✅] I used the GitHub search to find a similar issue or discussion and didn't find it.
  • [✅] I searched GetOrganelle.wiki context, especially the FAQ and browsed the examples to confirm it is unexpected to happen.
  • [✅] I have updated GetOrganelle to the latest released version (GitHub release).

Please ask questions in the Question category of GitHub Discussions unless this is a feature request or bug report.

Hi, I'm currently using GetOrganelle to assemble a parasitic flatworm mitochondrial genome using a customized label database. I would like to ask about the internal implementation and role of the --disentangle-df option.

I understand from the paper that GetOrganelle filters out contigs with coverage values that deviate significantly from the target-anchor contigs. However, in the source code (get_organelle_from_reads.py, assembly_parser.py), I saw that --disentangle-df is passed as hard_cov_threshold or min_cov_folds, and used to remove low-coverage contigs.

My questions are:

How exactly is --disentangle-df applied in filtering? (e.g., is it based on the coverage of the top-weighted anchor contig?)

Is the filtering applied to both low-coverage and high-coverage outliers?

Is this value used in any way during Gaussian Mixture Model steps?

For animal mitogenomes (especially parasitic species), do you recommend adjusting this parameter to a lower value (e.g., 3 to 5)?
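
The first two questions can be made concrete with a minimal sketch. The function below is hypothetical and is not GetOrganelle's code: the name filter_by_depth_factor, the choice of a single anchor contig as the baseline, and the two_sided switch are all assumptions used only to illustrate the two behaviours being asked about (removing only low-coverage contigs versus removing outliers on both sides).

```python
from typing import Dict, List


def filter_by_depth_factor(
    coverages: Dict[str, float],
    anchor: str,
    depth_factor: float,
    two_sided: bool = False,
) -> List[str]:
    """Hypothetical illustration only, NOT GetOrganelle's implementation.

    Keeps contigs whose coverage is at least anchor_cov / depth_factor.
    If two_sided is True, also drops contigs above anchor_cov * depth_factor.
    """
    base = coverages[anchor]          # assumed baseline: one anchor contig
    lower = base / depth_factor       # low-coverage cutoff
    upper = base * depth_factor       # high-coverage cutoff (two-sided only)
    kept = []
    for name, cov in coverages.items():
        if cov < lower:
            continue                  # low-coverage outlier
        if two_sided and cov > upper:
            continue                  # high-coverage outlier
        kept.append(name)
    return kept


# Made-up coverage values for illustration.
contigs = {"anchor": 100.0, "low": 5.0, "mid": 40.0, "high": 2000.0}
one_sided = filter_by_depth_factor(contigs, "anchor", 10.0)
two_sided = filter_by_depth_factor(contigs, "anchor", 10.0, two_sided=True)
print(one_sided)
print(two_sided)
```

Under this toy model a smaller depth factor narrows the accepted coverage window, which is why the distinction between one-sided and two-sided filtering matters when choosing a value.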

Thanks for your excellent tool, and I’d appreciate any clarification!

microbpro-cat, Jun 20 '25 05:06