RagTag
Speed RagTag up a few orders of magnitude on fragmented assemblies with caching
I noticed computing the line numbers of AGP files and objects was ~99% (!) of the runtime on our very fragmented contig assemblies.
This patch pre-computes the line numbers; it dropped runtime from 16 hours to 8 minutes, the vast majority of which was spent computing the AGP file to write out.
I have checked that this produces identical output on our data, but it probably needs more extensive testing.
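For reviewers, here is a minimal sketch of the general idea of caching line counts instead of recounting them on every access. It is an illustration only: `ToyAGP`, `ToyObject`, `add_line`, and `recount_lines` are hypothetical names and do not reflect the actual code in `ragtag_utilities/AGPFile.py`.

```python
class ToyObject:
    """Hypothetical stand-in for one AGP object and its lines."""

    def __init__(self, name):
        self.name = name
        self.lines = []
        self.num_lines = 0  # cached count, kept in sync on every append

    def add_line(self, line):
        self.lines.append(line)
        self.num_lines += 1


class ToyAGP:
    """Hypothetical stand-in for an AGP file holding many objects."""

    def __init__(self):
        self.objects = {}
        self.num_lines = 0  # cached file-wide count

    def add_line(self, obj_name, line):
        if obj_name not in self.objects:
            self.objects[obj_name] = ToyObject(obj_name)
        self.objects[obj_name].add_line(line)
        self.num_lines += 1  # O(1) update instead of a full recount

    def recount_lines(self):
        # The slow path: walks every object each time it is called.
        return sum(len(obj.lines) for obj in self.objects.values())


# Build a small, fragmented toy assembly and compare both counts.
agp = ToyAGP()
for i in range(1000):
    agp.add_line(f"scf{i % 10}", f"ctg{i}")

assert agp.num_lines == agp.recount_lines()
```

Calling something like `recount_lines()` once per added line makes the total work quadratic in the number of lines, which is where a highly fragmented assembly would hurt; reading the cached counter keeps each access constant time.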
This is amazing! @Mike would you be able to review and integrate?
Thanks so much for your contribution!
Mike
On Wed, Feb 14, 2024 at 9:51 AM Travis Wrightsman @.***> wrote:
You can view, comment on, or merge this pull request online at:
https://github.com/malonge/RagTag/pull/178
Commit Summary
- 4b43afa https://github.com/malonge/RagTag/pull/178/commits/4b43afa019fcd5d2610dc8de4067d33d5aa5d314 Cache AGP file and object line counts
File Changes
(1 file https://github.com/malonge/RagTag/pull/178/files)
- M ragtag_utilities/AGPFile.py https://github.com/malonge/RagTag/pull/178/files#diff-52e0b172693fb269551599232c875d33a2a928e468d3846a65083dd0d552febb (16)
Patch Links:
- https://github.com/malonge/RagTag/pull/178.patch
- https://github.com/malonge/RagTag/pull/178.diff