Speed RagTag up a few orders of magnitude on fragmented assemblies with caching

Open twrightsman opened this issue 1 year ago • 1 comment

I noticed that computing the line numbers of AGP files and objects accounted for ~99% (!) of the runtime on our very fragmented contig assemblies.

This patch pre-computes the line numbers, which dropped the runtime from 16 hours to 8 minutes; the vast majority of those 8 minutes is spent computing the AGP file to write out.

I have checked that this produces identical output on our data, but it probably needs more extensive testing.
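
To illustrate the general idea, here is a minimal sketch and not the actual RagTag code. The class names `NaiveAgp` and `CachedAgp`, the `add_line` method, and the `build` helper are all hypothetical. The sketch assumes the slow path recounts every stored line each time a line number is requested, which becomes quadratic when an assembly has one object per contig, while the cached path keeps a running counter.

```python
class NaiveAgp:
    """Hypothetical container: recounts all lines on every request."""

    def __init__(self):
        self._objects = {}  # object name -> list of AGP lines

    def add_line(self, obj, line):
        self._objects.setdefault(obj, []).append(line)

    @property
    def num_lines(self):
        # Walks every object each time: O(number of objects) per call.
        return sum(len(lines) for lines in self._objects.values())


class CachedAgp:
    """Hypothetical container: keeps a running line count instead."""

    def __init__(self):
        self._objects = {}
        self._num_lines = 0  # pre-computed, updated on every insert

    def add_line(self, obj, line):
        self._objects.setdefault(obj, []).append(line)
        self._num_lines += 1

    @property
    def num_lines(self):
        # O(1): no traversal needed.
        return self._num_lines


def build(agp_cls, n_contigs):
    """Add one single-line object per contig, reading the line number each time."""
    agp = agp_cls()
    seen = []
    for i in range(n_contigs):
        name = f"contig_{i}"
        agp.add_line(name, f"{name}\t1\t100\t1\tW\tseq_{i}\t1\t100\t+")
        seen.append(agp.num_lines)  # e.g. used in an error message or check
    return seen
```

Both classes report the same line numbers; only the cost differs. With `n_contigs` in the hundreds of thousands, the naive version performs on the order of `n_contigs ** 2` steps in `build`, while the cached version stays linear.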

twrightsman, Feb 14 '24 14:02

This is amazing! @Mike would you be able to review and integrate?

Thanks so much for your contribution!

Mike

You can view, comment on, or merge this pull request online at:

https://github.com/malonge/RagTag/pull/178

Commit Summary

File Changes (1 file): https://github.com/malonge/RagTag/pull/178/files

Patch Links:

  • https://github.com/malonge/RagTag/pull/178.patch
  • https://github.com/malonge/RagTag/pull/178.diff

mschatz, Feb 18 '24 17:02